Open petercombs opened 5 years ago
After significant bugfixes following the refactoring of the plotting into its own module, I have the errorbars showing up (see commit c761c61). Not too surprisingly, given the really low coverage I have at the moment, the errorbars are huge. Looking at some of those SNPs, though, I think they are roughly accurate.
But most of the counts have fewer than 10 reads per sample:
$ cat analysis/*/scores.tsv | body grep 4455300 | body bioawk -t '$9 >= 0' | column -t
snp_id                  pval          stalk_ref  stalk_alt  spore_ref  spore_alt  stalk_ratio   spore_ratio   rank  maxrank
DDB0232431:4455300_C|A  5.000000e-01  1          1          2          1          5.000000e-01  3.333333e-01  528   9459
DDB0232431:4455300_C|A  2.333333e-01  4          0          4          2          0.000000e+00  3.333333e-01  1702  29543
DDB0232431:4455300_C|A  5.000000e-02  1          0          0          9          0.000000e+00  1.000000e+00  47    13673
DDB0232431:4455300_C|A  1.142857e-01  3          0          6          6          0.000000e+00  5.000000e-01  459   77536
DDB0232431:4455300_C|A  2.197802e-02  6          0          4          5          0.000000e+00  5.555556e-01  81    64970
DDB0232431:4455300_C|A  5.000000e-01  0          1          1          11         1.000000e+00  9.166667e-01  868   4790
DDB0232431:4455300_C|A  5.000000e-01  1          1          1          0          5.000000e-01  0.000000e+00  96    471
DDB0232431:4455300_C|A  5.000000e-01  0          1          2          2          1.000000e+00  5.000000e-01  658   4209
DDB0232431:4455300_C|A  5.000000e-01  0          1          3          7          1.000000e+00  7.000000e-01  452   3458
DDB0232431:4455300_C|A  5.000000e-01  1          1          1          0          5.000000e-01  0.000000e+00  1188  4665
DDB0232431:4455300_C|A  1.250000e-01  1          0          0          3          0.000000e+00  1.000000e+00  212   3843
DDB0232431:4455300_C|A  5.000000e-01  1          0          3          2          0.000000e+00  4.000000e-01  503   2559
It occurs to me that we can put errorbars on these by assuming that each count is binomially distributed, which should give some sense of the coverage per SNP.
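For reference, here is a minimal sketch of what a binomial errorbar on one of these ratios could look like. It is not the code from the repository. The function name `wilson_interval` is hypothetical, and the Wilson score interval is my own choice here (the plain standard error collapses to zero whenever a ratio is exactly 0 or 1, which happens a lot at this coverage). The ratio is taken as alt / (ref + alt), which matches the `stalk_ratio` and `spore_ratio` columns in the table.

```python
import math


def wilson_interval(ref, alt, z=1.96):
    """Return (ratio, lower, upper) for alt / (ref + alt).

    Assumes the alt count is binomially distributed with n = ref + alt.
    Uses the Wilson score interval; z=1.96 gives roughly 95% coverage.
    Returns None when there are no reads at all.
    """
    n = ref + alt
    if n == 0:
        return None
    p = alt / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return p, max(0.0, center - half), min(1.0, center + half)


# Counts taken from one row of the table: stalk 6 ref / 0 alt, spore 4 ref / 5 alt.
for label, ref, alt in [("stalk", 6, 0), ("spore", 4, 5)]:
    ratio, lower, upper = wilson_interval(ref, alt)
    print(f"{label}: ratio={ratio:.3f} interval=({lower:.3f}, {upper:.3f})")
```

The interval is asymmetric around the ratio, so for plotting the lower and upper distances (`ratio - lower`, `upper - ratio`) would need to be passed separately, for example as a 2×N `xerr`/`yerr` array to matplotlib's `errorbar`.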
Another option could be to draw partially transparent ellipses whose width and height correspond to the error estimates on each axis.
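A rough sketch of the ellipse idea, assuming matplotlib is the plotting backend (the issue does not say). The helper names `ellipse_specs` and `draw_error_ellipses` are hypothetical, and the error half-widths are taken as inputs rather than computed, so any error estimate can be plugged in.

```python
def ellipse_specs(points):
    """Turn (x, y, xerr, yerr) tuples into Ellipse keyword arguments.

    xerr and yerr are half-widths (errorbar style), so the full
    ellipse width and height are twice those values.
    """
    return [
        {"xy": (x, y), "width": 2 * xerr, "height": 2 * yerr}
        for x, y, xerr, yerr in points
    ]


def draw_error_ellipses(ax, points, alpha=0.2, color="C0"):
    """Add one semi-transparent ellipse per point to a matplotlib Axes."""
    # Imported here so ellipse_specs stays usable without matplotlib.
    from matplotlib.patches import Ellipse

    for spec in ellipse_specs(points):
        ax.add_patch(
            Ellipse(alpha=alpha, facecolor=color, edgecolor="none", **spec)
        )
    # Ratios live in [0, 1]; leave a small margin so edge ellipses stay visible.
    ax.set_xlim(-0.05, 1.05)
    ax.set_ylim(-0.05, 1.05)
```

With stalk ratio on x and spore ratio on y, something like `draw_error_ellipses(ax, [(0.0, 0.56, 0.2, 0.3)])` on an Axes from `plt.subplots()` would draw a single ellipse; the half-widths in that call are made-up numbers for illustration only. Overlapping low-alpha ellipses should darken where many SNPs agree, which may be easier to read than crossed errorbars.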