andersen-lab / Freyja

Depth-weighted De-Mixing
BSD 2-Clause "Simplified" License

Compute and write covariant frequencies #159

Closed by wutron 1 year ago

wutron commented 1 year ago

Let count be the number of reads that carry a given set of covariants, and let max_count be the number of reads that could contain that set (i.e., reads that span every position in the set of covariants). Then freq = count / max_count.
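To make the definition concrete, here is a minimal sketch of the calculation. It is not the code from this PR: the `Read` type, the `covariant_frequency` function, and the variant labels are hypothetical, and it assumes each read covers one contiguous, inclusive reference interval (paired-end reads with an unsequenced gap would need a stricter spanning check).

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional


class Read(NamedTuple):
    """Hypothetical simplified read: inclusive reference span plus observed variants."""
    start: int
    end: int
    variants: FrozenSet[str]


def covariant_frequency(reads: List[Read],
                        covariants: Dict[str, int]) -> Optional[float]:
    """Return count / max_count for one covariant set.

    covariants maps each variant label to its reference position.
    Returns None when no read spans every position (max_count == 0).
    """
    if not covariants:
        raise ValueError("covariant set must not be empty")

    lo = min(covariants.values())
    hi = max(covariants.values())
    wanted = set(covariants)

    # Reads that could contain the whole set: they span all positions.
    spanning = [r for r in reads if r.start <= lo and r.end >= hi]
    max_count = len(spanning)
    if max_count == 0:
        return None

    # Reads that actually carry every variant in the set.
    count = sum(1 for r in spanning if wanted <= r.variants)
    return count / max_count


# Illustrative data only (made-up positions and labels).
reads = [
    Read(100, 300, frozenset({"A150T", "C250G"})),  # spans both, carries both
    Read(100, 300, frozenset({"A150T"})),           # spans both, carries one
    Read(100, 200, frozenset({"A150T"})),           # does not reach position 250
    Read(120, 280, frozenset()),                    # spans both, carries none
]
covs = {"A150T": 150, "C250G": 250}
freq = covariant_frequency(reads, covs)
```

In this toy data three of the four reads span both positions and one of them carries both variants, so freq is 1/3. Dividing by all four reads instead would give 1/4, which understates the frequency because one read could never have shown the full set.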

Other edits:

wutron commented 1 year ago

This PR provides a "covariant frequency", analogous to variant frequency, when using freyja covariants. Happy to answer questions and discuss.

dylanpilz commented 1 year ago

Hey @wutron,

Thanks for making these changes! I agree that it's much more useful to calculate frequency on a per-site basis as opposed to the abundance among all mapped reads in the sample, especially when running this analysis on amplicon sequencing. I'll do some further integration testing on my end and get back to you with any questions or feedback.

-Dylan

wutron commented 1 year ago

Sounds good. Let me know if you want me to handle linting or tests. (With the caveat that it might be easier on your end since the checks do not always auto-trigger for me on a push.)