Open shoulton-invivyd opened 5 months ago
Hey! Thanks for bringing this up, we're actually having internal discussions about this currently. While it's on the agenda, this might be a fix for further in the future due to the way we call variants.
Great, thank you for the update!
Describe the bug When using ivar variants, multiple AA mutations at the same position are returned, each with a frequency of 1.0. After looking at the nucleotides, it looks like there are 2 nucleotide mutations in a row, and rather than returning one AA mutation that comes from those two nucleotide mutations, I get two AA mutations that both have a frequency of 1.
To Reproduce samtools mpileup -aa -A -d 0 -B -Q 0 --reference [<reference-fasta] | ivar variants -p [-r ] [-g GFF file]
Expected behavior For example, at nucleotide position 22895 I get a G -> C mutation, and at nucleotide position 22896 I get a T -> C mutation. Together these should give me a REF_CODON of GTT and an ALT_CODON of CCT at POS_AA 445. Instead, I get two mutations at 445, one with ALT_CODON CTT and one with ALT_CODON GCT. This gives me an L at 445 and an A at 445, both with frequencies of 1, but I think it should be only P with a frequency of 1. (Screenshot included for easier viewing of this)
Screenshots (Apologies for the wonky screenshot, it was too wide for one screenshot)![Example Position](https://github.com/andersen-lab/ivar/assets/144721442/ff5b9aef-6b4c-4e5f-a945-bcc5c60894e8)
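To make the expected behavior concrete, here is a rough Python sketch of what I mean by combining the two nucleotide changes into one codon before translating. This is not how ivar works internally. The function name `merge_codon_variants` and the tiny `CODON_TABLE` are made up for illustration, the table only covers the four codons from this example, and the sketch assumes both mutations sit on the same reads (which seems reasonable here since both have a frequency of 1).

```python
# Standard genetic code, restricted to the codons that appear in this example.
CODON_TABLE = {"GTT": "V", "CCT": "P", "CTT": "L", "GCT": "A"}


def merge_codon_variants(ref_codon, codon_start, snvs):
    """Apply every SNV that falls inside one codon and return the ALT codon.

    ref_codon:   reference codon, e.g. "GTT"
    codon_start: 1-based genomic position of the first base of the codon
    snvs:        list of (pos, ref, alt) single-nucleotide changes
    """
    codon = list(ref_codon)
    for pos, ref, alt in snvs:
        offset = pos - codon_start
        if not 0 <= offset < 3:
            raise ValueError(f"position {pos} is outside the codon")
        if ref_codon[offset] != ref:
            raise ValueError(f"REF base mismatch at position {pos}")
        codon[offset] = alt
    return "".join(codon)


# Positions and bases from the example above; the codon starts at 22895
# because G (22895) and T (22896) are the first two bases of GTT.
snvs = [(22895, "G", "C"), (22896, "T", "C")]

combined = merge_codon_variants("GTT", 22895, snvs)
print(combined, CODON_TABLE[combined])

# Translating each SNV on its own reproduces the two calls I currently get.
for snv in snvs:
    single = merge_codon_variants("GTT", 22895, [snv])
    print(single, CODON_TABLE[single])
```

Applying both changes together gives CCT (P), while applying them one at a time gives CTT (L) and GCT (A), which matches the two separate rows in the output.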