andersen-lab / ivar

iVar is a computational package that contains functions broadly useful for viral amplicon-based sequencing.
https://andersen-lab.github.io/ivar/html/
GNU General Public License v3.0

Variants returns multiple AA with 1.00 frequency #173

Open shoulton-invivyd opened 5 months ago

shoulton-invivyd commented 5 months ago

Describe the bug When using ivar variants, multiple AA at the same position are returned with frequency 1.0. Looking at the nucleotides, there are two nucleotide mutations in a row, and rather than returning one AA mutation that comes from those two nucleotide mutations, I get two AA mutations that both have a frequency of 1.

To Reproduce samtools mpileup -aa -A -d 0 -B -Q 0 --reference [<reference-fasta] | ivar variants -p [-r ] [-g GFF file]

Expected behavior For example, at nucleotide position 22895 I get a G -> C mutation, and at nucleotide position 22896 I get a T -> C mutation. Together these should give a REF_CODON of GTT and an ALT_CODON of CCT at POS_AA 445. Instead, I get two mutations at 445, one with ALT_CODON CTT and one with ALT_CODON GCT. This gives me an L at 445 and an A at 445, both with frequencies of 1, but I think it should be only P with a frequency of 1. (Screenshot included for easier viewing of this)
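To make the difference concrete, here is a minimal Python sketch of the two ways of translating this codon. It is not iVar's code: the helper `apply_snvs` and the constant names are hypothetical, the codon start of 22895 is inferred from the report (G at 22895 is the first base of GTT), and combining the two changes assumes both sit on the same reads, which a frequency of 1.0 for each would imply.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def apply_snvs(ref_codon, codon_start, snvs):
    """Return ref_codon with every (pos, ref, alt) substitution applied.

    Hypothetical helper for illustration only; not part of iVar.
    """
    codon = list(ref_codon)
    for pos, ref, alt in snvs:
        offset = pos - codon_start
        if not 0 <= offset < 3:
            raise ValueError(f"position {pos} is outside the codon")
        if codon[offset] != ref:
            raise ValueError(f"reference base mismatch at {pos}")
        codon[offset] = alt
    return "".join(codon)


# Values taken from the report above; CODON_START is inferred from it.
REF_CODON = "GTT"
CODON_START = 22895
snvs = [(22895, "G", "C"), (22896, "T", "C")]

# Translating each nucleotide change on its own, as in the reported output.
per_snv_codons = [apply_snvs(REF_CODON, CODON_START, [s]) for s in snvs]
per_snv_aas = [CODON_TABLE[c] for c in per_snv_codons]

# Translating both changes together, as the report expects.
combined_codon = apply_snvs(REF_CODON, CODON_START, snvs)
combined_aa = CODON_TABLE[combined_codon]

print(per_snv_codons, per_snv_aas)
print(combined_codon, combined_aa)
```

Treating each substitution independently yields CTT (L) and GCT (A), while applying both to the same codon yields CCT (P), which matches the expected result described above.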

Screenshots (Apologies for the wonky screenshot, it was too wide for one screenshot) [Image: Example Position]


cmaceves commented 5 months ago

Hey! Thanks for bringing this up, we're actually having internal discussions about this currently. While it's on the agenda, this might be a fix for further in the future due to the way we call variants.

shoulton-invivyd commented 5 months ago

Great, thank you for the update!