Status: Closed (closed by Alema91 2 years ago)
We have observed that `ivar variants` can generate false positive variant calls for SARS-CoV-2 genomes that contain insertions or deletions. Here is an example from a private genome that contains the 6bp ORF8 deletion:
CHROM | POS | REF | ALT | GENE | EFFECT | HGVS_C | HGVS_P | DP | REF_DP | ALT_DP | AF | sample | software | lineage |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
NC_045512.2 | 28247 | AGATTTC | A | ORF8 | conservative_inframe_deletion | c.355_360delGATTTC | p.Asp119_Phe120del | 76166 | 48839 | 64275 | 0.84 | 218025 | ivar | AY.33 |
Position 28247 has a well-supported deletion (nearly 70000x coverage), and `ivar variants` is calling a variant inside that deletion (IGV image and variant table):
CHROM | POS | REF | ALT | GENE | EFFECT | HGVS_C | HGVS_P | DP | REF_DP | ALT_DP | AF | sample | software | lineage |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
NC_045512.2 | 28253 | C | A | ORF8 | missense_variant | c.360C>A | p.Phe120Leu | 3954 | 61 | 3851 | 0.97 | 218025 | ivar | AY.33 |
With an AF of 0.97, this variant would be included in the consensus according to our quality criteria (variants with an AF > 0.75). However, the AF is overestimated because the depth at this position is miscalculated (because of the deletion). For comparison, these are the VarScan calls for the same region:
CHROM | POS | REF | ALT | GENE | EFFECT | HGVS_C | HGVS_P | DP | REF_DP | ALT_DP | AF | sample | software | lineage |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
NC_045512.2 | 28247 | AGATTTC | A | ORF8 | conservative_inframe_deletion | c.355_360delGATTTC | p.Asp119_Phe120del | 75039 | 10741 | 63575 | 0.847 | 218025 | VarScan | AY.33 |
NC_045512.2 | 28248 | GATTTCA | G | ORF8 | disruptive_inframe_deletion | c.356_361delATTTCA | p.Asp119_Ile121delinsVal | 64673 | 126 | 189 | 0.003 | 218025 | VarScan | AY.33 |
NC_045512.2 | 28249 | ATTTC | A | ORF8 | frameshift_variant | c.357_360delTTTC | p.Asp119fs | 64614 | 98 | 79 | 0.001 | 218025 | VarScan | AY.33 |
NC_045512.2 | 28251 | TTCATC | T | ORF8 | frameshift_variant&stop_lost&splice_region_variant | c.360_364delCATCT | p.Phe120fs | 69573 | 5295 | 48 | 0.001 | 218025 | VarScan | AY.33 |
NC_045512.2 | 28252 | TC | T | ORF8 | frameshift_variant | c.360delC | p.Phe120fs | 69316 | 4895 | 151 | 0.002 | 218025 | VarScan | AY.33 |
NC_045512.2 | 28253 | C | A | ORF8 | missense_variant | c.360C>A | p.Phe120Leu | 69561 | 15 | 5009 | 0.072 | 218025 | VarScan | AY.33 |
Other variant callers such as VarScan detect this variant with an AF well below 0.25 (0.072 in the table above) because the depth at that position is calculated taking the deletion reads into account. Thus, the AF differs from the one reported by `ivar variants`.
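The difference can be checked by hand from the two tables. The sketch below assumes that AF is simply ALT_DP divided by DP, which is consistent with the values shown; the function name and the dictionary layout are illustrative and not part of ivar or VarScan.

```python
# Depth and ALT counts for 28253 C>A, copied from the ivar and VarScan tables.
calls = {
    "ivar": {"dp": 3954, "alt_dp": 3851},
    "VarScan": {"dp": 69561, "alt_dp": 5009},
}

CONSENSUS_AF = 0.75  # consensus threshold stated in this issue


def allele_frequency(alt_dp, dp):
    """Assumed definition: fraction of the reported depth supporting ALT."""
    return alt_dp / dp


for caller, counts in calls.items():
    af = allele_frequency(counts["alt_dp"], counts["dp"])
    print(f"{caller}: AF={af:.3f} in_consensus={af > CONSENSUS_AF}")
```

The ALT counts are of the same order of magnitude (3851 vs 5009), while the reported depth differs by more than an order of magnitude (3954 vs 69561), so the denominator is what moves the call across the 0.75 threshold.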
We suggest that `ivar variants` overestimates the AF of this variant because of how the depth is calculated, and that this can cause issues with variant calls inside indels (insertions and deletions).
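Until the depth calculation is understood, one possible workaround is to flag SNVs that fall inside the span of a high-frequency deletion so they can be reviewed before consensus building. This is a minimal sketch, assuming VCF-style REF/ALT alleles as in the tables above; the function names, the record layout, and the reuse of the 0.75 threshold for deletions are assumptions, not ivar behaviour.

```python
def deleted_span(pos, ref, alt):
    """Return (first, last) 1-based positions removed by a VCF-style deletion.

    Returns None when the record is not a deletion anchored on ALT.
    """
    if len(ref) <= len(alt) or not ref.startswith(alt):
        return None
    return pos + len(alt), pos + len(ref) - 1


def snvs_inside_deletions(variants, min_del_af=0.75):
    """List SNVs whose position lies inside a deletion with AF > min_del_af."""
    spans = []
    for v in variants:
        span = deleted_span(v["pos"], v["ref"], v["alt"])
        if span is not None and v["af"] > min_del_af:
            spans.append(span)
    return [
        v for v in variants
        if len(v["ref"]) == 1 and len(v["alt"]) == 1
        and any(first <= v["pos"] <= last for first, last in spans)
    ]


# The two ivar calls reported in this issue.
ivar_calls = [
    {"pos": 28247, "ref": "AGATTTC", "alt": "A", "af": 0.84},
    {"pos": 28253, "ref": "C", "alt": "A", "af": 0.97},
]

flagged = snvs_inside_deletions(ivar_calls)
print([v["pos"] for v in flagged])
```

For the calls above, the deletion at 28247 removes positions 28248 to 28253, so the C>A call at 28253 is flagged.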
This issue may be related to #79, #83, #85, #103
Run `ivar variants` with these parameters:
samtools mpileup \
-a \
--count-orphans \
--no-BAQ \
--ignore-overlaps \
--max-depth 20 \
--fasta-ref fasta \
--min-BQ | ivar variants -q 30 -t 0.25 -m 10 -r fasta -g gff -p sample
We would like a better understanding of this behavior; data will be needed to reproduce the issue.
Create an issue to examine the performance of ivar on indels.