Closed hsnguyen closed 2 years ago
+1
We find that position 21987 becomes problematic when constructing a tree of Delta variant samples. A G>A mutation at 21987 is supposedly defining for the Delta variant, but labs using the ARTIC v3 primers often end up with the reference 'G' allele in their consensus genomes, whereas labs using other primer sets (e.g., the 'Midnight' 1200 base amplicons) call the variant 'A'. Our working theory is that either the artic minion pipeline does not properly clip the ARTIC 73_LEFT primer, or there is a dropout region in ARTIC amplicon 72 that causes the defining SNP to be lost.
Since the variant in question doesn't fall in the primer binding region of other popular primer sets, the same problem is not seen with those primers:
ARTIC v3
MN908947.3 21961 21990 nCoV-2019_73_LEFT
MN908947.3 22324 22346 nCoV-2019_73_RIGHT
ARTIC v4
MN908947.3 21865 21889 SARS-CoV-2_73_LEFT
MN908947.3 22247 22274 SARS-CoV-2_73_RIGHT
Midnight
MN908947.3 21532 21562 nCoV-2019_22_LEFT
MN908947.3 22590 22612 nCoV-2019_22_RIGHT
JS Eden
MN908947.3 21357 21386 nCoV-2019_11_LEFT
MN908947.3 23822 23847 nCoV-2019_11_RIGHT
Given the enduring popularity of the ARTIC v3 primers, it seems prudent to add position 21987 to the problematic sites list. If the position is left unmasked, we see artificial clustering in phylogenetic trees that has the potential to mislead phylogenetic or genomic epidemiological inference. We have seen this problem in Australia, but others have also seen the problem overseas. For example, the Pango team masks it in some trees: https://github.com/cov-lineages/pango-designation/issues/95
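For anyone who wants to mask the site locally before building a tree, a rough sketch of the idea is below. It assumes each consensus sequence is already in reference (MN908947.3) coordinates with no insertions or deletions shifting positions; `mask_sites` is a hypothetical helper written for this example, not a function from an existing tool.

```python
def mask_sites(sequence, sites_1based, mask_char="N"):
    """Replace the bases at the given 1-based positions with mask_char.

    Assumption: `sequence` is in reference coordinates, so position p
    in the reference is index p - 1 in the string.
    """
    bases = list(sequence)
    for site in sites_1based:
        idx = site - 1
        if 0 <= idx < len(bases):  # ignore sites beyond the sequence end
            bases[idx] = mask_char
    return "".join(bases)


# Toy usage on a short string; a real genome would use sites_1based=[21987].
print(mask_sites("ACGTACGT", [3]))
# ACNTACGT
```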
Thanks!
Hi @hsnguyen and @charlesfoster - thanks for letting us know, I've now added this site to the VCF(s) with a mask recommendation. I've listed you both as the submitters of this position, hope that's OK!
Great, thanks @conorwalker!
I am linking two issues from cov-lineages that seem relevant:
https://github.com/cov-lineages/pango-designation/issues/117#issue-925220359
https://github.com/cov-lineages/pango-designation/issues/134#issuecomment-879240791
Pointed out to me by @AngieHinrichs
This is one of the defining sites for Delta (G142D), but ARTIC v3 seems to have some issues calling this SNP. It'd be great if you could look into this to see whether we can add it to the VCF file. Thanks
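As a quick sanity check that genome position 21987 corresponds to spike residue 142, the arithmetic can be sketched as below. It assumes the S gene coding sequence starts at 1-based position 21563 in MN908947.3; `genome_to_codon` is a hypothetical helper name used only here.

```python
S_GENE_START = 21563  # assumed 1-based start of the S CDS in MN908947.3


def genome_to_codon(pos_1based, cds_start=S_GENE_START):
    """Map a 1-based genome position to (codon number, position within codon)."""
    offset = pos_1based - cds_start
    if offset < 0:
        raise ValueError("position lies upstream of the CDS start")
    return offset // 3 + 1, offset % 3 + 1


print(genome_to_codon(21987))
# (142, 2)
```

So 21987 is the second base of codon 142, consistent with the G142D amino acid change.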