akifoss closed 1 year ago
Caution... I've noticed that almost exclusively in Germany/RKI sequences, A11537G (ORF1a:I3758V) and T11524C occur together really frequently, but in sequences whose other mutations map them to many different branches of BA.2 and BA.5. In other words, many different BA.2 and BA.5 lineages (and undesignated branches of BA.2) include a cluster of German sequences with A11537G and T11524C. I wonder if there's some kind of systematic error behind that.
To illustrate a few of the many places I see this in the tree, here are Taxonium views, colored by country (darker green = Germany, bright green = Netherlands, purple = France), with blue circles around nodes with a change at 11537 (min 10 samples) and green circles around nodes with a change at 11524 (min 10 samples):
BA.5.3.2 (the second green circle is a reversion on 11524):
BE.1.1 (BA.5.3.1.1.1):
BA.5.1:
BA.5.2.1:
BA.2 + C25416T + C8092T + C2062T + C29420T:
BA.2 + C22792T + G3692T + A25927G + C952T:
... and many more, you get the idea.
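For anyone who wants to check this pattern outside the tree viewer, here is a minimal sketch of the tally: it counts samples that carry both substitutions, grouped by lineage and country. The record layout (`lineage`, `country`, `mutations`) and the toy records are made up for illustration; real metadata would need to be mapped into this shape first.

```python
from collections import defaultdict

# The two substitutions discussed in this thread.
PAIR = {"A11537G", "T11524C"}


def cooccurrence_by_lineage(samples):
    """Return {lineage: {country: count}} for samples carrying both substitutions."""
    tally = defaultdict(lambda: defaultdict(int))
    for sample in samples:
        # Subset test: the sample must carry every substitution in PAIR.
        if PAIR <= set(sample["mutations"]):
            tally[sample["lineage"]][sample["country"]] += 1
    return {lineage: dict(countries) for lineage, countries in tally.items()}


# Toy, invented records; only the shape of the input matters here.
samples = [
    {"lineage": "BA.5.1", "country": "Germany", "mutations": ["A11537G", "T11524C"]},
    {"lineage": "BA.5.1", "country": "Germany", "mutations": ["T11524C", "A11537G"]},
    {"lineage": "BA.5.2.1", "country": "Germany", "mutations": ["A11537G", "T11524C"]},
    {"lineage": "BA.5.2.1", "country": "Netherlands", "mutations": ["A11537G", "T11524C"]},
    {"lineage": "BA.2", "country": "France", "mutations": ["A11537G"]},
    {"lineage": "BE.1.1", "country": "Germany", "mutations": []},
]

result = cooccurrence_by_lineage(samples)
for lineage, countries in result.items():
    print(lineage, countries)
```

If the pair shows up in many unrelated lineages but mostly from one submitter or country, that points toward a pipeline artifact rather than shared ancestry.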
I'd be tempted to mask those two positions except A11537G is a BA.1 mutation and matters when detecting BA.1 / BA.2 recombinants, and some of those A11537G's could certainly be real. I just think it's suspicious that A11537G and T11524C appear together in so many different branches, almost always from the same country (although TBF Germany does seem to have contributed an outsized share of BA.2 and BA.5 sequences, especially since UK testing dropped off). Here's a zoomed-out country-colored view of BA.2 (and nested BA.5):
@AngieHinrichs I think ORF1a:I3758V is a known sequencing issue in Germany. If I recall correctly, @josetteshoenma dug into it because it also appeared in the Netherlands, and they solved it.
Wow, that's great! @JosetteSchoenma, maybe you can help the RKI folks solve it too! 🙂
@AngieHinrichs @CoolenJordy actually fixed this, after we discussed it. He mentions it in his GitHub issue here. https://github.com/JordyCoolen/easyseq_covid19/releases/tag/v0.9
Are you maybe in contact with the RKI?
Most Dutch sequences are from Microvida. I will try and contact somebody from that lab. Note for me: ORF1a:I3758V = NSP6_I189V on GISAID.
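The naming equivalence in that note is just coordinate arithmetic, sketched below. The two constants are assumptions about the reference annotation (Wuhan-Hu-1, NC_045512.2): ORF1a starting at nucleotide 266 and nsp6 starting at ORF1a codon 3570. The function names are made up for this example.

```python
ORF1A_START_NT = 266   # assumed: first nucleotide of ORF1a in NC_045512.2
NSP6_START_AA = 3570   # assumed: first ORF1a codon belonging to nsp6


def orf1a_to_nsp6(aa_pos):
    """Convert an ORF1a amino-acid position to nsp6 numbering."""
    return aa_pos - NSP6_START_AA + 1


def codon_of_nt(nt_pos):
    """Return (ORF1a codon number, position within the codon, 1 to 3)."""
    offset = nt_pos - ORF1A_START_NT
    return offset // 3 + 1, offset % 3 + 1


print(orf1a_to_nsp6(3758))   # 189
print(codon_of_nt(11537))    # (3758, 1)
print(codon_of_nt(11524))    # (3753, 3)
```

Under those assumptions, A11537G falls on the first base of ORF1a codon 3758 (NSP6 189), while T11524C falls on the third base of codon 3753.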
Ah, thanks @JosetteSchoenma (and @JordyCoolen)! Looks like this commit changed a primer trimming region to start at 11520 instead of 11525, which would affect position 11524 too I suppose: https://github.com/JordyCoolen/easyseq_covid19/commit/e2412313ddaaf39a0b8014e4209d01c32a4d3245 Please do pass that on to anyone you know at RKI and I'll see if I can find contact info for someone there too! Thanks!
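To make the effect of moving the region start concrete, here is a small sketch that checks which of the two positions fall inside the region for each start. Only the two start coordinates come from the discussion above. `REGION_END` is a hypothetical placeholder, and this does not reproduce the pipeline's actual trimming logic.

```python
OLD_START = 11525    # region start before the commit (from the discussion above)
NEW_START = 11520    # region start after the commit (from the discussion above)
REGION_END = 11540   # hypothetical placeholder; the real end is not stated here


def in_region(pos, start, end=REGION_END):
    """True if a 1-based position lies within the inclusive interval [start, end]."""
    return start <= pos <= end


for pos in (11524, 11537):
    print(pos, "old:", in_region(pos, OLD_START), "new:", in_region(pos, NEW_START))
```

Whatever the real end coordinate is, 11524 lies outside any region starting at 11525 and inside one starting at 11520, provided the region extends at least that far.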
Thanks to all for the input!
The RKI is in contact with the labs, see https://github.com/robert-koch-institut/SARS-CoV-2-Sequenzdaten_aus_Deutschland/issues/27#issuecomment-1201352032
We were able to solve that with https://github.com/JordyCoolen/easyseq_covid19/commit/e2412313ddaaf39a0b8014e4209d01c32a4d3245!
Sub-lineage of: Potential new sublineage of BA.5
Earliest sequence: EPI_ISL_13351624 (06.04.2022, Indonesia)
Most recent sequences: EPI_ISL_13662398, EPI_ISL_13662402, EPI_ISL_13662397, EPI_ISL_13662399, EPI_ISL_13662401 (25.06.2022, Netherlands)
Defining mutation: ORF1a:I3758V
Countries circulating
393 Germany
84 Netherlands
51 France
3 Denmark
1 South Africa
1 Singapore
1 Italy
1 Indonesia
1 Eswatini
1 Belgium
Sublineage composition
208 BA.5.1
142 BE.1
46 BA.5.2.1
43 BA.5.3.2
34 BA.5.3
29 BA.5
20 BA.5.2
7 BF.1
4 BA.5.5
4 BA.5.3.1
Genomes
BA5sublin-GISAID-ORF1a_I3758V-2022-07-13.txt
Evidence
Genomes with this mutation sit on a long branch in a one-off BA.5* phylogeny (top left corner):
https://cov-spectrum.org/explore/Europe/AllSamples/Past6M/variants?aaMutations=ORF1a%3AI3758V&pangoLineage=BA.5*&aaMutations1=ORF1a%3AI3758V&pangoLineage1=BA.5*&