Can I suggest we make a parent sub-lineage with ORF1a:261N that catches the 2-3 proposals made by you @c19850727 in that branch? Otherwise we're screwing up the hierarchy and it looks like these related lineages are unrelated.
Let's call this lineage AY.N
Here you can see the ORF1a:261N branch on a global Neherlab build: https://nextstrain.org/groups/neherlab/ncov/global?branchLabel=aa&c=gt-nuc_1048,27527,1758,2560&m=div&p=grid&r=division
I've colored by your proposed lineages.
I'm not sure whether the S:680P one should already be designated a sublineage of AY.N.3 as AY.N.3.1, one could do it I guess, if you're keen. Or wait.
Here's a zoom view of AY.N.3:
PS @c19850727 general suggestion: the growth advantage is sometimes useful, but if you post so many screenshots your issues get too long to be easily reviewed. The same applies to the outbreak.info country screenshot. To my mind there are too many tall images in your issues. A bit more structure and fewer screenshots would be good :)
Up to 1/3rd of sequences from Russia have already lost ORF7a:P45L, reverting to P (the uppermost branch on the screenshot, which is filtered to Country -> Russia). Nt 1758C is still the same. In this week's uploads from 4 regions of Russia, 1758C accounts for 85-95% of the sequences (and it's similar with the most recent batches from Ukraine and Kazakhstan), but the reverted sub-branch has up to 30% incidence. I understand that the sub-branches have many pending issues, but wouldn't it, at the very least, make sense to designate the higher-level branch, given how predominant it is in several large countries?
The reversion is in all likelihood a tree-building error, not biological. Actual reversions are much, much rarer than tree-building or sequencing artefacts. 1758C is part of Delta.
There is a good reason to suspect a reversion because both sub-branches share NSP2:K81N which otherwise seems to be fairly rare, and has been mentioned in the pango issues just once, and both share geographies and get sequenced in the same labs. But of course it is possible that a sequencing artifact affects it selectively, possibly in a way related to sample uptake and preservation like with the recently-famous S:Y145H where the "proper" ARTIC primer fails, but the higher-quality RNAs still get some sequences from the next primer over...
Outside of Russia, say, in the UK, it is the same pattern: most ORF7a:P45L sequences also have NSP2:K81N, but some NSP2:K81N genomes have P rather than L.
This looks like totally normal tree building issues, nothing else
https://nextstrain.org/groups/neherlab/ncov/russia?branchLabel=aa&c=gt-nuc_1048,27527&m=div
Would it help with tree-building uncertainties if this substrain were designated? It is getting a little bewildering now. In the new uploads from Ukraine, 7 genomes with NSP2:K81N + ORF7a:P45L are assigned to 4 different Pango lineages...
I suggest designating the parental lineage together with its sublineages. Estonia is actually the worst-hit country in Europe, and sublineage 3 proposed here accounts for about 1/4 of the sequences there.
@chrisruis @corneliusroemer
Exactly the same lineage ambiguity is observed in today's upload from Russia. 40/42 samples collected in mid-October have NSP2:K81N + ORF7a:P45L, but they are assigned to 4 different lineages. Mostly AY.43 and B.1.617.2, but neither accounts for even half of the total...
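The kind of tally described above (how many Pango lineages the NSP2:K81N + ORF7a:P45L genomes end up in) can be sketched in a few lines. This is only an illustration: the `samples` rows are invented, the `lineage_tally` helper is hypothetical, and NSP2:K81N is written in ORF1a coordinates as ORF1a:K261N, as in the issue description.

```python
from collections import Counter

# Hypothetical records: (assigned Pango lineage, set of amino-acid substitutions).
# These rows are made up for illustration and are not taken from any real upload.
samples = [
    ("AY.43", {"ORF1a:K261N", "ORF7a:P45L"}),
    ("B.1.617.2", {"ORF1a:K261N", "ORF7a:P45L"}),
    ("AY.43", {"ORF1a:K261N", "ORF7a:P45L", "S:S680P"}),
    ("AY.4", {"ORF7a:T120I"}),
]

# NSP2:K81N expressed in ORF1a coordinates, plus ORF7a:P45L.
DEFINING = {"ORF1a:K261N", "ORF7a:P45L"}


def lineage_tally(records, defining):
    """Count lineage assignments among records carrying every defining mutation."""
    return Counter(lineage for lineage, muts in records if defining <= muts)


tally = lineage_tally(samples, DEFINING)
for lineage, count in tally.most_common():
    print(lineage, count)
```

If one branch is spread over several lineages with none holding a clear majority, that is the ambiguity being reported here.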
Designated as AY.122 in #320
Hi @c19850727, the second proposed sublineage of the newly designated AY.122 is still very interesting and should be proposed in a dedicated issue.
It is still growing in Denmark, where it has reached 3.3% of the sequences from the last twenty days.
Core mutations: ORF1a:3483F, ORF7a:R118G, ORF1a:A498V
First sequences from Denmark: week 36
Total sequences: 1304 (1274 in Denmark)
Nextstrain Neher Lab tree
@corneliusroemer
Description
Sub-lineage of: B.1.617.2, Clade 21J (~56K sequences, including a ~10K sub-clade where ORF7a:45L mutates back to 45P)
Earliest sequence: 2021/4/15 (the UK)
Most recent sequence: 2021/10/14 (Italy)
Countries circulating: wide-spread, with higher prevalence in Russia and Eastern Europe
Cumulative prevalence and number of samples sequenced as per Outbreak.info: https://outbreak.info/situation-reports?pango&muts=S%3AP681R&muts=ORF7a%3AP45L&muts=ORF1a%3AK261N
Mutations in addition to B.1.617.2, Clade 21J: ORF1a:K261N then ORF7a:P45L
Genomes: Delta P45L.csv
Evidence
Downsized tree as per NeherLab (shown in yellow color): https://nextstrain.org/groups/neherlab/ncov/europe?c=gt-ORF7a_45&f_clade_membership=21J%20%28Delta%29&label=spike_mutations:I95T&p=grid&r=division
Transmission advantage as per CoV-Spectrum: https://cov-spectrum.ethz.ch/explore/World/AllSamples/AllTimes/variants/json=%7B%22variant%22%3A%7B%22mutations%22%3A[%22ORF1a%3AK261N%22%2C%22ORF7a%3AP45%22]%7D%2C%22matchPercentage%22%3A1%7D
Interestingly, a big chunk of its sub-branch has ORF7a:45L mutated back to 45P.
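For readers who want to see what the CoV-Spectrum link above actually filters on, the percent-encoded `json=` part can be decoded with the standard library. This is a minimal sketch that only unpacks the link as posted; `decode_variant_query` is a hypothetical helper name, and nothing here calls CoV-Spectrum itself.

```python
import json
from urllib.parse import unquote

# The CoV-Spectrum link from this proposal, copied verbatim.
url = (
    "https://cov-spectrum.ethz.ch/explore/World/AllSamples/AllTimes/variants/"
    "json=%7B%22variant%22%3A%7B%22mutations%22%3A[%22ORF1a%3AK261N%22%2C"
    "%22ORF7a%3AP45%22]%7D%2C%22matchPercentage%22%3A1%7D"
)


def decode_variant_query(link):
    """Return the JSON payload that follows 'json=' in a link of this shape."""
    encoded = link.split("json=", 1)[1]
    return json.loads(unquote(encoded))


query = decode_variant_query(url)
print(query["variant"]["mutations"])
```

The decoded payload lists ORF1a:K261N and ORF7a:P45 as the mutations, with `matchPercentage` set to 1.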
Sub-lineage of: B.1.617.2, Clade 21J (~9.2K sequences)
Earliest sequence: 2021/4/15 (the UK)
Most recent sequence: 2021/10/12 (Denmark)
Countries circulating: wide-spread, with higher prevalence in France, Italy and Monaco.
Cumulative prevalence and number of samples sequenced as per Outbreak.info: https://outbreak.info/situation-reports?pango&muts=S%3AP681R&muts=ORF7a%3AR118G&muts=ORF1a%3AK261N
Mutations in addition to B.1.617.2, Clade 21J: ORF1a:K261N then ORF7a:P45L then ORF7a:L45P, ORF7a:R118G and ORF1a:A498V
Genomes: Delta R118G.csv
Evidence
Downsized tree as per NeherLab (shown in light green): https://nextstrain.org/groups/neherlab/ncov/europe?c=gt-ORF7a_45,118&f_clade_membership=21J%20%28Delta%29&label=spike_mutations:I95T&p=grid&r=division
Transmission advantage as per CoV-Spectrum: https://cov-spectrum.ethz.ch/explore/World/AllSamples/AllTimes/variants/json=%7B%22variant%22%3A%7B%22mutations%22%3A[%22ORF1a%3AK261N%22%2C%22ORF7a%3AR118G%22%2C%22ORF1a%3AA498V%22]%7D%2C%22matchPercentage%22%3A1%7D
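The mutation path given for this sub-branch includes ORF7a:P45L followed by ORF7a:L45P, so relative to the parent the net set of substitutions is shorter than the path. A small sketch of that bookkeeping follows; `net_changes` is a hypothetical helper, and it takes the "then" ordering at face value without saying anything about whether the reversion is real or a tree-building artefact.

```python
import re

# Matches labels such as "ORF7a:P45L" (gene, reference residue, position, new residue).
MUTATION = re.compile(r"^(?P<gene>[^:]+):(?P<ref>[A-Z])(?P<pos>\d+)(?P<alt>[A-Z])$")


def net_changes(chain):
    """Apply substitutions in order and return the net changes versus the start."""
    original = {}  # (gene, pos) -> residue before the first change at that site
    current = {}   # (gene, pos) -> residue after the latest change at that site
    for label in chain:
        m = MUTATION.match(label)
        if m is None:
            raise ValueError(f"unrecognised mutation label: {label}")
        site = (m["gene"], int(m["pos"]))
        original.setdefault(site, m["ref"])
        current[site] = m["alt"]
    return sorted(
        f"{gene}:{original[(gene, pos)]}{pos}{alt}"
        for (gene, pos), alt in current.items()
        if alt != original[(gene, pos)]
    )


chain = ["ORF1a:K261N", "ORF7a:P45L", "ORF7a:L45P", "ORF7a:R118G", "ORF1a:A498V"]
net = net_changes(chain)
print(net)
```

Position 45 of ORF7a drops out of the result because the path returns it to P, which is why this sub-branch is queried above by ORF1a:K261N, ORF7a:R118G and ORF1a:A498V only.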
There is another much smaller sub-branch:
Sub-lineage of: B.1.617.2, Clade 21J (~913 sequences)
Earliest sequence: 2021/4/23 (Russia)
Most recent sequence: 2021/10/10 (Denmark)
Countries circulating: Mainly Estonia (~16% prevalence), and a few other European countries
Cumulative prevalence and number of samples sequenced as per Outbreak.info: https://outbreak.info/situation-reports?pango&muts=S%3AP681R&muts=S%3AS680P&muts=ORF1a%3AK261N
Mutations in addition to B.1.617.2, Clade 21J: ORF1a:K261N then ORF7a:P45L then S:S680P. More than half of the sequences from Estonia have the further substitutions ORF1a:K851E, ORF1a:A903V, ORF1b:D815Y, ORF3a:S40P and nuc C6781T (524 sequences, shown in blue color)
Genomes: Delta P45L+S680P.csv
Evidence
Downsized tree as per NeherLab (shown in blue color): https://nextstrain.org/groups/neherlab/ncov/estonia?c=gt-S_680&f_clade_membership=21J%20%28Delta%29&p=grid&r=division
Transmission advantage as per CoV-Spectrum: https://cov-spectrum.ethz.ch/explore/World/AllSamples/AllTimes/variants/json=%7B%22variant%22%3A%7B%22mutations%22%3A[%22ORF1a%3AK261N%22%2C%22S%3AS680P%22]%7D%2C%22matchPercentage%22%3A1%7D
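The three Outbreak.info links in this issue share one pattern: a repeated `muts=` parameter with each mutation percent-encoded. The sketch below rebuilds the link for the S:S680P sub-branch from that pattern. `situation_report_url` is a hypothetical helper and the URL shape is inferred only from the links posted here, not from any Outbreak.info documentation.

```python
from urllib.parse import quote

# Base taken from the links in this issue; treat the shape as an assumption.
BASE = "https://outbreak.info/situation-reports?pango"


def situation_report_url(mutations):
    """Append one percent-encoded 'muts=' parameter per mutation."""
    parts = [f"muts={quote(m, safe='')}" for m in mutations]
    return "&".join([BASE, *parts])


url = situation_report_url(["S:P681R", "S:S680P", "ORF1a:K261N"])
print(url)
```

Swapping S:S680P for ORF7a:P45L or ORF7a:R118G gives the links used for the other two proposals.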