cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

XBB.1.5 with Orf8:V62L in Europe (226 seqs) #1700

Closed xz-keg closed 1 year ago

xz-keg commented 1 year ago

Sub-lineage of: XBB.1.5
Earliest sequence: 2022.12.14 (Germany), EPI_ISL_16330495
Most recent sequence: 2023.2.20 (Germany), EPI_ISL_17006729

GISAID query: NS8_V62L, Spike_V83A, Spike_F486P, Spike_F490S, C5770T, C15279T
No. of seqs: 145 (87 Germany, 2 Australia, 31 Austria, 21 Denmark, 1 France, 1 Japan, 1 Netherlands, 1 USA)

Mutations on top of XBB.1.5: C5770T, C15279T, G28077T (Orf8:V62L)

Genomes: EPI_ISL_16280585, EPI_ISL_16330495, EPI_ISL_16588786, EPI_ISL_16597588, EPI_ISL_16617435, EPI_ISL_16617520, EPI_ISL_16638056, EPI_ISL_16682234, EPI_ISL_16682236, EPI_ISL_16682240, EPI_ISL_16682242-16682243, EPI_ISL_16682285, EPI_ISL_16694725, EPI_ISL_16695840, EPI_ISL_16697144, EPI_ISL_16812004, EPI_ISL_16812128, EPI_ISL_16814427, EPI_ISL_16831706, EPI_ISL_16856627, EPI_ISL_16857399, EPI_ISL_16857836, EPI_ISL_16857951, EPI_ISL_16858184, EPI_ISL_16881463, EPI_ISL_16882192, EPI_ISL_16882300-16882301, EPI_ISL_16882449, EPI_ISL_16882496, EPI_ISL_16882508, EPI_ISL_16882858, EPI_ISL_16883818-16883819, EPI_ISL_16883872, EPI_ISL_16883890, EPI_ISL_16883901, EPI_ISL_16884703, EPI_ISL_16885246, EPI_ISL_16885625, EPI_ISL_16886234, EPI_ISL_16886434, EPI_ISL_16886452, EPI_ISL_16895231, EPI_ISL_16907901, EPI_ISL_16908244, EPI_ISL_16909940-16909941, EPI_ISL_16912337, EPI_ISL_16912425, EPI_ISL_16912461, EPI_ISL_16912484, EPI_ISL_16912520, EPI_ISL_16912532, EPI_ISL_16914159, EPI_ISL_16914164, EPI_ISL_16914188, EPI_ISL_16938779, EPI_ISL_16938844, EPI_ISL_16938966, EPI_ISL_16939014, EPI_ISL_16939945, EPI_ISL_16940445, EPI_ISL_16940932, EPI_ISL_16940976, EPI_ISL_16941055, EPI_ISL_16941095, EPI_ISL_16941228, EPI_ISL_16941252, EPI_ISL_16941273, EPI_ISL_16941414, EPI_ISL_16941456, EPI_ISL_16941516, EPI_ISL_16941566, EPI_ISL_16947389, EPI_ISL_16947449, EPI_ISL_16951402, EPI_ISL_16970415, EPI_ISL_16976905, EPI_ISL_16977044, EPI_ISL_16977048, EPI_ISL_16985172, EPI_ISL_16985303, EPI_ISL_16985351, EPI_ISL_16985354, EPI_ISL_16985401, EPI_ISL_16985427, EPI_ISL_17002259, EPI_ISL_17002282, EPI_ISL_17002800, EPI_ISL_17003185, EPI_ISL_17003426, EPI_ISL_17003432, EPI_ISL_17003436, EPI_ISL_17004085, EPI_ISL_17004157, EPI_ISL_17004565, EPI_ISL_17004605, EPI_ISL_17004624, EPI_ISL_17004626, EPI_ISL_17004629, EPI_ISL_17005733, EPI_ISL_17005756, EPI_ISL_17005815, EPI_ISL_17005845, EPI_ISL_17005881, EPI_ISL_17005933, EPI_ISL_17006040, EPI_ISL_17006047, EPI_ISL_17006271, EPI_ISL_17006295, 
EPI_ISL_17006469, EPI_ISL_17006477, EPI_ISL_17006480, EPI_ISL_17006486, EPI_ISL_17006596, EPI_ISL_17006605, EPI_ISL_17006646, EPI_ISL_17006701, EPI_ISL_17006729, EPI_ISL_17006750, EPI_ISL_17006756-17006757, EPI_ISL_17006765, EPI_ISL_17006770, EPI_ISL_17006773, EPI_ISL_17006829, EPI_ISL_17006840, EPI_ISL_17007656, EPI_ISL_17007695, EPI_ISL_17007712, EPI_ISL_17007750, EPI_ISL_17017829, EPI_ISL_17017839, EPI_ISL_17017896, EPI_ISL_17017940, EPI_ISL_17018001, EPI_ISL_17018098, EPI_ISL_17018153, EPI_ISL_17018317, EPI_ISL_17019852, EPI_ISL_17023740, EPI_ISL_17025328, EPI_ISL_17029706
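The genome list above mixes single accessions with ranges such as EPI_ISL_16682242-16682243. Below is a minimal Python sketch for expanding such a list into individual IDs. It assumes a range means every consecutive accession number from the first to the last, inclusive. The helper name `expand_accessions` is hypothetical and not part of any GISAID tooling.

```python
import re


def expand_accessions(text: str) -> list[str]:
    """Expand a comma-separated EPI_ISL list, including 'A-B' ranges, into single IDs.

    Assumption: 'EPI_ISL_100-102' stands for EPI_ISL_100, EPI_ISL_101, EPI_ISL_102.
    """
    ids: list[str] = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate trailing commas and line breaks
        match = re.fullmatch(r"EPI_ISL_(\d+)(?:-(\d+))?", token)
        if match is None:
            raise ValueError(f"unrecognised accession: {token!r}")
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        if end < start:
            raise ValueError(f"descending range: {token!r}")
        ids.extend(f"EPI_ISL_{n}" for n in range(start, end + 1))
    return ids


example = expand_accessions("EPI_ISL_16682240, EPI_ISL_16682242-16682243, ")
```

Counting the expanded list is a quick way to check that the genome list matches the stated number of sequences.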

[Screenshot, 2023-02-27: cov-spectrum comparison]

It shows a very high growth advantage over XBB.1.5. The real growth advantage may not be that high, but there is likely to be some.

[Screenshot, 2023-02-27: UShER]

Orf8:V62L was analysed in papers back in 2021, which suggest it may contribute to immune evasion: "Mutations in SARS-CoV-2 ORF8 Altered the Bonding Network With Interferon Regulatory Factor 3 to Evade Host Immune System".

This mutation also occurs in many other lineages: B.1.1.624 (#91), Q.6 (#183), BA.5.5.1 (#873), BQ.1.1.60 (#1426).

xz-keg commented 1 year ago

161 seqs now.

xz-keg commented 1 year ago

[Screenshot, 2023-03-07]

This lineage seems to be the fastest-growing of all XBB.1.5 sublineages. It is slower only than XBB.1.9, XBB.1.11, XBB.1.16 and XBL.

source

xz-keg commented 1 year ago

226 seqs now, still the fastest among XBB.1.5 sublineages.

FedeGueli commented 1 year ago

@aviczhl2 update the title when new sequences come in please!

And add your new very useful collection link to easily check its growth: https://cov-spectrum.org/collections/155

(Spike-only sequences are misassigned as XBB.1.11.1; you have to add C13968A, G16377A to the query to see the real XBB.1.11.1.)

FedeGueli commented 1 year ago

264 on GISAID, 267 on UShER: https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_38ca8_9025c0.json?c=country&label=id:node_7716551

[Screenshot, 2023-03-08]

It is still mainly from Germany, and the way Germany uploads samples is not always suitable for estimating growth advantages.

NkRMnZr commented 1 year ago

Still fascinating how that mutation still shows a growth advantage after the orf8:G8* stop codon. I thought anything after that would be redundant, but it seems not. Is it a 'special feature' of, let's say, XBB.1.5 only, or of orf8 generally, or has anything like this happened before? Sorry if that's a silly question.

FedeGueli commented 1 year ago

It is not silly at all. @ryhisner, did you look at orf8 mutations coming after the stop codon? Do they create something interesting? @thomasppeacock

thomasppeacock commented 1 year ago

If the ORF8 mutations are right at the end (i.e. ~aa120 region), they could be influencing the efficiency of the downstream N TRS-B sequence through extended homology. Not sure about other regions though.

xz-keg commented 1 year ago

> Still fascinating how that mutation still shows a growth advantage after the orf8:G8* stop codon. I thought anything after that would be redundant, but it seems not. Is it a 'special feature' of, let's say, XBB.1.5 only, or of orf8 generally, or has anything like this happened before? Sorry if that's a silly question.

Yes, this is very mysterious. Even more mysterious is that some synonymous mutations also seem to have a growth advantage, like C20703T for BF.7.14 in China or BF.7.4 in Japan.

xz-keg commented 1 year ago

> It is still mainly from Germany, and the way Germany uploads samples is not always suitable for estimating growth advantages.

There is also a chance that this is a statistical artifact like Simpson's paradox: the lineage has no real growth advantage but just happens to benefit from one of the following:

1: A "transition" from an under-sampled region to a region with a high sampling frequency.

2: A sudden change of sampling frequency in some region. (If a country samples 10 in January but 10,000 in February, every virus circulating in that country automatically gets an apparent additional 1000X growth advantage.)

3: A "carriage effect": if XBB.1.5 is growing more quickly in Germany than in other parts of the world, then all its sub-branches get a global growth advantage from being 'carried' by the quick growth of the main branch. (I think in this situation the main lineages of that country should still be designated, since the same can apply to different regions within one country, or to different communities within one region, and in those two cases we definitely designate such lineages.)

The difference between 2 and 3 is that 3 reflects real viral spread, while 2 is purely a statistical artifact.
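Effect 2 can be checked with a few lines of arithmetic. The sketch below uses made-up numbers: a hypothetical region "G" where the lineage is a constant 10% of sequences and a "rest" region where it is a constant 1%. Only the number of sequences from "G" changes between the two months, as in the 10 vs 10,000 example above.

```python
def pooled_share(samples: dict[str, int], share: dict[str, float]) -> float:
    """Lineage share among all sequences when the regions are pooled together."""
    total = sum(samples.values())
    lineage = sum(samples[region] * share[region] for region in samples)
    return lineage / total


# Hypothetical, constant within-region lineage shares (no real growth anywhere).
share = {"G": 0.10, "rest": 0.01}

# Hypothetical sequence counts: only region G's sampling effort changes.
january = {"G": 10, "rest": 10_000}
february = {"G": 10_000, "rest": 10_000}

jan_share = pooled_share(january, share)   # about 1.0% of pooled sequences
feb_share = pooled_share(february, share)  # about 5.5% of pooled sequences

print(f"pooled share: {jan_share:.3%} -> {feb_share:.3%}")
```

The pooled share rises roughly fivefold although the share inside each region never changes, which is why looking at each region separately is the simplest check.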

Did Germany, or any region inside Germany, suddenly change its sampling frequency at some point?

corneliusroemer commented 1 year ago

This is a good demonstration of the need to be careful with growth estimates. Germany uploads faster than the rest of the world, hence there will be an upward bias. We've seen this many times with Denmark in particular. Whatever is common in Denmark gets a fake growth advantage when looking at the global level.
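A toy illustration of this upload-speed bias, as a sketch under simplified assumptions: two hypothetical countries sequence the same number of genomes each week, one uploads after 1 week and the other after 4 weeks, and the lineage is a constant 20% in the fast uploader and absent in the slow one. All numbers and names are invented for illustration, and real upload delays are distributions rather than fixed lags.

```python
# Hypothetical setup: both countries sequence 100 genomes per collection week.
COUNTRIES = [
    {"name": "fast", "delay_weeks": 1, "per_week": 100, "lineage_per_week": 20},
    {"name": "slow", "delay_weeks": 4, "per_week": 100, "lineage_per_week": 0},
]


def observed_share(week, now, countries=COUNTRIES):
    """Lineage share among genomes collected in `week` that are uploaded by `now`."""
    total = lineage = 0
    for country in countries:
        if week + country["delay_weeks"] <= now:  # this week's data is already public
            total += country["per_week"]
            lineage += country["lineage_per_week"]
    return lineage / total if total else None


NOW = 8  # analysis date, in weeks
series = [observed_share(week, NOW) for week in range(NOW)]
print(series)  # 0.1 for collection weeks 0-4, then 0.2 for weeks 5-7
```

The apparent doubling in the most recent weeks comes only from the slow uploader's data being missing, not from any change in either country.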

Once you select Germany alone, there's nothing.

[Screenshot]

For Germany, there's more data in "open" covSpectrum, same result there:

[Screenshot]

Please bear this in mind in the future and be careful with growth estimates. As expected, mutations after the stop codon don't matter (as much). They effectively become synonymous mutations, which can have effects but they are usually much smaller. It's quite funny to read all of you here speculating on why this might be the case - motivated reasoning ;)

https://open.cov-spectrum.org/explore/Germany/AllSamples/Past2M/variants?nextcladePangoLineage=XBB.1.5*&aaMutations1=orf8%3AV62L&nucMutations1=C5770T%2CC15279T&nextcladePangoLineage1=XBB.1.5*&analysisMode=CompareToBaseline&
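The link above encodes the whole comparison in its query string, so the same check can be repeated for another country or mutation set by rebuilding it. A minimal Python sketch follows. The path layout and parameter names are copied from the link in this comment, the helper `build_compare_url` is hypothetical, and whether cov-spectrum keeps accepting these parameters is an assumption.

```python
from urllib.parse import urlencode


def build_compare_url(country, lineage, aa_mutations, nuc_mutations, period="Past2M"):
    """Build an open.cov-spectrum.org 'compare to baseline' link (layout taken from the link above)."""
    base = f"https://open.cov-spectrum.org/explore/{country}/AllSamples/{period}/variants"
    params = {
        "nextcladePangoLineage": lineage,            # baseline lineage
        "aaMutations1": ",".join(aa_mutations),      # variant: amino-acid mutations
        "nucMutations1": ",".join(nuc_mutations),    # variant: nucleotide mutations
        "nextcladePangoLineage1": lineage,           # variant is restricted to the same lineage
        "analysisMode": "CompareToBaseline",
    }
    # safe="*" keeps the lineage wildcard unescaped, as in the original link
    return f"{base}?{urlencode(params, safe='*')}"


url = build_compare_url("Germany", "XBB.1.5*", ["orf8:V62L"], ["C5770T", "C15279T"])
print(url)
```

Apart from the trailing "&", this reproduces the link posted above.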

corneliusroemer commented 1 year ago

This is a branch just like any other, maybe common in Germany, but that's it. Hence it should not be designated unless there are reasons other than those discussed previously. I'll also change the title to delete the growth advantage as it's misleading.

corneliusroemer commented 1 year ago

I'll close this for now as it's nothing special - maybe it will get designated as part of a systematic XBB.1.5 sublineage designatathon

xz-keg commented 1 year ago

Haha.

Since this is a known statistical effect, do we have tools to correct for it? We already know how sampling statistics can distort growth advantages, so there should be some way to remove the bias. It may also affect other lineages whose mutations are not effectively synonymous.

For example, set an "upper threshold" for each epi week based on the current seqs/population situation, downsample every region that uploads more seqs than that threshold, and then calculate a refined proportion and estimate a refined growth advantage from it.
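A rough sketch of that idea for a single epi week, in Python. Instead of randomly discarding sequences it scales each region's counts down to the cap, which gives the expected result of random downsampling without the noise. The cap of 10 sequences per million people, the region names and all counts are hypothetical, and the function name `capped_share` is invented for this example.

```python
def capped_share(rows, cap_per_million):
    """Lineage share for one epi week after capping each region's sequences per capita.

    rows: dicts with 'population_millions', 'total' (all sequences) and
    'lineage' (sequences of the lineage of interest) for each region.
    """
    total = lineage = 0.0
    for row in rows:
        cap = cap_per_million * row["population_millions"]
        # Regions under the cap keep weight 1; over-sampled regions are scaled down.
        weight = min(1.0, cap / row["total"]) if row["total"] else 0.0
        total += row["total"] * weight
        lineage += row["lineage"] * weight
    return lineage / total if total else float("nan")


# Hypothetical week: region A sequences 100 per million, region B only 10 per million.
week = [
    {"region": "A", "population_millions": 80, "total": 8000, "lineage": 800},
    {"region": "B", "population_millions": 300, "total": 3000, "lineage": 30},
]

raw = sum(r["lineage"] for r in week) / sum(r["total"] for r in week)
refined = capped_share(week, cap_per_million=10)
print(f"raw share {raw:.2%}, capped share {refined:.2%}")
```

A growth advantage would then be estimated from the capped weekly shares rather than the raw ones. This only evens out sampling intensity between regions. It does not correct for upload delay, so comparing within a single country remains the more direct check.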