sars-cov-2-variants / lineage-proposals

Repository to propose and discuss lineages

BA.2.15 - 67 nuc /31 AA spike saltation - singlet, South Africa collected 2023-11-29 #1352

Closed BorisUitham closed 5 months ago

BorisUitham commented 8 months ago

Nucleotide search: G28028C, G28395A, G28509A (alternative query: C6285T, C4596A, T17088C; edited) EPI_ISL_18839074

nuc unique T3730C, C4596A, A7429G, T16002C, T17088C, G17706A, T21015C, G21606T, C21627A, A21788C, G21848C, G21969T, T22020C, T22566C, T22662A, G23258A, G23264A, C23277G, G23587C, A23598G, G23856T, C24503A, G24872C, G25534A, G25593C, C26313A, G26314A, A27353T, C27509T, A27892G, G28028C, G28395A, G28509A, C28957T, G29072T

nuc homoplasies C6285T, C6762T, C6936T, G15451A, C15720T, G17122T, T18042C, A19476G, C19554T, C20719T, C21762T, G21795T, C22000T, A22001G, C22050T, A22115T, G22200A, C22295A, G22332A, G22599C, C22624A, C22664A, G22770A, G22865T, G23222A, C23423T, C25413T, C26894T, C27476T, C27577T, G29151A, C29627T

ORF1a: unique T1444N

ORF1a T2007I, T2166I, S2224F

ORF1b: G662S, A1219S (edited)

S unique C15F, T22N, T76P, E96Q, C136F, M153T, L335S, V367D, G566S, D568N, T572S, K679R, R765L, L981I

S homoplasies A67V, R78M, K147E, A163V, N185Y, G213E, H245N, G257D, R346T, N354K, L368I, R403K, A435S, E554K, P621S, Q675H, V1104L

ORF3a unique V48I, K67N

E unique F23L, V24M

ORF6 unique Q51L

ORF7a unique T39I

ORF7a homoplasies T28I, Q62*

ORF8 unique W45C

Orf9b: G38S, V76I (edited)

N unique R41Q, S79N, A267S

N homoplasies R293K

ORF10 homoplasies R24C

Inheriting from BA.2.15: C16877T, G21753T (S:W64L), C27944T. Furthermore, Usher places it on a branch, which may or may not be an artefact, with G23055A and C23075T, which result in the spike reversions R498Q and H505Y respectively. The C15-C136 disulfide bridge is lost because both sites mutate to F.
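
For anyone who wants to check which spike codon a nucleotide position falls in, here is a minimal sketch. It assumes Wuhan-Hu-1 (NC_045512.2) coordinates with the S gene spanning 21563-25384; the function name `spike_codon` is just illustrative, and the sketch ignores indels.

```python
# Assumption: Wuhan-Hu-1 (NC_045512.2) coordinates, S gene at 21563-25384.
SPIKE_START = 21563
SPIKE_END = 25384


def spike_codon(nuc_pos):
    """Return (codon number, position within codon 1-3) for a spike nucleotide."""
    if not SPIKE_START <= nuc_pos <= SPIKE_END:
        raise ValueError(f"{nuc_pos} is outside the S gene")
    offset = nuc_pos - SPIKE_START
    return offset // 3 + 1, offset % 3 + 1


for mut in ("G23055A", "C23075T", "G22200A"):
    pos = int(mut[1:-1])  # strip the reference and alternate bases
    codon, phase = spike_codon(pos)
    print(f"{mut}: S codon {codon}, codon position {phase}")
```

Under these assumptions G23055A lands in codon 498 and C23075T in codon 505, which matches the R498Q and H505Y reversions mentioned above.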

image https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice1_genome_test_13756_8e6ef0.json?label=id:node_4836261

Nextclade (edited)

Screenshot 2024-01-30 alle 14 59 22

Large deletion in NSP1 (very recurrent indel region) (edited):

Screenshot 2024-01-30 alle 15 11 09

BorisUitham commented 8 months ago

Notably, according to nextclade there is a lack of coverage between RBD residues 443-503. Who knows how many and what kind of mutations lie there.

FedeGueli commented 8 months ago

Little addendum: this variant has two Orf9b mutations: Orf9b:G38S, Orf9b:V76I. cc @ryhisner

FedeGueli commented 8 months ago

@shay671 please take a look. @angiehinrichs @thomasppeacock

FedeGueli commented 8 months ago

@silcn

FedeGueli commented 8 months ago

Notably, it has ORF1b:G662S (also in BA.2.75, XBB, Delta)

FedeGueli commented 8 months ago

ORF1b:A1219S was the hallmark of AY.100, one of the fittest North American Deltas.

aviczhl2 commented 8 months ago

del509_523
Orf1a: del82-86

ryhisner commented 8 months ago

I thought for sure that the 2-nuc S:V367H was unique, but it turns out there was an AY.37 branch in western Europe with it in late 2021, which also had the very rare S:Y508H. Just 15 sequences though, and it did not have S:L368I like this one.

image

ryhisner commented 8 months ago

Combined with the S:∆24-27 deletion in all BA.2, S:T22N adds a glycan to replace the one lost with S:T19I.
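
A rough way to see the glycan point is to scan the spike N-terminus for N-X-S/T sequons (X not proline). The sketch below is illustrative only: it assumes the first 33 residues of Wuhan-Hu-1 spike as typed here, and models BA.2 as T19I, deletion of residues 24-26, and A27S (the usual annotation of the deletion referred to above). The helper names are made up for this example.

```python
import re

# Assumption: Wuhan-Hu-1 spike residues 1-33; verify against NC_045512.2.
WT_NTERM = "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFT"


def substitute(seq, pos, ref, alt):
    """Apply a 1-based substitution after checking the expected residue."""
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


def sequons(seq):
    """1-based positions of N in N-X-S/T motifs where X is not proline."""
    return [m.start() + 1 for m in re.finditer(r"N(?=[^P][ST])", seq)]


# Assumed BA.2 N-terminal changes: T19I, A27S, deletion of residues 24-26.
ba2 = substitute(WT_NTERM, 19, "T", "I")
ba2 = substitute(ba2, 27, "A", "S")
ba2 = ba2[:23] + ba2[26:]  # drop residues 24-26

# Position 22 precedes the deletion, so its index is unchanged.
ba2_t22n = substitute(ba2, 22, "T", "N")

print("WT:", sequons(WT_NTERM))
print("BA.2:", sequons(ba2))
print("BA.2 + T22N:", sequons(ba2_t22n))
```

With these inputs the N17 sequon present in the reference disappears in BA.2, and T22N creates a new N-Q-S motif at position 22 once the deletion brings the serine next to Q23.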

image

silcn commented 8 months ago

As well as the already-noted long stretch of NNNs, there are a number of single-nucleotide Ns in the RBD:

S:408 AGN (AGA in WT, AGT in BA.2)
S:415 NCT (ACT in WT and BA.2)
S:418 NTT (ATT in WT and BA.2)
S:429 TTN (TTT in WT and BA.2)
S:431 GGN (GGC in WT and BA.2)
S:432 NGN (TGC in WT and BA.2)
S:434 NTN (ATA in WT and BA.2) - this one is adjacent to G22865T = S:A435S
S:438 NCT (TCT in WT and BA.2)
S:439 ANC (AAC in WT and BA.2)
S:440 AAN (AAT in WT, AAG in BA.2)
S:441 CTN (CTT in WT and BA.2)
S:443 NCT (TCT in WT and BA.2)

I would guess most of these are not hiding mutations, but with everything else going on here, who knows?
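
To put a bound on what the single-nucleotide Ns could be hiding, one can expand each ambiguous codon listed above into every residue it might encode. This is only a sketch using the standard genetic code; `possible_residues` and the `OBSERVED` dictionary are names invented for the example, with the codons copied from the list in this comment.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def possible_residues(codon):
    """Set of amino acids a codon could encode if each N is any base."""
    options = [BASES if base == "N" else base for base in codon.upper()]
    return {CODON_TABLE["".join(c)] for c in product(*options)}


# Ambiguous codons as listed in the comment above (spike residue -> codon).
OBSERVED = {
    408: "AGN", 415: "NCT", 418: "NTT", 429: "TTN", 431: "GGN", 432: "NGN",
    434: "NTN", 438: "NCT", 439: "ANC", 440: "AAN", 441: "CTN", 443: "NCT",
}

for residue, codon in OBSERVED.items():
    print(f"S:{residue} {codon} -> {sorted(possible_residues(codon))}")
```

Codons such as GGN (S:431) and CTN (S:441) resolve to a single residue whatever the missing base is, so they cannot hide an amino-acid change; the others leave more than one possibility open.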

silcn commented 8 months ago

Usher tree placement seems uncertain. It is currently on a big branch with A13533G followed by C27944T. This sequence has C27944T but not A13533G, and C27944T is homoplasic enough that I'd lean towards independent acquisition of 27944T being more likely than reversion of 13533G. But I can't see any obvious placement if we make that assumption.

silcn commented 8 months ago

A27892G de-optimises the ORF8 TRS edit: for whatever reason this one isn't very common - largest lineage containing it was a substantial branch of XB (yes, XB not XBB, all the way back in 2021)

ryhisner commented 8 months ago

Decent number of reversions to the SARS-1/Bat-CoV residues in this one, as with BA.2.86. These are on top of the ones normal BA.2 has, which include S:A27S (T in S1/BC), S:G339D, S:T478K, and S:D796Y. I assume it also has S:N460K, but the dropout makes that hard to confirm.

I still don't know why such reversions are seen in so many chronic-infection sequences/lineages, but there's no question the tendency exists.

image

ryhisner commented 8 months ago

A27892G de-optimises the ORF8 TRS

I would guess that pretty much kills ORF8 expression. C27889T nearly zeroed out ORF8 expression, and I bet this one does the same.
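
To visualise where these two substitutions sit, here is a small sketch that applies them to the TRS core hexamer. It assumes the canonical core ACGAAC occupies 27888-27893 in Wuhan-Hu-1, directly upstream of the ORF8 start codon at 27894; treat those coordinates as an assumption to verify against the reference, and `apply_snp` as an illustrative helper.

```python
# Assumption: ORF8 TRS core ACGAAC at 27888-27893 (Wuhan-Hu-1 coordinates).
TRS_START = 27888
TRS_CORE = "ACGAAC"


def apply_snp(core, start, mut):
    """Apply a substitution like 'A27892G' to a core sequence starting at `start`."""
    ref, pos, alt = mut[0], int(mut[1:-1]), mut[-1]
    i = pos - start
    if not 0 <= i < len(core):
        raise ValueError(f"{mut} is outside the core")
    if core[i] != ref:
        raise ValueError(f"{mut}: expected {ref}, found {core[i]}")
    return core[:i] + alt + core[i + 1:]


for mut in ("A27892G", "C27889T"):
    print(f"{mut}: {TRS_CORE} -> {apply_snp(TRS_CORE, TRS_START, mut)}")
```

Under that assumption both substitutions fall inside the hexamer itself. The sketch says nothing about how much expression is lost.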

JustinS6626 commented 8 months ago

If ORF8 expression is gone, what does that translate into, functionally speaking?

FedeGueli commented 8 months ago

Thx @silcn and @ryhisner

ryhisner commented 8 months ago

If ORF8 expression is gone, what does that translate into, functionally speaking?

Possibly higher spike expression/infectivity. A million functions have been attributed to ORF8, and I think most of them are probably nonsense, but there was a paper that made a pretty convincing case that ORF8 reduces spike cell surface expression. https://www.jbc.org/article/S0021-9258(23)01983-X/fulltext And there appears to me to be genetic evidence for this as well, especially in the way XBB.2.3 (which has an intact ORF8) had very few RBD mutations, while XBB.1 lineages (with ORF8:G8*) added them quite frequently. I think oobb was the first to put this hypothesis forward.

ujoshi4 commented 8 months ago

The sample was from a month ago. If it is a fit variant, it would already be in multiple countries.

drmutaba commented 8 months ago

The sample was from a month ago. If it is a fit variant, it would already be in multiple countries.

Not necessarily. Surveillance has been phased down extensively worldwide, with many infected not getting tested at all and when they are tested, only few get sequenced.

But sure, the fact that the sample is almost two months old gives hope that this one just fizzled out.

ujoshi4 commented 8 months ago

The sample was from a month ago. If it is a fit variant, it would already be in multiple countries.

Not necessarily. Surveillance has been phased down extensively worldwide, with many infected not getting tested at all and when they are tested, only few get sequenced.

But sure, the fact that the sample is almost two months old gives hope that this one just fizzled out.

If it was from that long ago, I am almost sure that it has fizzled out. Hyping this up is not advisable. It is highly unlikely that a variant would have exactly the same growth advantage as JN.1, causing it to spread that slowly. Usually, these variants fizzle out quickly or spread to other nations fast. However, there was a cryptic BA.1.1 descendant in the Netherlands that survived for a few months in early to mid 2023. Here is a spike comparison against JN.1 in cov-spectrum: image

ujoshi4 commented 8 months ago

Cryptic lineages can persist. It survived in the Netherlands for months in early and mid 2023. If it had picked up the right mutations, it could have outcompeted XBB. But unfortunately, BA.2.86 was actually able to pull it off. image

shay671 commented 8 months ago

@shay671 please take a look. @AngieHinrichs @thomasppeacock

So, compared to BA.2.15, and taking into account the non-sequenced areas: the base at 22200 changed from G to A, turning S:213G into S:213E. It also seems to have reverted 23075C back to the WT T, changing S:505 from H to Y.

A lot of convergence, particularly with BA.2.86: C23423T, C22295A, G22770A, G23222A (those also converge with other lineages). And others, such as:
G21969T - C.1.2, B.1.630
C21762T - BA.1 and BA.3

And many more.

If there are more samples, I'm going to try a deeper convergence analysis. But this one is really convergent.

As to what each SNP means biochemically on its own, I keep in mind that in big saltations like this it takes more than simple analysis to understand the epistatic relationships that may be there.

FedeGueli commented 8 months ago

Thanks @shay671

corneliusroemer commented 8 months ago

Single sequence means it's entirely unclear if this is a chronic individual or whether there is community circulation. The fact that it's annotated as "pneumonia surveillance" is interesting but without knowing further background it's hard to interpret.

Resequencing could be useful to find out what's going on in the RBD where there's low coverage.

Summarizing what's known about phylogenetic placement: it's very clearly BA.2.15, which, like most South African BA.2, does not have the C9866T found in the vast majority of global BA.2*. This is exactly the same as with BA.2.86.

Below is a covSpectrum plot of which countries BA.2.15* was found in over the entire pandemic:

image

The closest relatives are 7 BA.2.15 sequences that have extra C16887T:

hCoV-19/USA/ID-IBL-827098/2022|EPI_ISL_13453608|2022-06-09
hCoV-19/USA/HI-H2212922/2022|EPI_ISL_13764765|2022-06-14
hCoV-19/South_Africa/NICD-N39016/2022|EPI_ISL_12307598|2022-03-31
hCoV-19/South_Africa/NICD-N39413/2022|EPI_ISL_12401143|2022-04-12
hCoV-19/USA/MD-CDC-QDX37841829/2022|EPI_ISL_13468835|2022-06-07
hCoV-19/Canada/ON-PHL-22-24207-v3/2022|EPI_ISL_13622990|2022-06-10
image

The fact Usher doesn't show this is due to artefactual reversions pulling the wrong things together.

JustinS6626 commented 8 months ago

If ORF8 expression is gone, what does that translate into, functionally speaking?

Possibly higher spike expression/infectivity. A million functions have been attributed to ORF8, and I think most of them are probably nonsense, but there was a paper that made a pretty convincing case that ORF8 reduces spike cell surface expression. https://www.jbc.org/article/S0021-9258(23)01983-X/fulltext And there appears to me to be genetic evidence for this as well, especially in the way XBB.2.3 (which has an intact ORF8) had very few RBD mutations, while XBB.1 lineages (with ORF8:G8*) added them quite frequently. I think oobb was the first to put this hypothesis forward.

I took a look at that article, and if I'm not mistaken, ORF8 being gone would mean that it would have to find another path in terms of immune evasion. Do you think I'm on the right track there?

ryhisner commented 7 months ago

I took a look at that article, and if I'm not mistaken, ORF8 being gone would mean that it would have to find another path in terms of immune evasion. Do you think I'm on the right track there?

That's the idea, yeah. It's far from certain, but I think some of the genetic data—particularly the striking difference in RBD mutations in XBB.1 vs. XBB.2.3, mentioned above—lends it support.

ryhisner commented 7 months ago

Cryptic lineages can persist. It survived in the Netherlands for months in early and mid 2023. If it had picked up the right mutations, it could have outcompeted XBB. But unfortunately, BA.2.86 was actually able to pull it off.

All of those BA.1 sequences were from the same patient. It's one of the most incredible examples we have of documented intrahost evolution. I hope a paper on this patient comes out at some point. As in some other chronic cases, there were clearly multiple intrahost lineages competing within the host.