Closed BorisUitham closed 5 months ago
Notably, according to nextclade there is a lack of coverage between RBD residues 443-503. Who knows how many and what kind of mutations lie there.
Little addendum: this variant has two Orf9b mutations: Orf9b:G38S and Orf9b:V76I. cc @ryhisner
@shay671 please take a look. @angiehinrichs @thomasppeacock
@silcn
notably it has ORF1b: G662S (BA.2.75, XBB, Delta)
ORF1b:A1219S was the hallmark of AY.100, one of the fittest North American Deltas
del509_523
Orf1a: del82-86
I thought for sure that the 2-nuc S:V367H was unique, but it turns out there was an AY.37 branch in western Europe with it in late 2021, which also had the very rare S:Y508H. Just 15 sequences though, and it did not have S:L368I like this one.
Combined with the S:∆24-27 deletion in all BA.2, S:T22N adds a glycan to replace the one lost with S:T19I.
As well as the already-noted long stretch of NNNs, there are a number of single-nucleotide Ns in the RBD.
S:408 AGN (AGA in WT, AGT in BA.2)
S:415 NCT (ACT in WT and BA.2)
S:418 NTT (ATT in WT and BA.2)
S:429 TTN (TTT in WT and BA.2)
S:431 GGN (GGC in WT and BA.2)
S:432 NGN (TGC in WT and BA.2)
S:434 NTN (ATA in WT and BA.2) - this one is adjacent to G22865T = S:A435S
S:438 NCT (TCT in WT and BA.2)
S:439 ANC (AAC in WT and BA.2)
S:440 AAN (AAT in WT, AAG in BA.2)
S:441 CTN (CTT in WT and BA.2)
S:443 NCT (TCT in WT and BA.2)
I would guess most of these are not hiding mutations, but with everything else going on here, who knows?
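To put a rough bound on what the single-nucleotide Ns could be hiding, here is a minimal Python sketch that expands each masked codon listed above into every amino acid it could encode under the standard genetic code. The helper name `possible_amino_acids` and the `MASKED` dictionary are illustrative only; the codons are copied from the list above.

```python
from itertools import product

# Standard genetic code, enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def possible_amino_acids(codon):
    """Return the set of amino acids a codon could encode, treating N as any base."""
    options = [BASES if base == "N" else base for base in codon.upper()]
    return {CODON_TABLE["".join(c)] for c in product(*options)}


# Masked spike codons as reported in this thread (codon number -> observed codon).
MASKED = {
    408: "AGN", 415: "NCT", 418: "NTT", 429: "TTN", 431: "GGN", 432: "NGN",
    434: "NTN", 438: "NCT", 439: "ANC", 440: "AAN", 441: "CTN", 443: "NCT",
}

for site, codon in MASKED.items():
    candidates = "".join(sorted(possible_amino_acids(codon)))
    print(f"S:{site} {codon} -> {candidates}")
```

Under this expansion only S:431 (GGN) and S:441 (CTN) are fully determined at the amino-acid level; every other site leaves at least two candidates, so resequencing is still needed to rule out hidden substitutions.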
Usher tree placement seems uncertain. It is currently on a big branch with A13533G followed by C27944T. This sequence has C27944T but not A13533G, and C27944T is homoplasic enough that I'd lean towards independent acquisition of 27944T being more likely than reversion of 13533G. But I can't see any obvious placement if we make that assumption.
A27892G de-optimises the ORF8 TRS edit: for whatever reason this one isn't very common - largest lineage containing it was a substantial branch of XB (yes, XB not XBB, all the way back in 2021)
Decent number of reversions to the SARS-1/Bat-CoV residues in this one, as with BA.2.86. These are on top of the ones normal BA.2 has, which include S:A27S (T in S1/BC), S:G339D, S:T478K, and S:D796Y. I assume it also has S:N460K, but the dropout makes that hard to confirm.
I still don't know why such reversions are seen in so many chronic-infection sequences/lineages, but there's no question the tendency exists.
A27892G de-optimises the ORF8 TRS
I would guess that pretty much kills ORF8 expression. C27889T nearly zeroed out ORF8 expression, and I bet this one does the same.
If ORF8 expression is gone, what does that translate into, functionally speaking?
Thx @silcn and @ryhisner
Possibly higher spike expression/infectivity. A million functions have been attributed to ORF8, and I think most of them are probably nonsense, but there was a paper that made a pretty convincing case that ORF8 reduces spike cell surface expression. https://www.jbc.org/article/S0021-9258(23)01983-X/fulltext And there appears to me to be genetic evidence for this as well, especially in the way XBB.2.3 (which has an intact ORF8) had very few RBD mutations, while XBB.1 lineages (with ORF8:G8*) added them quite frequently. I think oobb was the first to put this hypothesis forward.
The sample was from a month ago. If it is a fit variant, it would already be in multiple countries.
Not necessarily. Surveillance has been scaled back extensively worldwide: many infected people are never tested at all, and of those who are, only a few get sequenced.
But sure, the fact that the sample is almost two months old gives hope that this one just fizzled out.
If it was from that long ago, I am almost sure that it has fizzled out. Hyping this up is not advisable. It is highly unlikely that a variant would have exactly the same growth advantage as JN.1, causing it to spread that slowly. Usually, these variants fizzle out quickly or spread to other countries fast. However, there was a cryptic BA.1.1 descendant in the Netherlands that survived for a few months in early to mid 2023. Here is a spike comparison against JN.1 in cov-spectrum:
Cryptic lineages can persist. It survived in the Netherlands for months in early and mid 2023. If it had picked up the right mutations, it could have outcompeted XBB. But unfortunately, BA.2.86 was actually able to pull it off.
So, compared to BA.2.15, and taking into account the unsequenced regions: it has G22200A, which changes S:213G to S:213E, and it seems to revert 23075C back to the WT T, changing S:505 from H to Y.
Lots of convergence, particularly with BA.2.86: C23423T, C22295A, G22770A, G23222A (these also converge with other lineages). Others include G21969T (C.1.2, B.1.630) and C21762T (BA.1 and BA.3).
And many more.
If more samples turn up, I'm going to try a deeper convergence analysis. But this one is really convergent.
As to what each SNP means biochemically on its own, I keep in mind that in big saltations like this it takes more than simple analysis to understand, given the epistatic relationships that may be there.
Thanks @shay671
Single sequence means it's entirely unclear if this is a chronic individual or whether there is community circulation. The fact that it's annotated as "pneumonia surveillance" is interesting but without knowing further background it's hard to interpret.
Resequencing could be useful to find out what's going on in the RBD where there's low coverage.
Summarizing what's known on phylogenetic placement: it is very clearly BA.2.15, which, like most South African BA.2, does not have the C9866T that is found in the vast majority of global BA.2*. This is exactly the same as with BA.2.86.
Below is a covSpectrum plot of which countries BA.2.15* was found in over the entire pandemic:
The closest relatives are 7 BA.2.15 sequences that have extra C16887T:
hCoV-19/USA/ID-IBL-827098/2022|EPI_ISL_13453608|2022-06-09
hCoV-19/USA/HI-H2212922/2022|EPI_ISL_13764765|2022-06-14
hCoV-19/South_Africa/NICD-N39016/2022|EPI_ISL_12307598|2022-03-31
hCoV-19/South_Africa/NICD-N39413/2022|EPI_ISL_12401143|2022-04-12
hCoV-19/USA/MD-CDC-QDX37841829/2022|EPI_ISL_13468835|2022-06-07
hCoV-19/Canada/ON-PHL-22-24207-v3/2022|EPI_ISL_13622990|2022-06-10
The fact Usher doesn't show this is due to artefactual reversions pulling the wrong things together.
If ORF8 expression is gone, what does that translate into, functionally speaking?
Possibly higher spike expression/infectivity. A million functions have been attributed to ORF8, and I think most of them are probably nonsense, but there was a paper that made a pretty convincing case that ORF8 reduces spike cell surface expression. https://www.jbc.org/article/S0021-9258(23)01983-X/fulltext And there appears to me to be genetic evidence for this as well, especially in the way XBB.2.3 (which has an intact ORF8) had very few RBD mutations, while XBB.1 lineages (with ORF8:G8*) added them quite frequently. I think oobb was the first to put this hypothesis forward.
I took a look at that article, and if I'm not mistaken, ORF8 being gone would mean that it would have to find another path in terms of immune evasion. Do you think I'm on the right track there?
That's the idea, yeah. It's far from certain, but I think some of the genetic data—particularly the striking difference in RBD mutations in XBB.1 vs. XBB.2.3, mentioned above—lends it support.
Cryptic lineages can persist. It survived in the Netherlands for months in early and mid 2023. If it had picked up the right mutations, it could have outcompeted XBB. But unfortunately, BA.2.86 was actually able to pull it off.
All of those BA.1 sequences were from the same patient. It's one of the most incredible examples we have of documented intrahost evolution. I hope a paper on this patient comes out at some point. As in some other chronic cases, there were clearly multiple intrahost lineages competing within the host.
Nucleotide search: G28028C, G28395A, G28509A (alternative query: C6285T, C4596A, T17088C) EPI_ISL_18839074
nuc unique: T3730C, C4596A, A7429G, T16002C, T17088C, G17706A, T21015C, G21606T, C21627A, A21788C, G21848C, G21969T, T22020C, T22566C, T22662A, G23258A, G23264A, C23277G, G23587C, A23598G, G23856T, C24503A, G24872C, G25534A, G25593C, C26313A, G26314A, A27353T, C27509T, A27892G, G28028C, G28395A, G28509A, C28957T, G29072T
nuc homoplasies: C6285T, C6762T, C6936T, G15451A, C15720T, G17122T, T18042C, A19476G, C19554T, C20719T, C21762T, G21795T, C22000T, A22001G, C22050T, A22115T, G22200A, C22295A, G22332A, G22599C, C22624A, C22664A, G22770A, G22865T, G23222A, C23423T, C25413T, C26894T, C27476T, C27577T, G29151A, C29627T
ORF1a unique: T1444N
ORF1a: T2007I, T2166I, S2224F
ORF1b: G662S, A1219S
S unique: C15F, T22N, T76P, E96Q, C136F, M153T, L335S, V367D, G566S, D568N, T572S, K679R, R765L, L981I
S homoplasies: A67V, R78M, K147E, A163V, N185Y, G213E, H245N, G257D, R346T, N354K, L368I, R403K, A435S, E554K, P621S, Q675H, V1104L
ORF3a unique: V48I, K67N
E unique: F23L, V24M
ORF6 unique: Q51L
ORF7a unique: T39I
ORF7a homoplasies: T28I, Q62*
ORF8 unique: W45C
Orf9b: G38S, V76I
N unique: R41Q, S79N, A267S
N homoplasies: R293K
ORF10 homoplasies: R24C
Inherited from BA.2.15: C16877T, G21753T (S:W64L), C27944T. Furthermore, Usher places it on a branch, which may or may not be an artefact, with G23055A and C23075T, which result in the spike reversions R498Q and H505Y. The C15-C136 disulfide bridge is lost, with both sites mutating to F.
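For anyone cross-checking nucleotide calls against the amino-acid annotations in this thread, here is a minimal Python sketch that maps a reference nucleotide position to a gene and codon number. The gene coordinates are my assumption of the Wuhan-Hu-1 (NC_045512.2) annotation and should be verified against the reference before relying on them; the `GENES` table and the `locate` helper are illustrative names, not part of any tool mentioned here.

```python
# Assumed gene coordinates (1-based, inclusive) on Wuhan-Hu-1 / NC_045512.2.
# ORF1a/ORF1b are split at the ribosomal frameshift; where ranges overlap,
# the first matching entry in the list wins.
GENES = [
    ("ORF1a", 266, 13468),
    ("ORF1b", 13468, 21555),
    ("S", 21563, 25384),
    ("ORF3a", 25393, 26220),
    ("E", 26245, 26472),
    ("M", 26523, 27191),
    ("ORF6", 27202, 27387),
    ("ORF7a", 27394, 27759),
    ("ORF7b", 27756, 27887),
    ("ORF8", 27894, 28259),
    ("N", 28274, 29533),
    ("ORF10", 29558, 29674),
]


def locate(pos):
    """Map a reference nucleotide position to (gene, codon number, position in codon).

    Returns None when the position falls outside every gene in GENES
    (UTRs and intergenic regions such as a TRS).
    """
    for name, start, end in GENES:
        if start <= pos <= end:
            offset = pos - start
            return name, offset // 3 + 1, offset % 3 + 1
    return None


for mutation in ["G21753T", "G22200A", "G23055A", "C23075T", "A27892G", "G28028C"]:
    position = int(mutation[1:-1])  # strip the reference and alternate bases
    print(mutation, "->", locate(position))
```

With these coordinates, G21753T, G23055A and C23075T fall in spike codons 64, 498 and 505, matching the annotations above, while A27892G falls in no gene at all (just upstream of the assumed ORF8 start), consistent with it acting on the ORF8 TRS rather than changing an amino acid.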
https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice1_genome_test_13756_8e6ef0.json?label=id:node_4836261
Nextclade
Large deletion in NSP1 (very recurrent indel region):