cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

Multiple branches of XBB.1.9.1 and XBB.1.9.2 #1708

Closed FedeGueli closed 1 year ago

FedeGueli commented 1 year ago

CC @corneliusroemer: before proceeding with further designation of XBB.1.9.1/2 sublineages, I think it is worth making a list of the main branches.

XBB.1.9.2. GISAID query: G5720A, T28297C, C12789T, T23018C, A16878T
1. N:L219F: designated with N:L219F
2. C28651T: EG.4
3. G2659T (ORF1a:K798N), C8752T: 26 seqs, now EG.13
4. C22593A (S:A344D), A28012G (ORF8:H40R): 14 seqs (Germany)
5. T7995G (ORF1a:V2577G): 12 seqs
6. C21998A (S:H146N): 19 seqs
7. C4423A: 6 seqs
8. C20016T: 7 seqs (India)

XBB.1.9.1. GISAID query: G5720A, T28297C, C12789T, T23018C, C11956T
1. T1753C: 444 seqs (see #1705)
2. G16741T (ORF1b:V1092F): 179 seqs
3. G2235A (ORF1a:C657Y): 45 seqs
4. G4354A: now designated FL.13
5. T4579A: 73 seqs, now FL.2 via https://github.com/cov-lineages/pango-designation/commit/55d4ad1c2ba0b158db0542dbe630bccaced171d3
6. A14109G (ORF1b:I214M): 73 seqs
7. T29691C: 47 seqs
8. G6894T (ORF1a:C2210F): 33 seqs
9. A25576T (ORF3a:I62F): 38 seqs
10. C28333T (ORF9b:P17L): 36 seqs (proposing it)
11. G11504A (ORF1a:V3747I): 20 seqs
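
The two GISAID queries share four mutations and differ in one each. A minimal Python sketch makes that comparison explicit, using only the mutation names listed in the queries. The helper names `parse_mutation` and `query_set` are hypothetical, introduced here for illustration, and are not part of any Pango or GISAID tooling.

```python
import re

def parse_mutation(mut):
    """Split a nucleotide substitution such as 'G5720A' into (ref, position, alt)."""
    m = re.fullmatch(r"([ACGT])(\d+)([ACGT])", mut.strip())
    if m is None:
        raise ValueError(f"not a nucleotide substitution: {mut!r}")
    return m.group(1), int(m.group(2)), m.group(3)

def query_set(query):
    """Turn a comma-separated query string into a set of parsed mutations."""
    return {parse_mutation(tok) for tok in query.split(",")}

# Mutation names taken from the two queries in this comment.
xbb_1_9_2 = query_set("G5720A, T28297C, C12789T, T23018C, A16878T")
xbb_1_9_1 = query_set("G5720A, T28297C, C12789T, T23018C, C11956T")

shared = xbb_1_9_1 & xbb_1_9_2   # mutations common to both queries
only_1 = xbb_1_9_1 - xbb_1_9_2   # distinguishes XBB.1.9.1 in these queries
only_2 = xbb_1_9_2 - xbb_1_9_1   # distinguishes XBB.1.9.2 in these queries

print(sorted(shared, key=lambda t: t[1]))
print(only_1, only_2)
```

Going by these queries alone, C11956T separates XBB.1.9.1 and A16878T separates XBB.1.9.2.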

Tree: https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_26dfc_df2900.json?label=id:node_7639664

[Screenshot 2023-02-28 at 22:10:40]

AnonymousUserUse commented 1 year ago

Are there any updates to this issue?

oobb45729 commented 1 year ago

T4579A is not an ordinary synonymous mutation. It happens very often for a T to A mutation. It creates a TRS-like sequence CTAAACGA and it is often seen together with A4576T, which extends it to CTCTAAACGA. @ryhisner @alurqu

ryhisner commented 1 year ago

Indeed, T4579A is part of one of the most extensive novel TRS-B motifs that I've come across, which in its most complete version includes the mutations A4571T, A4572G, C4573T, A4574T, A4576T, and T4579A. The most recent sequence with all of these was an XBB.1.5 from Washington state, USA uploaded on April 4. There have been dozens of sequences with all six of these mutations, though it's much more common to see only A4576T and T4579A together.

ryhisner commented 1 year ago

The A4576T-T4579A combination was present in BA.5.2.23, which, for a time, made up about one-third of all cases in Costa Rica (and possibly neighboring countries with little to no genetic surveillance). The relative success of BA.5.2.23 was surprising to me as it was much less immune-evasive than other lineages that were circulating at the time (early September through early November 2022). Its only real immune-evasion mutation was S:V445A. BA.5.2.23's other three mutations were the aforementioned synonymous A4576T and T4579A and the neutral, likely APOBEC-induced S:S255F (C22326T). Not only did BA.5.2.23 lack any mutation at S:R346X or S:K444X, it also did not possess ORF1b:T1050N, the BA.5.2* mutation that was globally dominant at one point due to the substantial growth advantage it conferred.

All of this suggests to me that the A4576T-T4579A combo likely increases fitness, at least in the context of the global population immunity present in late fall/early winter of 2022. Whether it does so by expressing a new peptide (which would consist of an 11-AA, out-of-frame, mostly hydrophobic KLLLQVYLAM from 4590-4622) or through some other mechanism, I don't know.

FedeGueli commented 1 year ago

> T4579A is not an ordinary synonymous mutation. It happens very often for a T to A mutation. It creates a TRS-like sequence CTAAACGA and it is often seen together with A4576T, which extends it to CTCTAAACGA. @ryhisner @alurqu

T4579A is also in FE.1.1 cc @ryhisner

alurqu commented 1 year ago

> > T4579A is not an ordinary synonymous mutation. It happens very often for a T to A mutation. It creates a TRS-like sequence CTAAACGA and it is often seen together with A4576T, which extends it to CTCTAAACGA. @ryhisner @alurqu
>
> T4579A is also in FE.1.1 cc @ryhisner

It looks to me like you're starting the sequence from 4577 or, with A4576T, from 4575.
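
The coordinates can be checked with a short sketch. The wild-type window below is an assumption: it is back-derived from the mutant sequence CTCTAAACGA and the reference bases named in A4576T and T4579A, not fetched from a reference genome. `apply_mutations` is a hypothetical helper.

```python
# Wild-type bases 4575-4584, back-derived (an assumption) from the mutant
# sequence CTCTAAACGA and the reference bases named in A4576T and T4579A.
WINDOW_START = 4575
WT_WINDOW = "CACTTAACGA"

def apply_mutations(window, start, mutations):
    """Apply substitutions like 'T4579A' to a window whose first base is `start`."""
    bases = list(window)
    for mut in mutations:
        ref, pos, alt = mut[0], int(mut[1:-1]), mut[-1]
        idx = pos - start
        # Guard against applying a mutation to the wrong coordinate.
        if bases[idx] != ref:
            raise ValueError(f"{mut}: expected {ref} at {pos}, found {bases[idx]}")
        bases[idx] = alt
    return "".join(bases)

single = apply_mutations(WT_WINDOW, WINDOW_START, ["T4579A"])
double = apply_mutations(WT_WINDOW, WINDOW_START, ["A4576T", "T4579A"])

print(single[4577 - WINDOW_START:])  # CTAAACGA, read from 4577
print(double)                        # CTCTAAACGA, read from 4575
```

Under that assumed window, the 8-mer does start at 4577 and the 10-mer at 4575.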

My attention also goes to the sequence starting from 4580: in WT, that's AACGATC, compared to the canonical sarbecovirus TRS-B AACGAAC. Position 4585 differs, but is it close enough to do something, as the WT TRS-B UACGACC for ORF4/E is? Could extended homology from T4579A boost that? I'll also note that T4585A has been sampled 277 times, including twice in the last six months.
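
As a quick check of that comparison, here is a sketch that lists the positions where the WT 7-mer differs from the canonical motif. Both strings are taken as quoted in this comment, and the variable names are illustrative only.

```python
START = 4580
wt = "AACGATC"         # WT bases 4580-4586, as quoted in this comment
canonical = "AACGAAC"  # canonical TRS-B, as quoted in this comment

# Collect (genome position, WT base, canonical base) for every mismatch.
mismatches = [
    (START + i, a, b)
    for i, (a, b) in enumerate(zip(wt, canonical))
    if a != b
]
print(mismatches)  # [(4585, 'T', 'A')]
```

The single mismatch is at 4585, so T4585A alone would make the 7-mer identical to the canonical motif.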

If a new TRS-B near 4580 does become active, the next start codon may be out-of-frame at 4590 to 4592, with a stop codon at 4623 to 4625. So, assuming no intervening mutations add or remove stop codons in this frame, it might encode a peptide of only 11 residues which, if I have translated correctly, would be MKLLLQCHLAM.
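
The length follows from the coordinates, and the translation step can be sketched as below. The `toy` sequence is an assumption made purely for illustration: its codons were chosen arbitrarily to spell the proposed peptide and are not the actual genomic bases at 4590-4625. `translate_from` is a hypothetical helper.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate_from(seq, start):
    """Translate from 0-based index `start` until the first stop codon."""
    peptide = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)

# Coordinates from the comment: start codon at 4590-4592, stop codon at 4623-4625.
n_residues = (4623 - 4590) // 3
print(n_residues)  # 11

# Toy sequence only: arbitrary codons spelling the proposed peptide,
# NOT the real genome sequence at these positions.
toy = "ATGAAACTTCTTCTTCAATGTCATCTTGCTATGTAA"
print(translate_from(toy, 0))  # MKLLLQCHLAM
```

To check the real peptide, one would pass the actual reference bases from 4590 onward to the same function in place of `toy`.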