sars-cov-2-variants / lineage-proposals

Repository to propose and discuss lineages

XBB.1.9.2 sublineage with S:A344V (17 seqs in UK) #280

Closed HynnSpylor closed 11 months ago

HynnSpylor commented 1 year ago

It is a sibling lineage in #57 first spotted by @FedeGueli

Defining mutations: XBB.1.9.2 > C4686T (Orf1a:T1474I) > A16373G (Orf1b:N969S) > C22593T (S:A344V), A29125G
GISAID query: C4686T, A16373G, C22593T, A29125G
Earliest seq: 2023-05-10 (England, EPI_ISL_17693007)
Most recent seq: 2023-06-09 (Scotland, EPI_ISL_17884588)
Detected Countries: UK (10)
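As a rough illustration of what the GISAID query above expresses, here is a minimal sketch that checks whether a sequence's nucleotide mutation list contains all four query mutations. The helper names (`parse_mutation`, `matches_query`) and the example sample are hypothetical; this is not how GISAID or UShER implement their searches.

```python
import re

# A nucleotide substitution such as "C22593T": reference base, position, new base.
MUTATION_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_mutation(token):
    """Split a token like 'C22593T' into (ref, position, alt)."""
    m = MUTATION_RE.match(token.strip())
    if m is None:
        raise ValueError(f"not a nucleotide substitution: {token!r}")
    ref, pos, alt = m.groups()
    return ref, int(pos), alt


def matches_query(sample_mutations, query):
    """Return True if the sample carries every mutation in the comma-separated query."""
    wanted = {parse_mutation(t) for t in query.split(",")}
    have = {parse_mutation(t) for t in sample_mutations}
    return wanted <= have


QUERY = "C4686T, A16373G, C22593T, A29125G"

# Hypothetical sample: the four query mutations plus one extra substitution.
sample = ["C4686T", "A16373G", "C22593T", "A29125G", "T24979C"]
print(matches_query(sample, QUERY))
```

A sequence carrying G22592A (S:A344T) instead of C22593T would not match, which is why the query only picks up the S:A344V branch.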

Usher Tree: [image: QQ screenshot 20230629214649]

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_2f34b_d770c0.json

4 of 10 seqs further get S:R403K (on 2 branches, one branch also with S:E654A)

Genomes: EPI_ISL_17693007, EPI_ISL_17776830-17776831, EPI_ISL_17796661, EPI_ISL_17820821, EPI_ISL_17884588, EPI_ISL_17884618-17884621

FedeGueli commented 1 year ago

Thx @HynnSpylor interesting how it got 403K back very soon in the 344V branch. Is there any known interaction between 344 and 403? @Sinickle @oobb45729

corneliusroemer commented 1 year ago

Yeah this is a weird branch that I've also noticed. I wanted to designate S:A344T+S:403K but usher is too messed up right now. Would be great if @AngieHinrichs could nuke all sequences in those S:A344X+S:403K branches and see if Usher fixes it on second try.

AngieHinrichs commented 1 year ago

> Would be great if @AngieHinrichs could nuke all sequences in those S:A344X+S:403K branches and see if Usher fixes it on second try.

At least some of the S:A344X amino acid changes in XBB.1.9.2 are caused by different nucleotide mutations, and re-running usher won't join those together...
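To make the point about different nucleotide mutations concrete, here is a small sketch that maps a genome position to its Spike codon number and position within the codon. It assumes Wuhan-Hu-1 (NC_045512.2) coordinates, where the S coding sequence spans 21563 to 25384; the function name `spike_codon` is made up for illustration.

```python
# Assumption: Wuhan-Hu-1 (NC_045512.2) coordinates for the S coding sequence.
SPIKE_START = 21563
SPIKE_END = 25384


def spike_codon(nuc_pos):
    """Return (codon number, position within codon), both 1-based, for a genome position in S."""
    if not SPIKE_START <= nuc_pos <= SPIKE_END:
        raise ValueError(f"{nuc_pos} is outside the S coding sequence")
    offset = nuc_pos - SPIKE_START
    return offset // 3 + 1, offset % 3 + 1


for pos in (22592, 22593, 22770, 22771):
    print(pos, spike_codon(pos))
# 22592 (344, 1)
# 22593 (344, 2)
# 22770 (403, 2)
# 22771 (403, 3)
```

G22592A and C22593T hit different bases of the same codon 344, so UShER, which works on nucleotide mutations, treats them as unrelated events even though both show up as S:A344X.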

The S:403K branches do seem to be consistently G22770A (although there's also A22771C, e.g. CHN/HB-Jingzhou-2101/2023). I see one branch where G22592A is followed by different branches getting G22770A (Germany/BB-RKI-I-1136867/2023, England/PHEC-YYEA518/2023). So maybe there are some sequences that missed G22770A due to Ns or something. I can try the usual prune, opt & replace. That branch is distinct from the branch in this proposal, though (T16548G > G21255T,G22592A vs C29347T > C4686T > A16373G > ...).

On the branch in this proposal, it looks like some sequences got G22592A (S:A344T) and C13767T and then got G22770A (S:R403K), while other sequences got C22593T (S:A344V) and A29125G and then some of those got T24979C and then G22770A (S:R403K). I don't see a problem with this branch... if I'm missing something, please provide some sequence names or IDs that look misplaced.
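The two paths described above can be compared with a few lines of Python. This is only a sketch of the bookkeeping, using the mutations as listed in this comment; the variable and function names are made up.

```python
# Mutation paths on the two sub-branches, in the order given in the comment above.
path_344t = ["G22592A", "C13767T", "G22770A"]             # S:A344T, then S:R403K
path_344v = ["C22593T", "A29125G", "T24979C", "G22770A"]  # S:A344V, then S:R403K


def compare_paths(a, b):
    """Return (shared, only_in_a, only_in_b), keeping the original order."""
    set_a, set_b = set(a), set(b)
    shared = [m for m in a if m in set_b]
    only_a = [m for m in a if m not in set_b]
    only_b = [m for m in b if m not in set_a]
    return shared, only_a, only_b


shared, only_t, only_v = compare_paths(path_344t, path_344v)
print("shared:", shared)
print("only on the 344T path:", only_t)
print("only on the 344V path:", only_v)
```

The only shared mutation is G22770A (S:R403K), and in this tree it sits at the end of both paths, meaning it is placed as having been acquired twice.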

corneliusroemer commented 1 year ago

Sorry @AngieHinrichs, I wasn't as clear as I should have been.

On the surface this issue may be about another lineage, but I suspect there's a good chance this is in fact one and the mess is due to artefacts (not Usher's fault ;) )

Most sequences in XBB.1.9.2 with S:A344 mutated also have S:R403K or at least have unknowns there.

I consider it unlikely that 403 and 344T/V arose homoplasically.

I think nuking and rebuilding would be helpful. Maybe you can seed it first with good quality sequences and place the ones with unknowns later?

AngieHinrichs commented 1 year ago

@corneliusroemer I can't see the full sequence names of some of those sequences with big blocks of Ns (like Germany/B[BE]-ChVir-LB23012...??), but I don't see names like that in this branch of the UShER tree either -- those sequences might have had too many equally parsimonious placements (EPPs) due to the Ns and been rejected. If you can list specific sequence names/IDs that you think are misplaced (or that have especially high quality and should be kept), that would be very helpful.

AngieHinrichs commented 12 months ago

@corneliusroemer This is looking cleaner in the 2023-07-19 tree: S:R403K (G22770A) first, then separate branches with S:A344V (C22593T) and S:A344T (G22592A) along with other mutations that distinguish the two branches.


https://nextstrain.org/fetch/hgwdev.gi.ucsc.edu/~angie/lineage-proposals-280.json?branchLabel=Spike%20mutations&c=gt-nuc_22770,22592,22593&label=id:node_6685287

FedeGueli commented 12 months ago

4 more sequences from Scotland were uploaded to GISAID today: 17 on GISAID now, likely more than that on Usher.

FedeGueli commented 11 months ago

it is not totally overlapping with EG.9 but ok to add a milestone to it.