cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

XBB.1.5 sibling lineages with S:F486P (merge into XBB.1.5?) #1491

Closed. silcn closed this issue 1 year ago

silcn commented 1 year ago

XBB.1.5 is defined by S:F486P and T17124C. On the Usher tree, S:F486P comes first, and there are 31 sequences that have S:F486P but apparently not T17124C. Some of these may just be XBB.1.5 sequences with T17124C missing due to lack of coverage or an artefact, but there are a few small clusters that appear to be genuine. They can be seen in this Usher tree: https://nextstrain.org/fetch/github.com/silcn/subtreeAuspice1/raw/main/auspice/subtreeAuspice1_genome_30f16_ddbc50.json?c=gt-nuc_20379&label=id:node_7686144
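
The distinction drawn above, between sequences that genuinely lack T17124C and sequences where the site simply was not covered, can be sketched as follows. This is a minimal illustration, not part of the original post: the `Sample` class, the `uncovered` field, and the status labels are hypothetical names, and it assumes mutation calls and uncovered positions are already available for each sequence.

```python
from __future__ import annotations

from dataclasses import dataclass, field

SITE_17124 = 17124  # genome position of T17124C


@dataclass
class Sample:
    """Hypothetical container for one sequence's calls."""
    name: str
    mutations: set[str]                                # e.g. {"S:F486P", "T17124C"}
    uncovered: set[int] = field(default_factory=set)   # positions with no base call


def t17124c_status(sample: Sample) -> str:
    """Classify a sample by what is known at position 17124."""
    if "S:F486P" not in sample.mutations:
        return "not_486P"
    if "T17124C" in sample.mutations:
        return "has_T17124C"
    if SITE_17124 in sample.uncovered:
        # Missing call: could still be XBB.1.5 with dropout at this site.
        return "unknown_dropout"
    # Site is covered and carries the reference base: a genuine sibling candidate.
    return "confirmed_reference"
```

Only samples in the `confirmed_reference` group would count as evidence for genuine S:F486P clusters without T17124C.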

The largest of these clusters is defined by A20379G and the highly homoplasic C28312T. It has 8 sequences:

image: Malaysia_XBB1_486P

Ordinarily I would propose broadening XBB.1.5 to start from S:F486P to include these additional sequences. However, there is reason to believe that the XBB.1+S:F486P branch shown by Usher is not monophyletic and some of the small branches represent independent acquisitions of S:F486P. We would expect to find sequences ancestral to XBB.1.5 in the US, but several of these sibling branches appear not to have originated there: in addition to the Singapore/Malaysia branch shown above, there is an Australian branch defined by A4758C and T18126A with 5 sequences (actually 6, one hasn't appeared on Usher yet), and an Austrian branch with 3 sequences defined by C18591A and T20109C (this one could be larger, as it might account for some of the Austrian spike-only sequences that are called as XBB.1.5).

Designating all of these as XBB.1.5 would certainly be easiest from a practical point of view, as well as less confusing, but it wouldn't technically be correct. I will leave it up to the team to decide what to do, but thoughts are welcome.

silcn commented 1 year ago

On closer inspection, there is a larger Austrian branch of XBB.1 with C18591A, so the 3-sequence Austrian cluster actually belongs there and is being misplaced by Usher; I may propose it if it grows or appears outside Austria. The Malaysia/Singapore and Australia branches are correctly placed (there are more XBB.1 sequences with C28312T but they're scattered across the tree).

silcn commented 1 year ago

On even closer inspection, that 4-sequence slightly longer branch from Malaysia and Japan in the Usher tree above is actually a recombinant! It has picked up a bit of some BA.2.75 sublineage in ORF1a; first breakpoint is 406-3795 and second is 5184-12443.
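
One way to read the two breakpoint intervals is as a map from genome position to likely parent. The sketch below assumes that each interval's endpoints are the flanking informative sites (so the endpoints belong to the adjacent parent and positions strictly inside an interval are undetermined); that boundary convention and the function name `segment_origin` are assumptions, not something stated in the comment.

```python
# Breakpoint intervals quoted in the comment (1-based genome coordinates).
BREAKPOINT_1 = (406, 3795)
BREAKPOINT_2 = (5184, 12443)


def segment_origin(pos: int) -> str:
    """Return the presumed parent for a genome position (assumed convention)."""
    lo1, hi1 = BREAKPOINT_1
    lo2, hi2 = BREAKPOINT_2
    if pos <= lo1 or pos >= hi2:
        return "XBB.1"        # outside both breakpoints
    if hi1 <= pos <= lo2:
        return "BA.2.75*"     # donor segment in ORF1a
    return "ambiguous"        # inside a breakpoint interval
```

Under this reading, only mutations between positions 3795 and 5184 can be confidently attributed to the BA.2.75 sublineage.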

FedeGueli commented 1 year ago

@silcn could you write a proposal for that, so it will be easier to monitor? I didn't quite get which one you mean: the one at the top of the tree? Thanks!

corneliusroemer commented 1 year ago

@silcn raised great points here that I've also been pondering again lately.

I favour broadening XBB.1.5 to not require T17124C for three reasons:

  1. There is no clear evidence that T17124C arose before S:F486P. Any XBB.1 sequences we find with T17124C but not S:F486P may well be explained by dropout, so this question can't be answered now and probably never will be, even with much more data.
  2. If we don't broaden XBB.1.5, we would have to start labelling all those small lineages as XBB.1.N, and we would always miss some, which is tedious. It also doesn't help people to have to remember which of the XBB.1.N have 486P and which don't. It would be a mess with no benefit.
  3. Broadening has essentially no bad side effects. T17124C is a synonymous mutation, so any conclusions reached by lab research using the previous definition of XBB.1.5 are still valid.

It just seems pragmatic to broaden. Do you agree @silcn @AngieHinrichs @thomasppeacock @chrisruis @InfrPopGen?
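
The practical effect of broadening can be illustrated by checking the sibling clusters described earlier in the thread against both definitions. This is a rough sketch only: it assumes every sample is already on an XBB.1 background, lists only the extra mutations on top of XBB.1, and the names `OLD_DEFINITION`, `BROAD_DEFINITION`, and `is_xbb_1_5` are hypothetical.

```python
# Defining mutations on top of XBB.1 under each scheme.
OLD_DEFINITION = frozenset({"S:F486P", "T17124C"})
BROAD_DEFINITION = frozenset({"S:F486P"})


def is_xbb_1_5(mutations: set[str], definition: frozenset[str]) -> bool:
    """A sample matches when it carries every defining mutation."""
    return definition <= mutations


# Extra mutations per cluster, taken from the thread.
clusters = {
    "canonical XBB.1.5": {"S:F486P", "T17124C"},
    "Malaysia/Singapore": {"S:F486P", "A20379G", "C28312T"},
    "Australia": {"S:F486P", "A4758C", "T18126A"},
}

for name, muts in clusters.items():
    old = is_xbb_1_5(muts, OLD_DEFINITION)
    broad = is_xbb_1_5(muts, BROAD_DEFINITION)
    print(f"{name}: old={old}, broadened={broad}")
```

Under the old definition each sibling cluster would need its own XBB.1.N name; under the broadened one they all fall inside XBB.1.5.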

This is one nice lineage that is currently XBB.1 and would become part of XBB.1.5 (with A4758C, T18126A rather than T17124C)

image

Here's another one with C2710T rather than T17124C

image

thomasppeacock commented 1 year ago

I agree, think this simplifies the lineage and future offshoots

corneliusroemer commented 1 year ago

Another lineage that would become XBB.1.5 is this green one at the top, that may well be the XBB.1* donor of XBL (in yellow)

image

corneliusroemer commented 1 year ago

@AngieHinrichs it would be great to have your view on this as well!

AnonymousUserUse commented 1 year ago

@corneliusroemer Why do you think there is no evidence that XBB.1 acquired T17124C before S:F486P, but insist in https://github.com/cov-lineages/pango-designation/issues/1602 that XBB.1.9 acquired C11956T before S:F486P? Are there any differences between them? Thanks.

AngieHinrichs commented 1 year ago

I'm in favor of broadening XBB.1.5 to not require T17124C, only T23018C (S:F486P). lineage_notes.txt does not mention T17124C, only S:F486P:

XBB.1.5 USA, S:F486P
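
The link between T23018C and S:F486P can be checked with a little codon arithmetic. The sketch below assumes the spike ORF starts at position 21563 of the Wuhan-Hu-1 reference, that the reference codon for F486 is TTT, and that XBB already carries S:F486S via T23019C; these are assumptions brought in for illustration and are not stated in the thread. The codon table is deliberately partial.

```python
SPIKE_START = 21563  # assumed first nucleotide of the spike ORF (Wuhan-Hu-1)

# Partial codon table, enough for this example only.
CODON_TABLE = {
    "TTT": "F", "TTC": "F",
    "TCT": "S", "TCC": "S",
    "CCT": "P", "CCC": "P",
    "CTT": "L", "CTC": "L",
}


def spike_codon(pos: int) -> tuple[int, int]:
    """Map a genome position to (spike codon number, index within codon)."""
    offset = pos - SPIKE_START
    return offset // 3 + 1, offset % 3


def substitute(codon: str, index: int, ref: str, alt: str) -> str:
    """Apply a single-nucleotide substitution, checking the expected base."""
    if codon[index] != ref:
        raise ValueError(f"expected {ref} at index {index}, found {codon[index]}")
    return codon[:index] + alt + codon[index + 1:]


reference_codon = "TTT"  # assumed reference codon for F486
xbb_codon = substitute(reference_codon, spike_codon(23019)[1], "T", "C")
xbb_1_5_codon = substitute(xbb_codon, spike_codon(23018)[1], "T", "C")

print(spike_codon(23018), CODON_TABLE[xbb_codon], CODON_TABLE[xbb_1_5_codon])
```

Under these assumptions, T23018C yields proline at position 486 only on a background that already has the second codon base changed; applied to the reference codon alone it would give leucine.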

corneliusroemer commented 1 year ago

> I'm in favor of broadening XBB.1.5 to not require T17124C, only T23018C (S:F486P). lineage_notes.txt does not mention T17124C, only S:F486P

Great, thanks @AngieHinrichs, then let's do that!

Just as an aside, I don't usually mention nucleotide mutations, but when I first designated it, it included only sequences with T17124C. Functionally it shouldn't matter much, and it simplifies our lives a lot.

@AnonymousUserUse the difference is that XBB.1.9.1/2 appear to be distinct, with feasibly 3 mutations separating them. It's unlikely that 486P would have occurred and then we'd see only two lineages with 1/2 extra mutations respectively. The few sequences with fewer nucs are possibly artefacts.

We haven't seen many XBB.1 with T17124C and just lacking 486P, hence the lack of evidence.

image

Ultimately when there is no clear right/wrong it's a question of what is most practical. We could create lots of XBB.1.N but that would hardly help as I argued above.

Meanwhile, there seems to be no similar risk of having a dozen XBB.1.9.N as so far there are just 2 clear clusters.

FedeGueli commented 1 year ago

@corneliusroemer @AngieHinrichs @thomasppeacock @InfrPopGen while looking at the big XBB.1+486P tree, it seems a bit messy at first glance to understand why one lineage is XBB.1.5 and another is not. Maybe we should step back and rename all the S:F486P sequences without T17124C as XBB.1.X (and not XBB.1.5.X)?

image: Screenshot 2023-03-23 at 11.26.30