cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

Omicron sublineage with potentially beneficial mutation S:346K #360

Closed corneliusroemer closed 2 years ago

corneliusroemer commented 2 years ago

Too early for a lineage designation but worth watching:

About 10% of Omicrons (47) have the mutation S:346K which has already been seen in Mu and B.1.640.

Using the available sequences from GISAID it seems possible that Omicrons with S:346K grow faster than Omicrons without.

image

https://nextstrain.org/groups/neherlab/ncov/21K-diversity

S:346 is also inferred to have a transmission advantage by @fritzo's pyro-cov: https://github.com/broadinstitute/pyro-cov/blob/master/paper/mutations.tsv#L281

zach-hensel commented 2 years ago

I think the lesson from Delta is that it's never too early to recognize diversity that exists early on. That also applies to the sub-sub-lineage with T716I (shared with Alpha and B.1.575, and apparently emerging within the last two weeks) and the other sub-lineage with A701V (shared with Beta and the majority of B.1.526, and emerging in several AY.x), if those are legit per the sequencing and case data.

Edit: Looking back at this now and noting that BA.1.1 + S:T716I remains rare.

fritzo commented 2 years ago

@zach-hensel agreed. We recently started using a finer clustering than PANGO lineages in our growth rate estimation tool, and found some PANGO lineages contain significant heterogeneity, with growth rate estimates of finer clusters within a given lineage varying by nearly a factor of two. We have not yet run on Omicron data.

[![lineage_heterogeneity](https://user-images.githubusercontent.com/648532/144757667-5dab07da-b725-4e31-9d79-3c1ea62856ea.png)](https://github.com/broadinstitute/pyro-cov/blob/master/paper/lineage_heterogeneity.png)

georgimarinov commented 2 years ago

B.1.640 has R346S (and so do a few other minor ones such as C.36.3), not R346K, but that also seems to be beneficial. I have always been puzzled by the R->K change: it isn't that much of a difference, while at the same time you have R->S in other lineages, which is a big difference, and yet both give it a boost. You see a lot of recurrent R346S in various Delta sublineages, and in other variants too.

FedeGueli commented 2 years ago

Update from Sergei Pond on S:346K (see the whole thread):

https://twitter.com/sergeilkp/status/1468702272272183300?s=20

FedeGueli commented 2 years ago

I noticed that the Sigal lab neutralization assay was done with the 346K mutation. Posting it here in case anyone is interested.

FedeGueli commented 2 years ago

This clade is growing almost everywhere.

Plots:

USA

![Screenshot_2021-12-28-13-09-31-782_com android chrome](https://user-images.githubusercontent.com/87669813/147599632-1dc39820-56d7-4c74-993a-d1761b8e0c54.jpg)

Switzerland

![VariantTimeDistributionChart (18)](https://user-images.githubusercontent.com/87669813/147599357-2c02a529-7143-4540-9cc0-bbceeeec4b13.png)

Denmark

![VariantTimeDistributionChart (19)](https://user-images.githubusercontent.com/87669813/147599558-ffa5035a-933f-4a94-9618-7a7d57835de5.png)

I think it is worth a designation @chrisruis

silcn commented 2 years ago

@FedeGueli every clade of Omicron is growing like this everywhere. It's not clear yet whether this is growing relative to the rest of Omicron.

corneliusroemer commented 2 years ago

Indeed @silcn, one would have to look at the fraction of this clade relative to BA.1.

AFAIK @chaoran-chen is working on this for covSpectrum.
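A minimal sketch of that comparison, assuming sequences have already been restricted to BA.1 and reduced to `(week, has_346k)` pairs. The input format, the function names, and the toy counts are all hypothetical and are not taken from GISAID or covSpectrum. It computes the weekly share of 346K within BA.1 and the least-squares slope of the log-odds of that share, so a positive slope suggests the clade is gaining relative to the rest of BA.1 rather than just riding the overall Omicron wave.

```python
from math import log

def weekly_share(records):
    """Map week -> fraction of BA.1 sequences carrying S:346K.

    `records` is an iterable of (week, has_346k) pairs, already
    restricted to BA.1 sequences (hypothetical input format).
    """
    totals, with_mut = {}, {}
    for week, has_346k in records:
        totals[week] = totals.get(week, 0) + 1
        with_mut[week] = with_mut.get(week, 0) + (1 if has_346k else 0)
    return {week: with_mut[week] / totals[week] for week in sorted(totals)}

def log_odds_slope(shares):
    """Least-squares slope of logit(share) against week.

    A positive slope means the 346K clade is gaining share within BA.1.
    Weeks with a share of exactly 0 or 1 are skipped (logit undefined).
    """
    points = [(w, log(s / (1 - s))) for w, s in shares.items() if 0 < s < 1]
    if len(points) < 2:
        raise ValueError("need at least two weeks with 0 < share < 1")
    n = len(points)
    mean_w = sum(w for w, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((w - mean_w) ** 2 for w, _ in points)
    sxy = sum((w - mean_w) * (y - mean_y) for w, y in points)
    return sxy / sxx

# Toy counts, invented for illustration only (not real surveillance data).
toy = [(0, True)] * 1 + [(0, False)] * 3 + [(1, True)] * 2 + [(1, False)] * 2
shares = weekly_share(toy)
slope = log_odds_slope(shares)
print(shares, slope)
```

With real data the counts per week and per country are small and unevenly sampled, so a slope like this is only a rough indicator and would need confidence intervals before drawing conclusions.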

shay671 commented 2 years ago

image

chrisruis commented 2 years ago

Looking at the latest data on covSpectrum, this clade has increased as a proportion of BA.1 in some countries but not in all. So it's not particularly conclusive, but I think it warrants a designation. We've therefore added this as BA.1.1 in v1.2.122, starting on the branch with G22599A (S:R346K).
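For readers mapping between the two notations, here is a small sketch of how the nucleotide change G22599A corresponds to spike codon 346. It assumes the spike ORF starts at position 21563 in the Wuhan-Hu-1 reference (NC_045512.2) and that the reference codon for R346 is `AGA`; the codon is passed in as an argument rather than read from a genome, and the function names and the partial codon table are illustrative only.

```python
S_START = 21563  # assumed 1-based start of the spike ORF in NC_045512.2

# Partial codon table: only the entries needed for this example.
CODONS = {"AGA": "R", "AGG": "R", "AAA": "K", "AAG": "K"}

def locate_in_spike(genome_pos):
    """Return (codon_number, position_in_codon), both 1-based."""
    offset = genome_pos - S_START
    if offset < 0:
        raise ValueError("position is upstream of the spike ORF")
    return offset // 3 + 1, offset % 3 + 1

def mutate_codon(codon, pos_in_codon, alt):
    """Replace the base at the 1-based position within a codon."""
    i = pos_in_codon - 1
    return codon[:i] + alt + codon[i + 1:]

codon_number, pos_in_codon = locate_in_spike(22599)
ref_codon = "AGA"  # assumed reference codon for R346
alt_codon = mutate_codon(ref_codon, pos_in_codon, "A")
print(f"S:{CODONS[ref_codon]}{codon_number}{CODONS[alt_codon]}")
```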

MCB6 commented 2 years ago

Just a warning that some labs are occasionally unable to call S:346K. In the US, there is an apparent regional anomaly: BA.1.1 is absent from the more recent samples from Massachusetts / New England. But that's only because the bulk of the regional sequencing (and nearly all faster-turnaround sequencing) is done by the Broad Institute, and its pipeline won't call R346K. The rest of the samples from the states where Broad draws samples are ~2/3 BA.1.1, as expected. Because Broad is responsible for 20-30% of the recent samples in the US, it distorts the nationwide stats as well, but the effect is much stronger locally. Check, for example, New Hampshire, where December genomes came from a variety of labs but January genomes came from Broad:

image

bwlang commented 2 years ago

I think that region of the S is going to be a problem for all common amplicon sets except VarSkip Long and VarSkip V2.

I'm trying to get to an update of primer-monitor.neb.com/lineages today to confirm with current variants. Meanwhile, here is the current view (orange indicates a variant overlapping the primer, vertical lines indicate S346). Too many fires these days! The VarSkip Long primers are not visible since the amplicon is 1.4 kb, but they are unaffected by variants so far.

image

Anecdotally, I see a mutation in that codon in 9/15 samples sequenced last week at NEB using VarSkip V2 (low-volume sequencing representing southern New Hampshire and Massachusetts), e.g.

image

arodzh-sudo commented 2 years ago

@bwlang are the VarSkip Short v2 primers commercially available?

bwlang commented 2 years ago

@arodzh-sudo Yes - they are not on the neb website yet (very soon) but we have completed large scale production and can ship them now if necessary.

dpark01 commented 2 years ago

@MCB6 thanks for flagging the issue and it's certainly something I want to pay attention to at the Broad.

I'll just note that the data I'm seeing, even from New Hampshire specifically, doesn't seem to line up with what you're seeing. Here is a plot of all Omicrons that the Broad has sequenced from New Hampshire, broken down by PANGO lineage (I chose NH because you showed a plot of NH). I had to go back and re-run the latest pangolin on all the old genomes, but the following plot is 100% called with pangolin: 3.1.19; pangolearn: 2022-01-20; constellations: v0.1.2; scorpio: 0.3.16; pango-designation used by pangoLEARN/Usher: v1.2.123 (pUSHER was used for calls). The Broad-only genomes show an increase in BA.1.1 in NH (and similar patterns throughout New England) right at the turn of the new year.

newplot (1)

If NEB varskip short 1a (used by the Broad) is vulnerable to amplicon dropouts on S:346, then either:

  1. there was an unusual reduction in amplicon dropout rates around the new year for some non-epidemiological reason
  2. or BA.1.1 has other SNPs in LD with S:346 that pUSHER is able to leverage to call BA.1.1 anyway, even in the absence of S:346

I haven't looked systematically at our genomes to see if the calls are correlated with coverage on Spike (which would be a bit of a wet lab mystery) or if these are being called despite the lack of coverage there (which would be phylogenetically interesting).
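One way to separate the two explanations above is to cross-tabulate lineage calls against read depth at position 22599. The sketch below assumes each genome has been reduced to a `(lineage, depth_at_22599)` pair; that input format, the depth threshold of 10, and the toy values are hypothetical and do not describe the Broad's pipeline. BA.1.1 calls in the "not covered" row would point toward placement from linked SNPs, while BA.1.1 calls appearing only in the "covered" row would point toward a change in dropout rates.

```python
from collections import Counter

MIN_DEPTH = 10  # assumed minimum depth for a usable base call at the site

def crosstab(samples, min_depth=MIN_DEPTH):
    """Count genomes by (site covered?, called BA.1.1?).

    `samples` is an iterable of (lineage, depth_at_22599) pairs
    (hypothetical input format).
    """
    table = Counter()
    for lineage, depth in samples:
        covered = depth >= min_depth
        is_ba11 = lineage == "BA.1.1"
        table[(covered, is_ba11)] += 1
    return table

# Toy values, invented for illustration only.
toy_samples = [
    ("BA.1.1", 250),
    ("BA.1.1", 0),
    ("BA.1", 3),
    ("BA.1", 180),
    ("BA.1.1", 95),
]
table = crosstab(toy_samples)
for covered in (True, False):
    row = "covered" if covered else "not covered"
    print(row, "BA.1.1:", table[(covered, True)], "other:", table[(covered, False)])
```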

MCB6 commented 2 years ago

@dpark01 So you think that the problem is specific to GISAID's lineage calls in BI's submissions? They don't have any NH BA.1.1's for this recent date range:

image

The only call with S:346K during this time frame is from MA: hCoV-19/USA/MA-CDCBI-CRSP_UUG4JE6ZCEKCDMZL/2022

LD variants are an interesting issue. covSpectrum has 65 LD (>0.05) calls for ~100,000 US BA.1.1's but, alas, they are all shared with BA.1.

MattBashton commented 2 years ago

I can't see any issues with calling S:R346K even on ARTIC V4, which this is an example of. The site doesn't overlap with any one primer specifically (see the track below the reads). The region with the dropout would be the adjacent amplicon 76 (a dropout which is now fixed in V4.1). This region is covered by amplicon 75, and 75_LEFT and 75_RIGHT don't appear to be having any issues with depth here:

Screenshot 2022-02-03 at 20 00 35