cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

Sublineage of BA.5.6 with S:R346E (34 seq, 12 countries) #1191

Closed ryhisner closed 2 years ago

ryhisner commented 2 years ago

Description

Sub-lineage of: BA.5.6
Earliest sequence: 2022-8-25, England (EPI_ISL_14858696)
Most recent sequence: 2022-9-28, Denmark (EPI_ISL_15267310)
Countries circulating: Austria (9), Denmark (2), USA (2), Czech Republic (1), England, Germany (1), Portugal (1), Switzerland (1)
Number of Sequences: 20 (including nine spike-only sequences from Austria)
GISAID Query: Spike_R346E, NSP16_T140I
CovSpectrum Query: Nextcladepangolineage:BA.5.6* & S:R346E
Substitutions on top of BA.5.6:
Spike: R346E
Nucleotide: T979A, A22598G, G22599A
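As a rough check on how the listed nucleotide substitutions relate to S:R346E, the sketch below maps genome coordinates to spike codon numbers. It assumes the Wuhan-Hu-1 reference (NC_045512.2), in which the spike ORF spans positions 21563 to 25384; the helper name `spike_codon` is hypothetical and not part of any pango tooling.

```python
# Assumption: coordinates follow the Wuhan-Hu-1 reference (NC_045512.2),
# where the spike ORF runs from nucleotide 21563 to 25384 (1-based, inclusive).
SPIKE_START = 21563
SPIKE_END = 25384


def spike_codon(nt_pos):
    """Return (codon number, position within codon) for a genome coordinate.

    Both values are 1-based. Returns None when the coordinate lies outside spike.
    """
    if not SPIKE_START <= nt_pos <= SPIKE_END:
        return None
    offset = nt_pos - SPIKE_START
    return offset // 3 + 1, offset % 3 + 1


# The three nucleotide substitutions listed in the description.
for mutation in ("T979A", "A22598G", "G22599A"):
    position = int(mutation[1:-1])  # strip the reference and alternate bases
    print(mutation, spike_codon(position))
```

Under these assumptions, A22598G and G22599A both fall in spike codon 346 (first and second codon positions), while T979A lies outside spike.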

USHER Tree https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons/main/subtreeAuspice1_genome_553f_bfc60%E2%80%94BA.5.6%2BR346E.json

image

Evidence

S:346 has proven to be one of the most important RBD sites for immune evasion. As at most RBD sites, many theoretically possible mutations involve such a large reduction in ACE2 binding strength or RBD expression (a proxy for stability) that they are untenable, as can be seen in the Bloom Lab RBD ACE2 Heat Map. https://jbloomlab.github.io/SARS-CoV-2-RBD_DMS_Omicron/RBD-heatmaps/

Using the Bloom Lab RBD Heat Map figures for a BA.2 background, I ranked the ACE2 binding strength and RBD expression scores for each possible R346 mutation, from best to worst. Unsurprisingly, R346T comes in first for both categories. R346K ranks high for both, but since arginine and lysine have similar properties, including being positively charged, it seems likely that R346K is one of the weakest mutations in terms of antibody evasion. Of the 19 possible AA substitutions at S:R346, R346E ranks 4th-highest for both ACE2 affinity and RBD expression.

image

Furthermore, since glutamic acid is negatively charged, it seems likely to be one of the best substitutions in terms of immune evasion. Indeed, the recent study by Yunlong Cao and Fanchong Jian ranked R346E as the top mutation in terms of evading neutralizing antibodies in convalescent plasma from BA.1, BA.2, and BA.5 infections.
https://www.biorxiv.org/content/10.1101/2022.09.15.507787v3

image

R346E is a two-nucleotide mutation, which is the most likely reason we have not seen it in any previous lineage thus far. But as immune evasion becomes increasingly important in exerting selection pressure, such rare, two-nucleotide mutations become less unlikely than before.
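The two-nucleotide point can be illustrated with a minimal sketch that counts the fewest base changes needed to turn the reference codon into any codon for a given amino acid. The first two bases of the codon (A at 22598, G at 22599) come from the substitutions listed in the description; the third base being A (codon AGA) is an assumption based on the Wuhan-Hu-1 reference, and `min_nt_changes` is a hypothetical helper name.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def min_nt_changes(ref_codon, target_aa):
    """Fewest single-base changes from ref_codon to any codon encoding target_aa."""
    return min(
        sum(a != b for a, b in zip(ref_codon, codon))
        for codon, aa in CODON_TABLE.items()
        if aa == target_aa
    )


# Assumption: spike codon 346 is AGA in the reference (third base not given in the issue).
REF_CODON = "AGA"

# Apply A22598G (codon position 1) and G22599A (codon position 2).
mutant = list(REF_CODON)
mutant[0] = "G"
mutant[1] = "A"
mutant_codon = "".join(mutant)
print(REF_CODON, CODON_TABLE[REF_CODON], "->", mutant_codon, CODON_TABLE[mutant_codon])

# Compare the minimum number of base changes for a few substitutions at this site.
for aa in "KTE":
    print(f"R346{aa}: at least {min_nt_changes(REF_CODON, aa)} nucleotide change(s)")
```

Under this assumption, R346K and R346T are each reachable by a single base change, whereas every glutamic acid codon differs from AGA at two or more positions.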

image

Given its international spread and recent rapid emergence, I think this lineage should be followed closely.

Genomes

EPI_ISL_14858696, EPI_ISL_14941950, EPI_ISL_15157357, EPI_ISL_15161676, EPI_ISL_15173496, EPI_ISL_15196538, EPI_ISL_15196903, EPI_ISL_15208510, EPI_ISL_15213255, EPI_ISL_15235362, EPI_ISL_15258381, EPI_ISL_15265092, EPI_ISL_15267310, EPI_ISL_15288725, EPI_ISL_15288727, EPI_ISL_15288783, EPI_ISL_15288833, EPI_ISL_15289122, EPI_ISL_15289732, EPI_ISL_15290919

silcn commented 2 years ago

Just to note that while R346E isn't in any designated lineage, it was seen before in the small 2nd gen BA.1.1 cluster mentioned in #588, and I made some similar observations about the basic-to-acidic switch at the time.

corneliusroemer commented 2 years ago

Thanks. While small and defined by only a single mutation, this may be worth designating because the mutation is interesting.

ryhisner commented 2 years ago

Seven sequences of this were uploaded today, five of them spike-only from Austria. Assuming the Austrian spike-only sequences are all a part of this lineage, as seems all but certain, this is now up to 34 sequences from 12 different countries (Israel, Austria, Czech Republic, Denmark, Germany, Italy, Norway, Portugal, Switzerland, England, USA, Australia).

InfrPopGen commented 2 years ago

Thanks for submitting. We've added lineage BA.5.6.4 with 5 newly designated sequences, and 0 updated designations. Defining mutations A22598G (S:R346E), nt:G22599A.

icestorm972 commented 1 year ago

FYI: In Germany, the RKI DESH data now contains 20 sequences as of today. Figure 2022-11-08 205656 (all the dark pink ones on the left side are classified by Nextclade as BA.5.6.4)