sars-cov-2-variants / lineage-proposals

Repository to propose and discuss lineages

KP.2.3 (S:∆S31) + ORF1a:R1973S (56 seq, 16 countries, 5 continents; June 26) #1594

Closed ryhisner closed 5 months ago

ryhisner commented 5 months ago

Description
Sub-lineage of: KP.2.3 (= JN.1 + S:∆S31, S:H146Q, S:R346T, F456L, V1104L + ORF3a:K67N + ORF1a:T2283I)
Earliest sequence: 2024-3-12, France (travel to USA, California), EPI_ISL_18999293
Most recent sequence: 2024-6-17, Ireland, EPI_ISL_19210778
Continents circulating: Europe (14), Asia (29), North America (10), Oceania (3)
Countries circulating: Singapore (20), France (5), USA (5), Malaysia (4), Australia (3, different provinces), Ireland (4), England (3), India (2, 1 travel to USA), Israel (2), Scotland (2), Canada (1), Denmark (1), Japan (1), Qatar (1, travel to USA), South Africa (1), South Korea (1)
Number of Sequences: 49
GISAID Nucleotide Query: A6183G, A6184C, G25593T
CovSpectrum Query: Nextcladepangolineage:KP.2.3* & A6184C
Substitutions on top of KP.2.3:
ORF1a: R1973S
Nucleotide: A6184C, A29647G (3' UTR)
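For anyone scripting against these queries, here is a minimal sketch of how the nucleotide substitution tokens above could be parsed and recombined. The names `NucSub`, `parse_query`, and `covspectrum_query` are hypothetical helpers made up for illustration; the last one only formats a string in the same shape as the CovSpectrum query text in this post and does not call any CovSpectrum or GISAID API.

```python
import re
from typing import List, NamedTuple


class NucSub(NamedTuple):
    """One nucleotide substitution such as A6184C (hypothetical helper type)."""
    ref: str
    pos: int
    alt: str


# Reference base, 1-based genome position, alternate base.
_NUC_SUB = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_nuc_sub(token: str) -> NucSub:
    """Parse a single token like 'A6184C'; reject anything else."""
    match = _NUC_SUB.match(token.strip().upper())
    if match is None:
        raise ValueError(f"not a nucleotide substitution: {token!r}")
    return NucSub(match.group(1), int(match.group(2)), match.group(3))


def parse_query(query: str) -> List[NucSub]:
    """Split a comma-separated query string into parsed substitutions."""
    return [parse_nuc_sub(tok) for tok in query.split(",") if tok.strip()]


def covspectrum_query(lineage: str, subs: List[NucSub]) -> str:
    """Format a query string in the same shape as the one quoted in this post."""
    parts = [f"Nextcladepangolineage:{lineage}*"]
    parts += [f"{s.ref}{s.pos}{s.alt}" for s in subs]
    return " & ".join(parts)


subs = parse_query("A6183G, A6184C, G25593T")
query = covspectrum_query("KP.2.3", [subs[1]])
```

Amino acid labels such as `R1973S` are deliberately rejected by `parse_nuc_sub`, since they use a different alphabet and a different coordinate system.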

USHER Tree https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons2/main/KP.2.3_S31del_R1973S_16.json?c=gt-nuc_6184&gmax=7184&gmin=5184&label=id:node_2374230

image

Evidence
This lineage, despite only consisting of 15 sequences, is present in 11 countries and four continents. The travel sequence originating in India could be particularly meaningful. (I am excluding the pooled Ginkgo Bozoworks sequences in these numbers, which are almost always duplicates and invariably of poor quality.)

This is the second nucleotide mutation in the ORF1a:1973 codon in this lineage (A6184C, following A6183G). ORF1a:K1973R is common to all BA.2.86.1 lineages (but not other BA.2.86 lineages, which have mostly died off). ORF1a:K1973R is the only amino acid difference between BA.2.86.1 and the other BA.2.86 sublineages, so it may have been relevant to BA.2.86.1's faster growth and subsequent global dominance over those competing sublineages.
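The two-step change at this codon can be traced with a short sketch. It assumes that ORF1a begins at nucleotide 266 in the Wuhan-Hu-1 reference (NC_045512.2), and it infers the reference codon as AAA from the reference bases in A6183G and A6184C together with the lysine at K1973; neither value is stated in this post. The function names are hypothetical and the codon table holds only the three codons needed here.

```python
# Assumption: first base of ORF1a in Wuhan-Hu-1 (NC_045512.2) coordinates.
ORF1A_START = 266

# Only the codons needed for this walk-through (standard genetic code).
CODON_TO_AA = {"AAA": "K", "AGA": "R", "AGC": "S"}


def locate(pos, orf_start=ORF1A_START):
    """Return (1-based codon number, 0-based index within the codon)."""
    offset = pos - orf_start
    return offset // 3 + 1, offset % 3


def apply_sub(codon, pos, alt):
    """Return the codon after substituting `alt` at genome position `pos`."""
    _, idx = locate(pos)
    return codon[:idx] + alt + codon[idx + 1:]


# Inferred reference codon for ORF1a:1973 (nucleotides 6182-6184).
ref_codon = "AAA"

after_first = apply_sub(ref_codon, 6183, "G")     # A6183G
after_second = apply_sub(after_first, 6184, "C")  # A6184C

path = [CODON_TO_AA[c] for c in (ref_codon, after_first, after_second)]
# path is ["K", "R", "S"]: K1973R first, then R1973S on top of it.
```

Under these assumptions both substitutions land in codon 1973, at its second and third positions respectively, which is why A6184C reads as R1973S only on a background that already carries A6183G.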

ORF1a:1973 is NSP3_1155, which is in the nucleic-acid binding (NAB) domain of NSP3. The ultimate function of NAB is not well understood, but it is known to bind nucleic acids, preferring triple-G nucleotide motifs, also a characteristic of Mac3 (SUD-M) and DPUP (SUD-C), two other NSP3 domains. ORF1a:1973 is located 7 AA upstream of consecutive lysine residues known to be key to NAB's nucleic acid-binding activity and could therefore conceivably modulate the primary known property of NAB. The information above is from the paper "Nsp3 of coronaviruses: Structures and functions of a large multi-domain protein" by Jian Lei, Yuri Kusov, and Rolf Hilgenfeld, and is based on the SARS-CoV-1 NAB structure, which seems likely to very closely resemble the SARS-CoV-2 structure.
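The ORF1a-to-NSP3 renumbering used above is a fixed offset, sketched below. The start residue of 819 is an assumption: it is the value implied by the ORF1a:1973 = NSP3_1155 equivalence in this post, not something read from an annotation file here, and the function names are hypothetical.

```python
# Assumption: NSP3 begins at ORF1a residue 819, as implied by
# ORF1a:1973 == NSP3_1155 in the text above.
NSP3_START_IN_ORF1A = 819


def orf1a_to_nsp3(orf1a_pos):
    """Convert a 1-based ORF1a residue number to NSP3 numbering."""
    nsp3_pos = orf1a_pos - NSP3_START_IN_ORF1A + 1
    if nsp3_pos < 1:
        raise ValueError(f"ORF1a:{orf1a_pos} lies before NSP3")
    return nsp3_pos


def nsp3_to_orf1a(nsp3_pos):
    """Convert a 1-based NSP3 residue number back to ORF1a numbering."""
    return nsp3_pos + NSP3_START_IN_ORF1A - 1


nsp3_pos = orf1a_to_nsp3(1973)
```

This sketch does not check the upper end of NSP3, so residues beyond the protein's C-terminus would still be converted without complaint.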

Notably, K1973R is a reversion to the SARS-1/Bat-CoV residue. JN.1 lineages feature a striking number of such "reversions" to either the Bat-CoV (BC) residue, the SARS-CoV-1 (S1) residue or both, particularly in spike, including the N30 glycan (in the rising S:∆S31 lineages), S50L, K356T, R403K, N440K, N460K, L455S, F456L, P621S, and D796Y. Non-spike "reversions" or near-reversions to the SARS-CoV-1 (S1) and/or Bat-CoV (BC) AA residues include M:T30A, ORF1a:K1973R, ORF1a:A2710T (also in BA.1), and ORF1a:T4175I.

ORF1a:R1973S would therefore be a move away from the dominant SARS-1/Bat-CoV residue.

Genomes

EPI_ISL_18999293, EPI_ISL_19071309, EPI_ISL_19081749, EPI_ISL_19095987, EPI_ISL_19130848, EPI_ISL_19153297, EPI_ISL_19153979, EPI_ISL_19155431, EPI_ISL_19161406, EPI_ISL_19161657, EPI_ISL_19162457, EPI_ISL_19169585, EPI_ISL_19175928, EPI_ISL_19176426, EPI_ISL_19176768

FedeGueli commented 5 months ago

Great catch, thanks. It is always interesting when a defining mutation changes further, even if a single emergence is not enough to judge whether the original mutation was beneficial or deleterious.

FedeGueli commented 5 months ago

Notably, K1973R is a reversion to the SARS-1/Bat-CoV residue. JN.1 lineages feature a striking number of such "reversions" to either the Bat-CoV (BC) residue, the SARS-CoV-1 (S1) residue or both, particularly in spike, including the N30 glycan (in the rising S:∆S31 lineages), S50L, K356T, R403K, N440K, N460K, L455S, F456L, P621S, and D796Y. Non-spike "reversions" or near-reversions to the SARS-CoV-1 (S1) and/or Bat-CoV (BC) AA residues include M:T30A, ORF1a:K1973R, ORF1a:A2710T (also in BA.1), and ORF1a:T4175I.

Interestingly, they keep acquiring other reversions to the bat-CoV sequence, such as C22858T in the main branch of KP.3 (silent, so not sure if relevant).

ryhisner commented 5 months ago

With the big Singapore upload (and one more from Israel), this went from 15 sequences to 35 yesterday. Of the 19 new Singapore sequences, all were collected in May, and six were collected on May 27 or May 28.

New tree:

image
FedeGueli commented 1 month ago

Designated KP.2.3.11

via https://github.com/cov-lineages/pango-designation/commit/d39304704862930caed86d10b5785fa1131b2b17