cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

KP.2.3 (S:∆S31) + ORF1a:R1973S (60 seq, 17 countries, 5 continents; June 28) #2668

Open ryhisner opened 4 days ago

ryhisner commented 4 days ago

Description Transferred from https://github.com/sars-cov-2-variants/lineage-proposals/issues/1594

Sub-lineage of: KP.2.3 (= JN.1 + S:∆S31, S:H146Q, S:R346T, F456L, V1104L + ORF3a:K67N + ORF1a:T2283I)
Earliest sequence: 2024-3-12, France (travel to USA, California), EPI_ISL_18999293
Most recent sequence: 2024-6-17, Ireland, EPI_ISL_19210778
Continents circulating: Europe (14), Asia (29), North America (10), Oceania (3)
Countries circulating: Singapore (20), France (5), USA (5), Malaysia (4), Australia (3, different provinces), Ireland (4), England (3), India (2, 1 travel to USA), Israel (2), Scotland (2), Canada (1), Denmark (1), Japan (1), Qatar (1, travel to USA), South Africa (1), South Korea (1)
Number of Sequences: 49
GISAID Nucleotide Query: A6183G, A6184C, G25593T
CovSpectrum Query: Nextcladepangolineage:KP.2.3* & A6184C
Substitutions on top of KP.2.3:
ORF1a: R1973S
Nucleotide: A6184C, A29647G (3' UTR)

USHER Tree https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons2/main/KP.2.3_S31del_R1973S_16.json?c=gt-nuc_6184&gmax=7184&gmin=5184&label=id:node_2374230

image

Evidence This lineage, despite only consisting of 15 sequences, is present in 11 countries and four continents. The travel sequence originating in India could be particularly meaningful. (I am excluding the pooled Ginkgo Bioworks sequences in these numbers, which are almost always duplicates and invariably of poor quality.)

This is the second nucleotide mutation in ORF1a:1973 in this lineage. ORF1a:K1973R is common to all BA.2.86.1 lineages (but not other BA.2.86 lineages, which have mostly died off). ORF1a:K1973R is the only amino acid difference between BA.2.86.1 and the other BA.2.86 sublineages, so it may have been relevant to BA.2.86.1's faster growth and subsequent global dominance over those competing sublineages.
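The codon arithmetic behind the two successive changes can be checked with a short sketch. It assumes ORF1a begins at nucleotide 266 of the Wuhan-Hu-1 reference (NC_045512.2) and that the reference codon at ORF1a:1973 is AAA (lysine), which is what the mutation names A6183G and A6184C imply. The function names are illustrative only.

```python
# Assumption: ORF1a starts at nucleotide 266 in the Wuhan-Hu-1 reference.
ORF1A_START = 266

# Only the three codons needed here, not a full translation table.
CODON_TO_AA = {"AAA": "K", "AGA": "R", "AGC": "S"}


def codon_span(aa_pos, orf_start=ORF1A_START):
    """Return the 1-based (first, last) nucleotide positions of a codon."""
    first = orf_start + 3 * (aa_pos - 1)
    return first, first + 2


def apply_substitution(codon, first_nt, mutation):
    """Apply a nucleotide substitution such as 'A6183G' to a codon string."""
    ref, pos, alt = mutation[0], int(mutation[1:-1]), mutation[-1]
    offset = pos - first_nt
    if not 0 <= offset < 3:
        raise ValueError(f"{mutation} lies outside the codon")
    if codon[offset] != ref:
        raise ValueError(f"{mutation} expects {ref}, found {codon[offset]}")
    return codon[:offset] + alt + codon[offset + 1:]


first, last = codon_span(1973)
ancestral = "AAA"  # assumed reference codon (K1973)
ba2861 = apply_substitution(ancestral, first, "A6183G")  # K1973R
proposed = apply_substitution(ba2861, first, "A6184C")   # R1973S

for label, codon in [("reference", ancestral),
                     ("BA.2.86.1", ba2861),
                     ("proposed", proposed)]:
    print(label, codon, CODON_TO_AA[codon])
```

Under these assumptions the codon spans nucleotides 6182-6184, and the path is AAA (K) to AGA (R) to AGC (S), consistent with K1973R followed by R1973S.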

image

ORF1a:1973 is NSP3_1155, which is in the nucleic-acid binding (NAB) domain of NSP3. The ultimate function of NAB is not well understood, but it is known to bind nucleic acids, preferring triple-G nucleotide motifs, also a characteristic of Mac3 (SUD-M) and DPUP (SUD-C), two other NSP3 domains. ORF1a:1973 is located 7 AA upstream of consecutive lysine residues known to be key to NAB's nucleic acid-binding activity and could therefore conceivably modulate the primary known property of NAB. The information above is from the paper "Nsp3 of coronaviruses: Structures and functions of a large multi-domain protein" by Jian Lei, Yuri Kusov, and Rolf Hilgenfeld, and is based on the SARS-CoV-1 NAB structure, which seems likely to very closely resemble the SARS-CoV-2 structure.
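For readers converting between the two numbering schemes, here is a minimal sketch. It assumes NSP3 begins at ORF1a residue 819 in the SARS-CoV-2 reference polyprotein; the helper names are hypothetical.

```python
# Assumption: NSP3 begins at ORF1a residue 819 in the reference polyprotein.
NSP3_START_IN_ORF1A = 819


def orf1a_to_nsp3(orf1a_pos):
    """Convert an ORF1a residue number to NSP3-local numbering."""
    if orf1a_pos < NSP3_START_IN_ORF1A:
        raise ValueError("position lies upstream of NSP3")
    return orf1a_pos - NSP3_START_IN_ORF1A + 1


def nsp3_to_orf1a(nsp3_pos):
    """Convert an NSP3-local residue number back to ORF1a numbering."""
    if nsp3_pos < 1:
        raise ValueError("NSP3 positions are 1-based")
    return nsp3_pos + NSP3_START_IN_ORF1A - 1


print(orf1a_to_nsp3(1973))  # 1155
print(nsp3_to_orf1a(1155))  # 1973
```

The sketch does not check the upper bound of NSP3, so it will not flag positions that fall in NSP4 or beyond.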

image

EDIT: I now realize the SARS-CoV-2 NAB structure is actually available. Here it is, with ORF1a:K1973 highlighted in yellow and K/R residues known to be important in RNA binding in green.

image

Notably, K1973R is a reversion to the SARS-1/Bat-CoV residue. JN.1 lineages feature a striking number of such "reversions" to either the Bat-CoV (BC) residue, the SARS-CoV-1 (S1) residue or both, particularly in spike, including the N30 glycan (in the rising S:∆S31 lineages), S50L, K356T, R403K, N440K, N460K, L455S, F456L, P621S, and D796Y. Non-spike "reversions" or near-reversions to the SARS-CoV-1 (S1) and/or Bat-CoV (BC) AA residues include M:T30A, ORF1a:K1973R, ORF1a:A2710T (also in BA.1), and ORF1a:T4175I.

image

ORF1a:R1973S would therefore be a move away from the dominant SARS-1/Bat-CoV residue.

Genomes

EPI_ISL_18999293, EPI_ISL_19071309, EPI_ISL_19081749, EPI_ISL_19095987, EPI_ISL_19130848, EPI_ISL_19153297, EPI_ISL_19153979, EPI_ISL_19155431, EPI_ISL_19161406, EPI_ISL_19161657, EPI_ISL_19162457, EPI_ISL_19169585, EPI_ISL_19175928, EPI_ISL_19176426, EPI_ISL_19176768, EPI_ISL_19180508, EPI_ISL_19180936, EPI_ISL_19180959, EPI_ISL_19181120, EPI_ISL_19181198, EPI_ISL_19181301, EPI_ISL_19181307, EPI_ISL_19181347, EPI_ISL_19181367, EPI_ISL_19181389, EPI_ISL_19181570, EPI_ISL_19181852, EPI_ISL_19182055, EPI_ISL_19182291, EPI_ISL_19182293, EPI_ISL_19182303-19182304, EPI_ISL_19182367, EPI_ISL_19182461, EPI_ISL_19184455, EPI_ISL_19186571, EPI_ISL_19188063, EPI_ISL_19189909, EPI_ISL_19191306, EPI_ISL_19191375, EPI_ISL_19192454, EPI_ISL_19194490, EPI_ISL_19195198, EPI_ISL_19197835, EPI_ISL_19201151, EPI_ISL_19208442, EPI_ISL_19210778, EPI_ISL_19210788, EPI_ISL_19215116, EPI_ISL_19215118, EPI_ISL_19215121-19215122, EPI_ISL_19216219, EPI_ISL_19216466, EPI_ISL_19216523, EPI_ISL_19216686
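The list above uses range shorthand such as EPI_ISL_19182303-19182304. A small sketch for expanding it into individual accession IDs, for anyone counting or cross-checking the genomes; the function name is hypothetical, and the sketch assumes every entry carries the EPI_ISL_ prefix.

```python
PREFIX = "EPI_ISL_"


def expand_accessions(text):
    """Expand a comma-separated accession list, including 'start-end' ranges."""
    ids = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate a trailing comma
        if not token.startswith(PREFIX):
            raise ValueError(f"unexpected accession format: {token!r}")
        body = token[len(PREFIX):]
        if "-" in body:
            start, end = (int(part) for part in body.split("-"))
            ids.extend(f"{PREFIX}{n}" for n in range(start, end + 1))
        else:
            ids.append(f"{PREFIX}{int(body)}")
    return ids


sample = "EPI_ISL_19182293, EPI_ISL_19182303-19182304, EPI_ISL_19182367,"
for accession in expand_accessions(sample):
    print(accession)
```

Calling `len(expand_accessions(...))` on the full list gives the number of individual genomes it covers.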
FedeGueli commented 4 days ago

High prevalence in Malaysia in June.