sars-cov-2-variants / lineage-proposals

Repository to propose and discuss lineages

KP.2.3 (S:∆S31) + S:S155N (glycan) [25 seq; USA, Italy; Aug 6] #1759

Open ryhisner opened 1 month ago

ryhisner commented 1 month ago

Description
Sub-lineage of: KP.2.3
Earliest sequence: 2024-5-7, Italy (EPI_ISL_19198507)
Most recent sequence: 2024-6-28, USA, California (EPI_ISL_19242617)
Continents circulating: North America (10), Europe (1)
Countries circulating: USA (10), Italy (1)
Number of Sequences: 11
GISAID Nucleotide Query: C2536T, C29666A or G22026A, C29666A
CovSpectrum Query: Nextcladepangolineage: KP.2.3* & C2536T & [1-of: G22026A, C29666A]
Substitutions on top of KP.2.3:
Spike: S155N
Nucleotide: C2536T, G22026A, C29666A

USHER Tree https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons2/main/KP.2.3_S155N.json?c=gt-S_155&gmax=25384&gmin=21563&label=id:node_7032356

image

Evidence
S:S155N would not normally form a glycan, but it does in the context of S:F157S. The S155N + F157S glycan-creating combo has appeared in only 22 sequences ever. Eleven of those are in this lineage, and nine others are in various JN.1* lineages. The remaining two come from a couple of impressive BA.2 chronic-infection sequences collected in May in Alberta, Canada.

As you can see, 10 of the 11 sequences here also have N:G200D. 9/11 seqs are from California, one from Ohio, and one from Italy.
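
The S155N + F157S argument in the evidence above can be checked mechanically. Below is a minimal Python sketch, not part of the original proposal: it applies substitutions to a three-residue window and tests whether an N-X-S/T motif (X not P) appears. The helper names are hypothetical. S155 and F157 follow from the mutation names, while E at position 156 is an assumption about the reference spike that the thread does not state.

```python
import re

# N-linked glycosylation sequon: N, then any residue except P, then S or T.
SEQUON = re.compile(r"N[ACDEFGHIKLMNQRSTVWY][ST]")


def apply_substitutions(window, substitutions):
    """Apply substitutions such as 'S155N' to a {position: residue} window."""
    mutated = dict(window)
    for sub in substitutions:
        ref, pos, alt = sub[0], int(sub[1:-1]), sub[-1]
        if mutated.get(pos) != ref:
            raise ValueError(f"{sub}: expected {ref} at {pos}, found {mutated.get(pos)}")
        mutated[pos] = alt
    return mutated


def creates_sequon(window, substitutions, start):
    """True if the triplet starting at `start` is a sequon only after the substitutions."""
    mutated = apply_substitutions(window, substitutions)
    before = "".join(window[p] for p in range(start, start + 3))
    after = "".join(mutated[p] for p in range(start, start + 3))
    return SEQUON.fullmatch(before) is None and SEQUON.fullmatch(after) is not None


# Reference residues at spike 155-157.
# ASSUMPTION: E at 156 is not given in the thread; S155 and F157 come from the mutation names.
REFERENCE_155_157 = {155: "S", 156: "E", 157: "F"}

if __name__ == "__main__":
    for subs in (["S155N"], ["F157S"], ["S155N", "F157S"]):
        print(subs, creates_sequon(REFERENCE_155_157, subs, 155))
```

Under that assumption, neither substitution alone yields a motif at 155-157, while the pair gives N-E-S. A motif match only says a glycan is possible; it does not say the site is actually glycosylated.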

Genomes

EPI_ISL_19198507, EPI_ISL_19212260, EPI_ISL_19212280, EPI_ISL_19212282, EPI_ISL_19242617, EPI_ISL_19260851, EPI_ISL_19260856, EPI_ISL_19273257, EPI_ISL_19273390, EPI_ISL_19273392-19273393

aviczhl2 commented 1 month ago

Are there some simple criteria for glycans so that I can merge them into my automated lineage analysis tool?

FedeGueli commented 1 month ago

> Are there some simple criteria for glycans so that I can merge them into my automated lineage analysis tool?

N-(all but P)-S/T predicts an N-glycan, but it needs to be in an exposed area of the spike.

aviczhl2 commented 1 month ago

> > Are there some simple criteria for glycans so that I can merge them into my automated lineage analysis tool?
>
> N-(all but P)-S/T predicts an N-glycan, but it needs to be in an exposed area of the spike.

Do you mean three neighboring spike residues being 1=N, 2=all but P, 3=S/T?

What is an "exposed area"?

FedeGueli commented 1 month ago

> > > Are there some simple criteria for glycans so that I can merge them into my automated lineage analysis tool?
> >
> > N-(all but P)-S/T predicts an N-glycan, but it needs to be in an exposed area of the spike.
>
> Do you mean three neighboring spike residues being 1=N, 2=all but P, 3=S/T?
>
> What is an "exposed area"?

N-X(not P)-S or T

By "exposed" I mean not internal but on the surface (on a loop, on a stem, not buried).
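
For an automated tool, the N-X(not P)-S or T rule can be sketched in Python as below. This is only an illustration, not an existing tool's method. The function names and the `exposed_positions` parameter are hypothetical. The set of surface positions has to be supplied by the caller, since the thread gives no such table. The comparison also assumes the reference and variant spike sequences have identical numbering (no unaligned indels).

```python
import re

# Zero-width lookahead so that overlapping motifs (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[ACDEFGHIKLMNQRSTVWY][ST])")


def find_sequons(protein, exposed_positions=None):
    """Return 1-based positions of the N in every N-X-S/T motif (X is not P).

    If exposed_positions (a caller-supplied set of 1-based positions) is given,
    only motifs whose N lies in that set are kept.
    """
    hits = [m.start() + 1 for m in SEQUON.finditer(protein)]
    if exposed_positions is not None:
        hits = [p for p in hits if p in exposed_positions]
    return hits


def gained_sequons(reference, variant, exposed_positions=None):
    """Motif positions present in the variant but absent from the reference.

    ASSUMPTION: both sequences use the same residue numbering.
    """
    before = set(find_sequons(reference, exposed_positions))
    return [p for p in find_sequons(variant, exposed_positions) if p not in before]


if __name__ == "__main__":
    # Toy peptides, not real spike sequence.
    toy_reference = "KSEFRNPTA"  # N-P-T is excluded because X is P
    toy_variant = "KNESRNPTA"    # N-E-S appears at position 2
    print(gained_sequons(toy_reference, toy_variant))
```

The regular expression covers only the motif. Whether a site is surface-exposed still has to come from structural data, so the filter is kept as a separate, optional input.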

aviczhl2 commented 1 month ago

> > > > Are there some simple criteria for glycans so that I can merge them into my automated lineage analysis tool?
> > >
> > > N-(all but P)-S/T predicts an N-glycan, but it needs to be in an exposed area of the spike.
> >
> > Do you mean three neighboring spike residues being 1=N, 2=all but P, 3=S/T? What is an "exposed area"?
>
> N-X(not P)-S or T
>
> By "exposed" I mean not internal but on the surface (on a loop, on a stem, not buried).

Which positions are on the surface and which are buried? Do I need to implement something like AlphaFold to find them?

aviczhl2 commented 1 month ago

> By "exposed" I mean not internal but on the surface (on a loop, on a stem, not buried).

I guess the best way is to find a table of "on the surface" positions for the major variants. Where can I find such a table?

FedeGueli commented 1 month ago

+1 California

FedeGueli commented 1 month ago

16, but only one from July. Closing it then.

aviczhl2 commented 2 weeks ago

25 now

FedeGueli commented 2 weeks ago

3 recent samples, reopening for a while; likely not fast.