sars-cov-2-variants / lineage-proposals

Repository to propose and discuss lineages

GE.1.2 with S:A376S, S:R478T, ORF3a:S195C + ORF7a-7b-8 Deletion (≥27 seq, Kenya, England, USA, Belgium; Dec 11) #1103

Closed ryhisner closed 10 months ago

ryhisner commented 11 months ago

Description

Sub-lineage of: GE.1.2 (XBB.2.3.10.1.2 = XBB.2.3 + S:N148T, ∆N185, F186I, K478R + ORF1a:G661S, P1921S, V4369A + ORF7a:P99S)
Earliest sequence: 2023-9-13, Belgium
2nd Earliest sequence: 2023-10-12, Kenya
Most recent sequence: 2023-11-15, England
Continents circulating: Europe (7), Africa (3), North America (2)
Countries circulating: England (6), Kenya (3), USA (2), Belgium (1)
Number of Sequences: 11
GISAID Nucleotide Query: No perfect ones. The one below catches them all but also picks up a sequence from Kenya in June that is impossible to exclude due to missing coverage in many sequences.
G2246A, C6026T, C9344T, -A27383T, -C27431T, -C27688T, -C27807T, -G27915T
CovSpectrum Query: Nextcladepangolineage: GE.1* & [1-of: A21791G, G22688T, C25976G] & [exactly-0-of: C25546T, A27383T, C27431T, C27688T, C27807T, G27915T]
Substitutions, Deletions on top of GE.1.2:
Spike: A376S, R478T (reversion)
ORF3a: S195C
ORF7a: Entirely Deleted
ORF7b: Entirely Deleted
ORF8: Entirely Deleted
Nucleotide: G22688T, G22995C, C25976G, ∆27384-28250 (estimate, approximate)
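For readers unfamiliar with the query notation, here is a minimal sketch of how a comma-separated nucleotide query with `-` exclusions could be evaluated against one sequence's mutation list. It assumes that a plain term means "must be present" and a `-` term means "must be absent"; the function names and the two sample mutation sets are hypothetical and are not real GISAID records. All the excluded positions fall in or right next to the region estimated above as deleted (27384-28250).

```python
def parse_query(query: str):
    """Split a comma-separated query into required and excluded mutations."""
    required, excluded = set(), set()
    for term in query.split(","):
        term = term.strip()
        if not term:
            continue
        if term.startswith("-"):
            excluded.add(term[1:])  # "-X" means X must be absent
        else:
            required.add(term)      # plain "X" means X must be present
    return required, excluded


def matches(mutations: set, query: str) -> bool:
    """True if all required mutations are present and no excluded one is."""
    required, excluded = parse_query(query)
    return required <= mutations and not (excluded & mutations)


# Query string copied from the proposal above.
QUERY = "G2246A, C6026T, C9344T, -A27383T, -C27431T, -C27688T, -C27807T, -G27915T"

# Hypothetical mutation sets, for illustration only.
sample_with_deletion = {"G2246A", "C6026T", "C9344T", "G22688T", "C25976G"}
sample_without_deletion = sample_with_deletion | {"C27807T"}

print(matches(sample_with_deletion, QUERY))
print(matches(sample_without_deletion, QUERY))
```

A sequence that still shows a substitution inside the region is rejected by the exclusion terms, which is the behaviour the query relies on.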

USHER Tree https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons/main/EG.1.2_ORF7a-ORF7b-ORF8_Deleted.json?c=gt-ORF3a_195&gmax=26220&gmin=25393&label=id:node_4033831

Evidence

This lineage appears to have the same massive deletion that is in GW.5.1.1. It’s impossible to know the exact location of the deletion, but it seems certain that all or nearly all of ORF7a, ORF7b, and ORF8 are deleted. Hopefully, some of the professional sequencers, who have access to the raw reads, will be able to definitively determine the exact extent of the deletion.
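As a rough check of the "all or nearly all" statement, the sketch below computes how much of each ORF the estimated deletion (∆27384-28250) would cover. The ORF coordinates are an assumption taken from the Wuhan-Hu-1 reference annotation (NC_045512.2) and should be verified against the annotation you use. The deletion boundaries are the approximate ones from the description, so the result is only as good as that estimate.

```python
# Assumed ORF coordinates on Wuhan-Hu-1 (NC_045512.2), 1-based, inclusive.
ORFS = {
    "ORF7a": (27394, 27759),
    "ORF7b": (27756, 27887),
    "ORF8": (27894, 28259),
}

# Approximate deletion from the proposal (estimate, not confirmed by raw reads).
DELETION = (27384, 28250)


def deleted_fraction(orf, deletion):
    """Fraction of the ORF's nucleotides that fall inside the deletion."""
    start = max(orf[0], deletion[0])
    end = min(orf[1], deletion[1])
    overlap = max(0, end - start + 1)
    return overlap / (orf[1] - orf[0] + 1)


fractions = {name: deleted_fraction(coords, DELETION) for name, coords in ORFS.items()}
for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} deleted")
```

Under these assumed coordinates, ORF7a and ORF7b are fully covered and ORF8 is about 97.5% covered, with only its last nine nucleotides outside the estimated range.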

This lineage also has two mutations in spike that represent the third or fourth AA residue at that location—S:A376S (T→A→S) and S:R478T (T→K→R→T). In addition to these, 7/9 sequences also have S:K77R, including all six sequences collected in England.

ORF3a:S195C is also very interesting, as it involves a C->G nucleotide mutation, which is the rarest type.
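The link between the nucleotide changes listed in the description and the amino-acid changes discussed here can be checked with a little arithmetic. The sketch below maps each nucleotide position to a residue number and a position within the codon. The gene start coordinates are an assumption based on the Wuhan-Hu-1 reference (S at 21563, ORF3a at 25393), and the helper name `locate` is hypothetical.

```python
# Assumed first nucleotide of each gene on Wuhan-Hu-1 (1-based).
GENE_START = {"S": 21563, "ORF3a": 25393}


def locate(gene: str, nuc_pos: int):
    """Return (residue number, position within codon), both 1-based."""
    offset = nuc_pos - GENE_START[gene]
    return offset // 3 + 1, offset % 3 + 1


# Nucleotide substitutions from the proposal, paired with their gene.
MUTATIONS = [("S", "G22688T"), ("S", "G22995C"), ("ORF3a", "C25976G")]

located = {}
for gene, mut in MUTATIONS:
    pos = int(mut[1:-1])  # strip the reference and alternate bases
    located[mut] = (gene, *locate(gene, pos))
    residue, codon_pos = located[mut][1:]
    print(f"{mut}: {gene} residue {residue}, codon position {codon_pos}")
```

With these starts, G22688T lands on S residue 376, G22995C on S residue 478, and C25976G on the second codon position of ORF3a residue 195, which is consistent with a serine codon (TCN) becoming a cysteine codon (TGT or TGC).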

Genomes

EPI_ISL_18299717, EPI_ISL_18466699, EPI_ISL_18466703, EPI_ISL_18497347, EPI_ISL_18499668, EPI_ISL_18509120, EPI_ISL_18516334, EPI_ISL_18516362, EPI_ISL_18516363, EPI_ISL_18525514, EPI_ISL_18528814, EPI_ISL_18528815

EDITED: Alternative query proposed by mod: C25976G, -C22995G, C6026T catches 12 as of 21/11

Genomes alternative query EPI_ISL_18466699, EPI_ISL_18466703, EPI_ISL_18497347, EPI_ISL_18499668, EPI_ISL_18509120, EPI_ISL_18516318, EPI_ISL_18516334, EPI_ISL_18516362, EPI_ISL_18525514, EPI_ISL_18528814-18528815, EPI_ISL_18529633

Apparently excluding a couple of recombinants. The presence of recombinants may hint at high prevalence in Kenya or parts of it.

FedeGueli commented 11 months ago

Now I see 13 seqs (excluding the one you reported as unrelated).

FedeGueli commented 11 months ago

C25976G, -C22995G, C6026T catches these 12 sequences, including one from Sweden uploaded yesterday:

EPI_ISL_18466699, EPI_ISL_18466703, EPI_ISL_18497347, EPI_ISL_18499668, EPI_ISL_18509120, EPI_ISL_18516318, EPI_ISL_18516334, EPI_ISL_18516362, EPI_ISL_18525514, EPI_ISL_18528814-18528815, EPI_ISL_18529633

EPI_ISL_18516318 and EPI_ISL_18529633 are NOT caught by your query but are caught by mine, while mine leaves out EPI_ISL_18299717 (Belgium, 9/13) and EPI_ISL_18516363 (Kenya, Nov 6), which are caught by your query.
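The difference between the two result sets can be reproduced from the accession lists posted in this thread. This is only a sketch: the helper `expand` is hypothetical, and it assumes that a token such as `EPI_ISL_18528814-18528815` denotes an inclusive range of accession numbers.

```python
def expand(ids: str) -> set:
    """Parse a comma-separated accession list into a set of integers,
    expanding inclusive ranges such as EPI_ISL_18528814-18528815."""
    out = set()
    for token in ids.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate a trailing comma
        number = token.replace("EPI_ISL_", "")
        if "-" in number:
            first, last = (int(x) for x in number.split("-"))
            out.update(range(first, last + 1))
        else:
            out.add(int(number))
    return out


# Lists copied from the thread: original query, then alternative query.
ORIGINAL = (
    "EPI_ISL_18299717, EPI_ISL_18466699, EPI_ISL_18466703, EPI_ISL_18497347, "
    "EPI_ISL_18499668, EPI_ISL_18509120, EPI_ISL_18516334, EPI_ISL_18516362, "
    "EPI_ISL_18516363, EPI_ISL_18525514, EPI_ISL_18528814, EPI_ISL_18528815,"
)
ALTERNATIVE = (
    "EPI_ISL_18466699, EPI_ISL_18466703, EPI_ISL_18497347, EPI_ISL_18499668, "
    "EPI_ISL_18509120, EPI_ISL_18516318, EPI_ISL_18516334, EPI_ISL_18516362, "
    "EPI_ISL_18525514, EPI_ISL_18528814-18528815, EPI_ISL_18529633"
)

only_original = sorted(expand(ORIGINAL) - expand(ALTERNATIVE))
only_alternative = sorted(expand(ALTERNATIVE) - expand(ORIGINAL))
print("only in original query:", only_original)
print("only in alternative query:", only_alternative)
```

Each list holds 12 accessions, and the two sets differ by exactly the four sequences named above.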

I have to say that EPI_ISL_18516363 is likely a recombinant with EG.5.1. Also, the Belgian one looks like a recombinant with some different lineage not belonging to XBB!

Here is the tree of my query: https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice1_genome_test_3cf3d_dd15b0.json?c=userOrOld&label=id:node_4036940

So if you agree, I would use my query, which catches real examples of this lineage while excluding (totally unintentionally!) putative recombinants.

FedeGueli commented 11 months ago

P.S. This is not slow at all: https://cov-spectrum.org/explore/World/AllSamples/Past2M/variants?nextcladePangoLineage=JN.1*&nucMutations1=C25976G%2CC6026T%2CC22995C&analysisMode=CompareToBaseline&

FedeGueli commented 10 months ago

19 now. I suggest designating this one. @corneliusroemer

ryhisner commented 10 months ago

Six new sequences today, five from USA (4 from Minnesota, 1 from Iowa), and one from England. All on the S:K77R branch, which is the dominant one outside of Kenya.

FedeGueli commented 10 months ago

Designated GE.1.2.1 via https://github.com/cov-lineages/pango-designation/commit/bde0910d55d5472e54c6955230a7e6708bda4d5e

xz-keg commented 9 months ago

+3 for the recombinant lineage here. @FedeGueli you can make a separate issue. Total count is 6.

https://nextstrain.org/fetch/genome-test.gi.ucsc.edu/trash/ct/subtreeAuspice1_genome_test_4a3d5_965600.json?gt=nuc.2506C&label=id:node_4110624

FedeGueli commented 9 months ago

Please do it yourself; I just recognized the first one, nothing more.

I can't find a query that includes the last sample from Kenya. The best one is G27915T, C25546T, C26340T.