cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

BA.2.38 sublineage defined by S:K444N, Orf1a:T1543I, Orf1a:N3725S, Orf1a:T4355I and S:F157S circulating in India (142 sequences) #828

Closed FedeGueli closed 2 years ago

FedeGueli commented 2 years ago

As suggested by @chrisruis in #814, I am opening this issue to propose separately a BA.2.38 sublineage that appears to be part of the diversity sequenced in India in recent weeks.

Credit to @c19850727, who spotted this first and has monitored it since.

This sublineage is a sibling of the one proposed in #746 by @sinickle. As he explained very well there, we have seen and tracked multiple BA.2 sublineages with the double Spike mutation S:K417T + S:K444N, but the one that seems most relevant is this one: the BA.2.38 (= BA.2 + 25416T + S:417T) sublineage with T7153C and G22894C (S:K444N), as is easy to see in this tree:

[Screenshot 2022-07-07 at 23.33.56]

It is split into two branches:

In the upper part of the tree is #746, defined by Orf1b:D51N.

In the lower part is the branch I want to propose here, defined by Orf1a:T1543I (C4893T) in NSP3 and Orf1a:N3725S (A11439G) in NSP6. This branch acquired C17373T, then Orf1a:T4355I (C13329T) and finally S:F157S (T22032C), from which I suggest starting the lineage if it is designated.

Usher tree: https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_268ca_74e160.json?branchLabel=aa%20mutations&c=country&label=nuc%20mutations:C4893T,A11439G

[Screenshot 2022-07-07 at 23.45.15]
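The nucleotide/amino-acid pairs quoted above can be cross-checked with a little arithmetic. This is only a minimal sketch: it assumes the Wuhan-Hu-1 reference (NC_045512.2) gene starts of 266 for ORF1a and 21563 for S, which are not stated in this thread, and the helper name `codon_number` is hypothetical.

```python
# Gene start coordinates (1-based) on the Wuhan-Hu-1 reference, NC_045512.2.
# Assumption: taken from the reference annotation, not from this issue.
GENE_START = {"ORF1a": 266, "S": 21563}


def codon_number(gene: str, nuc_pos: int) -> tuple[int, int]:
    """Return (codon number, position within the codon 1-3) for a 1-based genome position."""
    offset = nuc_pos - GENE_START[gene]
    if offset < 0:
        raise ValueError(f"position {nuc_pos} lies before the start of {gene}")
    return offset // 3 + 1, offset % 3 + 1


# Pairs quoted in the proposal: (gene, nucleotide change, amino-acid change).
pairs = [
    ("S", "G22894C", "K444N"),
    ("ORF1a", "C4893T", "T1543I"),
    ("ORF1a", "A11439G", "N3725S"),
    ("ORF1a", "C13329T", "T4355I"),
    ("S", "T22032C", "F157S"),
]

for gene, nuc, aa in pairs:
    pos = int(nuc[1:-1])      # strip the reference and alternate bases
    codon, frame = codon_number(gene, pos)
    expected = int(aa[1:-1])  # strip the reference and alternate residues
    print(f"{gene}:{aa} <- {nuc}: codon {codon}, codon position {frame}, match={codon == expected}")
```

Under those assumed coordinates, each nucleotide position falls in the codon named by its amino-acid label. C17373T is left out because the proposal gives no amino-acid change for it.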

Number of sequences: 26

Sequence list (only the 12 carrying all the defining mutations): contributors (1).csv

Countries: 22 of the 26 sequences are from India, plus 2 earlier ones from Nepal in May (with an additional substitution, S:A264S), 1 from Canada and 1 from Japan, likely from airport surveillance.

Apparent regional prevalence in India (past 2 months):

[Screenshot 2022-07-07 at 23.54.01]

Covspectrum Overview for India: https://cov-spectrum.org/explore/India/AllSamples/Past6M/variants?aaMutations=ORF1a%3AN3725S%2Corf1a%3AT1543I%2CS%3A157S&nucMutations=T7153C%2CC17373T&
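For anyone who wants to rebuild this kind of link for a different mutation set, here is a rough sketch using only the Python standard library. The `aaMutations` and `nucMutations` parameter names are copied from the link above; the function name `covspectrum_url` is hypothetical, and how the site treats other parameter combinations is not confirmed here.

```python
from urllib.parse import quote, urlencode

# Base path copied from the CoV-Spectrum link in this comment.
BASE = "https://cov-spectrum.org/explore/India/AllSamples/Past6M/variants"


def covspectrum_url(aa_mutations: list[str], nuc_mutations: list[str]) -> str:
    """Build an explore URL from amino-acid and nucleotide mutation lists."""
    params = {
        "aaMutations": ",".join(aa_mutations),
        "nucMutations": ",".join(nuc_mutations),
    }
    # quote() percent-encodes ':' as %3A and ',' as %2C, as in the original link.
    return f"{BASE}?{urlencode(params, quote_via=quote)}"


url = covspectrum_url(
    ["ORF1a:N3725S", "orf1a:T1543I", "S:157S"],
    ["T7153C", "C17373T"],
)
print(url)
```

The result matches the link above apart from the trailing `&`.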

Growth advantage versus BA.2.38:

[Screenshot 2022-07-07 at 23.58.02]

https://cov-spectrum.org/explore/India/AllSamples/Past3M/variants?aaMutations=S%3A417T&nucMutations=25416T&aaMutations1=ORF1a%3AN3725S%2Corf1a%3AT1543I%2CS%3A157S%2CS%3A417T&nucMutations1=T7153C%2CC17373T%2C25416T%2CG22894C&analysisMode=CompareToBaseline&

Growth disadvantage versus BA.5 baseline:

[Screenshot 2022-07-08 at 00.00.05]

https://cov-spectrum.org/explore/India/AllSamples/Past3M/variants?aaMutations=M%3A3N&nucMutations=12160A&aaMutations1=ORF1a%3AN3725S%2Corf1a%3AT1543I%2CS%3A157S%2CS%3A417T&nucMutations1=T7153C%2CC17373T%2C25416T%2CG22894C&analysisMode=CompareToBaseline&

FedeGueli commented 2 years ago

49 sequences as of today.

FedeGueli commented 2 years ago

63 sequences as of today. It has now been found in 7 countries on 3 continents (newly found in South Korea, Canada, USA and Belgium).

[Screenshot 2022-07-15 at 14.42.33]

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_e16f_15d990.json?branchLabel=aa%20mutations&c=country&label=nuc%20mutations:T22032C

@chrisruis

Thanks to @bitbyte2015 for highlighting its growth.

FedeGueli commented 2 years ago

Query updated to use the NextCladePangoLineage (NCPL) BA.5* baseline: -17%

https://cov-spectrum.org/explore/India/AllSamples/Past3M/variants?variantQuery=NextCladePangoLineage%3ABA.5*&aaMutations1=ORF1a%3AN3725S%2Corf1a%3AT1543I%2CS%3A157S&nucMutations1=C17373T%2CG22894C&analysisMode=CompareToBaseline&

silcn commented 2 years ago

This lineage has a branch with S:146del that is concentrated in Assam, India, where it makes up 52 of the most recent batch of 81 samples, dated 06-20 to 07-01 and received 07-20. Prior to this, sequencing in Assam had been virtually non-existent since February. In addition there are 2 sequences from other Indian states, 2 from the US and 1 from Germany, making 57 in total.
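As a quick sanity check on the tallies above, the counts can be added up directly. The variable names are just for illustration; the numbers are the ones quoted in this comment.

```python
# Counts quoted in the comment above.
assam_s146del = 52   # S:146del sequences in the most recent Assam batch
assam_batch = 81     # size of that batch
elsewhere = {"other Indian states": 2, "US": 2, "Germany": 1}

total = assam_s146del + sum(elsewhere.values())
share = assam_s146del / assam_batch

print(f"total S:146del sequences: {total}")
print(f"share of the Assam batch: {share:.1%}")
```

The total agrees with the 57 stated above, and the branch accounts for roughly 64% of that Assam batch.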

The Usher tree ignores deletions so S:146del doesn't form its own branch; however, it almost does, because all but one of the sequences also have ORF1a:G445V and nuc:C2623T, and that branch is shown in blue on the tree below. The other sequence with S:146del is shown in yellow.

[Image: BA238_146del]

https://nextstrain.org/fetch/github.com/silcn/subtreeAuspice1/raw/main/auspice/subtreeAuspice1_genome_5840_88f3b0.json?branchLabel=Spike%20mutations&c=gt-nuc_1599,25047&label=nuc%20mutations:C13329T

FedeGueli commented 2 years ago

Thanks @silcn, great catch on that branch. It is a big jump to 130 sequences today.

@chrisruis @corneliusroemer @AngieHinrichs
At this point, a designation of the last 3 big undesignated BA.2.38 sublineages (this one, the apparent sibling in #746 and the other one with S:478R in #840) is well deserved, and it will help Indian scientists monitor for themselves what is going on there.

FedeGueli commented 2 years ago

142 sequences as of today.

For the first time this lineage shows a growth advantage versus the BA.5 baseline in India: +14%

https://cov-spectrum.org/explore/India/AllSamples/Past3M/variants?aaMutations=M%3A3N&nucMutations=12160A&aaMutations1=ORF1a%3AN3725S%2Corf1a%3AT1543I%2CS%3A157S&nucMutations1=T7153C%2CC17373T%2C25416T&analysisMode=CompareToBaseline&

[Screenshot 2022-07-23 at 09.29.29]

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_3844a_b9f920.json?branchLabel=aa%20mutations&c=pango_lineage_usher&label=nuc%20mutations:T22032C

AngieHinrichs commented 2 years ago

Seen in Belgium, USA and Germany in July:

Belgium/UZA-UA-CV8522008253/2022|EPI_ISL_13694971|2022-07-02
USA/NJ-CDC-QDX38676038/2022|EPI_ISL_14003738|2022-07-03
USA/TX-HMH-MCoV-105165/2022|EPI_ISL_13891446|2022-07-06
Germany/BW-RKI-I-910479/2022|EPI_ISL_13943369|2022-07-12
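Records in this form are easy to split programmatically. A minimal sketch, assuming each record has exactly three pipe-separated fields (strain name, EPI_ISL accession, ISO collection date) as in the four records above; the `Record` type and `parse_header` helper are hypothetical names.

```python
from datetime import date
from typing import NamedTuple


class Record(NamedTuple):
    strain: str
    accession: str
    collected: date


def parse_header(header: str) -> Record:
    """Split a 'strain|accession|YYYY-MM-DD' header into typed fields."""
    strain, accession, collected = header.strip().split("|")
    return Record(strain, accession, date.fromisoformat(collected))


headers = """Belgium/UZA-UA-CV8522008253/2022|EPI_ISL_13694971|2022-07-02
USA/NJ-CDC-QDX38676038/2022|EPI_ISL_14003738|2022-07-03
USA/TX-HMH-MCoV-105165/2022|EPI_ISL_13891446|2022-07-06
Germany/BW-RKI-I-910479/2022|EPI_ISL_13943369|2022-07-12""".splitlines()

records = [parse_header(h) for h in headers]

# The country is the first slash-separated component of the strain name.
countries = sorted({r.strain.split("/")[0] for r in records})
print(countries)
```

A header with a different number of fields raises a `ValueError`, which is usually preferable to silently misreading it.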

FedeGueli commented 2 years ago

Thanks @AngieHinrichs. Notably, this sublineage has only an 11% disadvantage versus the BA.2.75 baseline in India, comparable to the BA.4 disadvantage versus BA.5.

https://cov-spectrum.org/explore/India/AllSamples/Past2M/variants?variantQuery=NextCladePangoLineage%3ABA.2.75*&aaMutations1=ORF1a%3AN3725S%2Corf1a%3AT1543I%2CS%3A157S&nucMutations1=T7153C%2CC17373T%2C25416T&analysisMode=CompareToBaseline&

I suggest designating this, also considering that one of its sublineages gained Orf1a:A2529V (a defining mutation of AY.4 that boosted its transmissibility versus the original B.1.617.2).

InfrPopGen commented 2 years ago

Thanks for submitting. We've added lineage BA.2.38.2 with 55 newly designated sequences, and 2 updated designations from BA.2.38. Defining mutation(s) T22032C (S:F157S), preceded by C13329T (Orf1a:T4355I) and C17373T.

romiwahengbam commented 2 years ago

@InfrPopGen @FedeGueli @silcn In our latest run of 96 samples from Assam, India, today we found 68 more sequences designated as BA.2.38 among samples from July 2022, but with S:F157S and a couple of mutations in Orf1a. This mutation currently dominates our samples.

[image]

ryhisner commented 2 years ago

Wanted to point out a very interesting branch in this lineage that, in addition to spike mutations F157S, K417T, and K444N, also has A264S and L244F/H245del (or L244del/H245F, as NextClade interprets it). A GISAID search for Spike_F157S, Spike_A264S, Spike_K444N returns 11 sequences, while Spike_F157S, Spike_A264S, Spike_K444N, Spike_L244F returns nine. Six of these 11 sequences were uploaded either today or yesterday: EPI_ISL_14162147, EPI_ISL_14176006, EPI_ISL_14192615, EPI_ISL_14192617, EPI_ISL_14192709, EPI_ISL_14196580

Strangely, the first two sequences in this branch (EPI_ISL_13388893, EPI_ISL_13388894) were collected on 2022-05-23, both from Nepal. Then nothing more until July, and now six have been uploaded in the past two days.

The Usher tree contains 15 sequences. https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons/main/subtreeAuspice1_genome_3ade4_983f10--F157S%2CL244F%2CA264S%2CK417T%2CK444N.json

[Screen Shot 2022-08-02 at 4.56.42 PM]

[Screen Shot 2022-08-02 at 5.04.46 PM]

silcn commented 2 years ago

@ryhisner I was watching this one too. I'm tempted to believe that the order is LH244-245F and then A264S, owing to the existence of a single sequence (EPI_ISL_14166895) that has the first but not the second. As you might have noticed, the sequences with A264S but not LH244-245F are also missing 24-26del, so there's some issue with the sequencing that's causing deletions to be missed.

FedeGueli commented 2 years ago

@ryhisner @silcn the S:264S + S:245F branch now counts 19 sequences from 5 countries (Indonesia added): https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_1b93d_ff2590.json?branchLabel=aa%20mutations&c=country&label=nuc%20mutations:G22352T

[Screenshot 2022-08-07 at 19.36.33]

I think it deserves a proposal.

(btw big jump today for the BA.2.38.2 lineage to 315 sequences, one of the cases where initial growth rate has been underestimated)

FedeGueli commented 2 years ago

@silcn @ryhisner two new sequences from the cluster with S:A264S and Spike H245del (as annotated on GISAID), both from Singapore and very recent (collection dates from the end of July to early August): EPI_ISL_14358975, EPI_ISL_14353705. I think it may be worth proposing.

FedeGueli commented 2 years ago

@silcn @ryhisner one more sequence of your S:264S sublineage with S:244del and 245F, this one from Australia. 27 sequences as of today (counting as valid the three messy ones in the upper part of the tree): https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_19747_15a2a0.json?branchLabel=aa%20mutations&c=pango_lineage_usher&label=nuc%20mutations:G22352T

corneliusroemer commented 2 years ago

Feel free to propose it, @FedeGueli. It looks like a useful one to have there, which is better than discussing it here.

FedeGueli commented 2 years ago

Thanks @corneliusroemer, I will!