cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

2 probable sub lineages of B.1.617.2 (Delta) responsible for majority of the samples in Indonesia #175

Closed shay671 closed 3 years ago

shay671 commented 3 years ago

Of the last 477 samples from Indonesia uploaded to GISAID (from 23.06 until today), 442 are Delta. Of those Delta samples, 316 belong to the first lineage described here, 123 belong to the second, and only 3 belong to neither.
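For reference, the shares implied by these counts can be worked out with a few lines of Python. This is only arithmetic on the numbers quoted above; the helper name `share` is made up for illustration.

```python
# Counts quoted in the comment above (GISAID samples from Indonesia).
total_samples = 477
delta_samples = 442
lineage_counts = {"first lineage": 316, "second lineage": 123, "other Delta": 3}


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The three groups should account for every Delta sample.
assert sum(lineage_counts.values()) == delta_samples

print(f"Delta among all samples: {share(delta_samples, total_samples)}%")
for name, count in lineage_counts.items():
    print(f"{name} among Delta: {share(count, delta_samples)}%")
```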

First lineage -

Unique mutations * : NSP2_P200L, NSP3_P1228L, Spike_V1264L

Mainly circulates in: Singapore (1254), Indonesia (942), Australia (101), South Korea (84).

First sequence in Indonesia, March 2021.

Usher analysis https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/singleSubtreeAuspice_genome_958c_19a100.json?label=nuc%20mutations:C1404T,C4752T,G25352T

GISAID ref list: First lineage (Clade D based) GISAID ref.xlsx

Second lineage -

Unique mutations * : ORF3a:P104S, S:T778T (silent mutation, C23896T)

Mainly circulates in: Indonesia (259), Germany (65), Japan (44), France (43).

First sequence in New Zealand, March 2021.

Usher analysis https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/singleSubtreeAuspice_genome_3b25f_252300.json?label=nuc%20mutations:C25702T

GISAID ref list: Second lineage (clade A based) GISAID ref.xlsx

shay671 commented 3 years ago

What's interesting is that the share of the first lineage out of the general clade D it evolved from is rising in Singapore and Indonesia.

[image: iNDO]

This pattern was not seen when tracking the share of AY.3 out of the general Delta clade D (from which it also evolved), either in the main US states where it is abundant or in the UK.

zach-hensel commented 3 years ago

In the CoVariants Delta subsample right now, Singapore has extinguished every Delta sub-lineage except for this one with the only other non-travel recent sequence being B.1.351 (caveat re: subsample). Singapore sequences were dominated not long ago by a Delta sub-lineage with N:del214/215 but I don't infer much from that being dominant or being extinguished in a country with stochastic outbreaks. However, that V1264L is impossible to dismiss showing up in almost 2% of Delta sequences now and with some independent evidence of significance.

~56 sequences in your first lineage recently have added S:A1070V, presumably emerging within Singapore's Delta outbreak. This isn't so common, but might be somewhat significant with N1074 and A1078 mutated on occasion in the same area in Delta. Also, there are a number of additional Spike mutations in the Indonesia sequences, mainly in the first lineage; with some appearing hundreds of times elsewhere in Delta mutated at the same site.

The other V1264L branches are probably interesting as well; in general there are many variants out there with some plausible degree of immune escape on top of Delta at large. Other V1264L branches include two with S:A222V, one of which recently added N1074S (I edited this to remove an assumption of where it originated). Perhaps there's a systematic way to catalog these rather than finding them one by one.

Here are some outbreak.info links showing the V1264L (ORF6:K48N + S:A222V + S:V1264L) - https://outbreak.info/situation-reports?pango&muts=ORF6%3AK48N&muts=S%3AA222V&muts=S%3AV1264L - and the branch from that adding S:N1074S - https://outbreak.info/situation-reports?pango&muts=ORF6%3AK48N&muts=S%3AA222V&muts=S%3AV1264L&muts=S%3AN1074S
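As a small aside, links like these can be generated rather than typed by hand. The sketch below only reproduces the query-string pattern visible in the two URLs above (a bare `pango` flag followed by repeated `muts` parameters); it is not based on any documented outbreak.info API, and the function name is made up.

```python
from urllib.parse import urlencode

# Base address taken from the links above.
BASE = "https://outbreak.info/situation-reports"


def situation_report_url(mutations):
    """Build a situation-report link for a set of co-occurring mutations.

    Each mutation becomes one `muts=` parameter; urlencode percent-encodes
    the colon in names such as "S:V1264L" as %3A.
    """
    query = urlencode([("muts", m) for m in mutations])
    return f"{BASE}?pango&{query}"


print(situation_report_url(["ORF6:K48N", "S:A222V", "S:V1264L"]))
print(situation_report_url(["ORF6:K48N", "S:A222V", "S:V1264L", "S:N1074S"]))
```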

shay671 commented 3 years ago

Regarding the A222V mutation, it's actually a defining mutation of clade A of Delta (as analyzed in the paper I referenced). But I think the first lineage here is really the one to check, as it evolved from clade D. Clade D is taking over worldwide.

You wondered if there is a systematic way to catalog and find all these variants. Well, I actually have a method of sorts. It's not based on phylogenetic trees but rather uses them to verify the results. This is the method underlying the paper I referenced, and the way I found B.1.631 and B.1.636. Unfortunately, I lack coding skills, so I do this manually with Excel sheets.
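To make the idea of a scripted search concrete, here is a minimal sketch of one possible approach. It is not the spreadsheet method described above (which isn't spelled out in this thread): it simply flags mutations that are common inside a candidate group of samples and rare outside it. The function name, the thresholds, and the toy sample IDs are all hypothetical; only the mutation names come from this issue.

```python
from collections import Counter


def candidate_signature(samples, members, min_inside=0.9, max_outside=0.05):
    """Return mutations frequent among `members` and rare among other samples.

    samples: dict mapping sample id -> set of mutation strings
    members: set of sample ids forming the candidate lineage
    """
    inside = [muts for sid, muts in samples.items() if sid in members]
    outside = [muts for sid, muts in samples.items() if sid not in members]
    in_counts = Counter(m for muts in inside for m in muts)
    out_counts = Counter(m for muts in outside for m in muts)

    result = []
    for mutation, n_in in in_counts.items():
        freq_in = n_in / len(inside)
        freq_out = out_counts[mutation] / len(outside) if outside else 0.0
        if freq_in >= min_inside and freq_out <= max_outside:
            result.append(mutation)
    return sorted(result)


# Toy data: NSP3:P1228L is shared by every sample, so it is not reported
# as unique to the candidate group.
samples = {
    "s1": {"NSP3:P1228L", "NSP2:P200L", "S:V1264L"},
    "s2": {"NSP3:P1228L", "NSP2:P200L", "S:V1264L"},
    "s3": {"NSP3:P1228L", "ORF3a:P104S"},
    "s4": {"NSP3:P1228L"},
}
members = {"s1", "s2"}
print(candidate_signature(samples, members))
```

A frequency filter like this would still need to be checked against a tree, since recurrent mutations can look like a shared signature without reflecting common ancestry.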

chrisruis commented 3 years ago

Thanks @shay671. We've added your lineage 1 as AY.23 and your lineage 2 as AY.24 in v1.2.61

shay671 commented 3 years ago

Correction regarding AY.23 :

The mutation NSP3_P1228L is not a defining mutation of AY.23 but rather of the Delta clade D from which it evolved (AY.23 carries it, but it is not part of its unique consensus). However, NSP3_T678I is part of the unique consensus of AY.23 and was not mentioned in my first report.

@chrisruis

chrisruis commented 3 years ago

Thanks @shay671. I think NSP3_T678I is nucleotide mutation C4752T so this occurs on the branch where we started AY.23, as this coincides with the introduction(s) into Indonesia/Singapore/Japan/South Korea.
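The correspondence between a nucleotide position and a protein-level codon can be checked with simple arithmetic. The sketch below assumes the start coordinates of each gene or mature peptide on the Wuhan-Hu-1 reference (NC_045512.2); those coordinates and the function name are not stated in this thread and should be verified against the reference annotation before being relied on.

```python
# 1-based start coordinates on NC_045512.2 (assumed from the standard
# annotation, not taken from this thread).
GENE_START = {
    "NSP2": 806,
    "NSP3": 2720,
    "NSP15": 19621,
    "NSP16": 20659,
    "S": 21563,
    "ORF3a": 25393,
}


def codon_of(gene, nt_pos):
    """Return (codon number, position within codon), both 1-based.

    No upper bound is checked, so the caller must make sure the
    position really lies inside the named gene.
    """
    offset = nt_pos - GENE_START[gene]
    if offset < 0:
        raise ValueError(f"position {nt_pos} is upstream of {gene}")
    return offset // 3 + 1, offset % 3 + 1


# C4752T falls in codon 678 of NSP3, consistent with NSP3_T678I.
print(codon_of("NSP3", 4752))
# G25352T falls in codon 1264 of Spike, consistent with Spike_V1264L.
print(codon_of("S", 25352))
```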

We've now got a summary of the AY lineages including their defining mutations here

shay671 commented 3 years ago

That's a great summary @chrisruis, thank you.

Regarding AY.12: it's mentioned there that it has S:T791I as a defining mutation, but the signature of this Israeli clade includes more mutations:

First there are the 11 mutations of the Delta clade it is part of (and which is taking over the world), but let's put that aside.

Then there are 5 mutations that accumulated stepwise. I'll list them here in ascending order of their appearance in the phylogenetic tree:

G1048T - NSP2:K81N

C19983T - NSP15:V121V

G20937T - NSP16:T93T

C23934T - S:T791I

C25516T - ORF3a:P42S (it accumulated after S:T791I, but since AY.12 is designated as an Israeli lineage, this last mutation is in the consensus of the Israeli part of this broader clade)

I think this specification is important because it narrows the definition to this lineage and excludes other, unrelated B.1.617.2 lineages with S:T791I.

chrisruis commented 3 years ago

Thanks @shay671. The mutations listed for each lineage in that table are just those that are acquired along the branch immediately before the common ancestor of the lineage, i.e. on the branch where we've started the lineage. These mutations therefore separate the lineage from other closely related sequences and we'd expect that at least most sequences assigned to an AY lineage would have the listed mutations. But they aren't the only mutations that have occurred since the root of the Delta clade and there will be more mutations separating each of the AY lineages from more distant B.1.617.2 sequences and other AY lineages.

shay671 commented 3 years ago

Today I ran some random GISAID-based samples of the 2 variants through UShER. There seems to be some misassignment of those 2.

Here is for AY.23 : image

And here is for AY.24 : image