cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

North America sub-lineages of B.1.617.2 with Spike A222V, V1264L, and often N1074S #188

Closed zach-hensel closed 3 years ago

zach-hensel commented 3 years ago

North America sub-lineages of B.1.617.2 with Spike A222V, V1264L, often N1074S, and sometimes additional Spike mutations

by Zach Hensel

Description

Sub-lineages of: B.1.617.2 (97% currently classified as B.1.617.2 on outbreak.info; others classified as various AY.X lineages)

Earliest sequence: 3 April 2021 (EPI_ISL_3275112)

Most recent sequence: 11 August 2021 (EPI_ISL_3396753)

Countries circulating (no. sequences): 23 countries; mainly USA (3,307) and Mexico (682)

USA states circulating (no. sequences): 50 states; CA (1,132) and over 100 in AR, TX, WA, NY, FL

Potential significance: Description of Delta sub-lineages accruing Spike mutations circulating mainly in North America. It is likely that something similar could be described elsewhere in the world. This was found while looking for sub-lineages similar to those described in issue #175. These sequences fall within the fraction of Delta that has added S:A222V and Nsp6:T181I.

Here, the Spike V1264L mutation appears to have occurred after the ORF6:K48N mutation, so this search string was used for GISAID: "NS6_K48N, Spike_A222V, Spike_V1264L, Spike_T19R, Spike_T478K, Spike_L452R, Spike_P681R" returning 4,234 sequences. Data above is from the equivalent outbreak.info link as it appears today:

https://outbreak.info/situation-reports?pango&muts=ORF6%3AK48N&muts=S%3AA222V&muts=S%3AV1264L&muts=S%3AT19R&muts=S%3AT478K&muts=S%3AL452R&muts=S%3AP681R
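For anyone wanting to reproduce the query, here is a minimal sketch of how the GISAID search string above can be turned into the equivalent outbreak.info link. The prefix mapping (`NS6` to `ORF6`, `Spike` to `S`) follows the correspondence used in this issue; the helper names `to_outbreak_mutation` and `situation_report_url` are hypothetical and only for illustration.

```python
from urllib.parse import urlencode

# GISAID-style terms, copied from the search string quoted above.
GISAID_TERMS = [
    "NS6_K48N", "Spike_A222V", "Spike_V1264L", "Spike_T19R",
    "Spike_T478K", "Spike_L452R", "Spike_P681R",
]

# Assumed mapping between GISAID gene prefixes and outbreak.info gene names,
# limited to the two genes that appear in this issue.
GENE_NAMES = {"NS6": "ORF6", "Spike": "S"}


def to_outbreak_mutation(term):
    """Convert e.g. 'Spike_A222V' to 'S:A222V'."""
    gene, change = term.split("_", 1)
    return f"{GENE_NAMES[gene]}:{change}"


def situation_report_url(terms):
    """Build the situation-report link with one 'muts' parameter per mutation."""
    query = urlencode([("muts", to_outbreak_mutation(t)) for t in terms])
    return "https://outbreak.info/situation-reports?pango&" + query


url = situation_report_url(GISAID_TERMS)
print(url)
```

`urlencode` percent-encodes the colon in each mutation as `%3A`, which is the form that appears in the link above.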

Within this sub-lineage, I identified additional Spike mutations; this is not an exhaustive list (attached list of GISAID genomes including these mutations):

All 5 of these sub-lineages were checked on UShER and appear to be single events (not always placed together, but plausible that they could be).

Genomes: Attached (all.txt) are GISAID accessions for the above sub-lineages, extracted with the above search string. The UShER tree appears to contain additional genomes beyond those that pass this filter on GISAID.

This image shows the NextStrain North-America-focused subsample, where ORF6:K48N and S:V1264L root these sub-lineages. [image]

Proposed lineage name: AY.A.B.C... I propose a hierarchy based upon the order of addition of Spike mutations on top of the diversity of Delta sub-lineages already existing circa early April. In this case, AY.X for the addition of S:V1264L in the presence of ORF6:K48N, AY.X.1 after adding N1074S, and AY.X.1.1 through AY.X.1.5 for additional Spike mutations. No name is proposed for the earlier S:A222V mutation, as other AY lineages are already assigned subsequent to this S:A222V event.
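The proposed hierarchy can be written out explicitly. This is only a sketch of the placeholder names described above: `AY.X` is not a real designation, the function name `proposed_names` is hypothetical, and the five additional Spike mutations are deliberately left unnamed.

```python
def proposed_names(parent="AY.X", n_children=5):
    """List the placeholder names in the proposed hierarchy.

    parent      -> S:V1264L added in the presence of ORF6:K48N
    parent.1    -> additionally S:N1074S
    parent.1.k  -> one further Spike mutation each (k = 1..n_children)
    """
    first_child = f"{parent}.1"
    grandchildren = [f"{first_child}.{k}" for k in range(1, n_children + 1)]
    return [parent, first_child] + grandchildren


names = proposed_names()
for name in names:
    # Indent each name by its depth below the placeholder parent.
    depth = name.count(".") - 1
    print("  " * depth + name)
```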

chrisruis commented 3 years ago

Thanks @zach-hensel. As outlined here, we're designating AY lineages that are epidemiologically distinct, and it's not clear that the clades you propose are epidemiologically distinct. However, they are part of a larger clade of USA and Mexico sequences that is the result of one or more introductions into this region with significant onward transmission. This larger clade does warrant a new AY lineage, and we've added it as AY.26 in v1.2.68.

We've started AY.26 on a branch with A27345T (Orf6:K48N) and it contains 5052 sequence designations in v1.2.68: AY.26_sequences.txt
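As a quick check of how the nucleotide change A27345T corresponds to Orf6:K48N, here is a small sketch that maps a genome coordinate to a codon number and a position within the codon. It assumes ORF6 starts at position 27202 (1-based) in the Wuhan-Hu-1 reference annotation, which is not stated in this thread. The reference codon is inferred to be AAA: the reference base at the third position is A, and the only lysine codon ending in A is AAA.

```python
# Assumption: 1-based start of ORF6 in the Wuhan-Hu-1 reference annotation.
ORF6_START = 27202

# Only the four codons needed for this example (K = lysine, N = asparagine).
CODON_TABLE = {"AAA": "K", "AAG": "K", "AAT": "N", "AAC": "N"}


def locate_in_orf(genome_pos, orf_start):
    """Return (codon number, position within codon), both 1-based."""
    offset = genome_pos - orf_start
    return offset // 3 + 1, offset % 3 + 1


codon_number, codon_pos = locate_in_orf(27345, ORF6_START)

# Inferred reference codon (see the note above), with the A>T change applied.
ref_codon = "AAA"
alt_codon = ref_codon[:codon_pos - 1] + "T" + ref_codon[codon_pos:]

change = f"{CODON_TABLE[ref_codon]}{codon_number}{CODON_TABLE[alt_codon]}"
print(codon_number, codon_pos, ref_codon, alt_codon, change)
```

Under these assumptions the change falls on the third base of codon 48, giving K48N as stated.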

zach-hensel commented 3 years ago

Yes, I think it will be difficult to pinpoint anything here within ORF6:K48N other than "somewhere in USA or Mexico" and that this will hopefully be true of most new Delta sub-lineages unless they happen to coincide with a major transmission event or happen in one of the few places in the world without much Delta.

I don't have time to vet or submit it, but if it's not designated yet, maybe check out the S:T77A + S:E484Q + Nsp6:H11Q (ORF1a:H3580Q) sub-lineage, which includes, for example, EPI_ISL_3219349. Judging from locations up and down the tree, S:T77A and Nsp6:H11Q look to be imported from India, with S:E484Q arising in California. As with the others, there's no obvious advantage yet over other Delta.

A recent UShER tree is here - https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_631ef_51ac10.json?c=gt-S_77,484&label=nuc%20mutations:T22032G,C22033T,A22034G

FedeGueli commented 3 years ago

Hi @zach-hensel

I've checked the proposed lineages besides the designated one, and it doesn't look to me like any of them is still spreading, although the point mutation ORF1a:H3580Q seems to have been circulating in Florida for a while.