cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

A new sub-lineage of AY.3 concentrated in Mississippi, USA #147

Closed zach-hensel closed 3 years ago

zach-hensel commented 3 years ago

A new sub-lineage of AY.3 concentrated in Mississippi, USA

by Ashley Robinson and Zach Hensel

Description

Sub-lineage of: AY.3, which is a sub-lineage of B.1.617.2

Earliest sequence: 29 May 2021, Mississippi, USA.

Most recent sequence: 8 July 2021, Alabama, USA.

Countries circulating (no. sequences): USA (152)

USA states circulating (no. sequences): MS (142), PA (3), LA (2), MA (2), AL (1), GA (1), TN (1)

Potential significance: AY.3 is the main variant being sequenced weekly in Mississippi (>70% of all recent sequences), and most of those sequences fall within this sub-lineage of AY.3, which is distinguished by the mutations described below (>60% of all recent sequences). For example, of 91 Pangolin-typeable sequences generated on 13 July 2021 by Dr. Robinson's team from broad sampling across Mississippi, 68 are AY.3 and 13 are unclassified B.1.617.2. Of those 81 Delta sequences, 60 belong to this new sub-lineage (those sequences are not yet in GISAID and are thus not included in the counts above for the USA or MS). This sub-lineage therefore has clear epidemiological relevance in a region of the USA.
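As a quick check of the proportions quoted above, the counts from the 13 July 2021 run can be turned into percentages. This is only an illustrative calculation; the variable names are made up here and the numbers are copied from the paragraph above.

```python
# Counts reported above for the 13 July 2021 sequencing run.
typeable = 91            # Pangolin-typeable sequences
ay3 = 68                 # assigned AY.3
unclassified_delta = 13  # unclassified B.1.617.2
new_sublineage = 60      # carrying the proposed sub-lineage mutations

# AY.3 plus unclassified B.1.617.2 gives the Delta total.
delta_total = ay3 + unclassified_delta


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


ay3_share = percent(ay3, typeable)
sublineage_share = percent(new_sublineage, typeable)
sublineage_within_delta = percent(new_sublineage, delta_total)

print(f"Delta sequences: {delta_total}")
print(f"AY.3 share of typeable sequences: {ay3_share:.1f}%")
print(f"Sub-lineage share of typeable sequences: {sublineage_share:.1f}%")
print(f"Sub-lineage share of Delta sequences: {sublineage_within_delta:.1f}%")
```

The two shares of typeable sequences come out above the ">70%" and ">60%" figures given in the text.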

Genomes: Attached are the GISAID accessions, sampling locations, and dates for 152 sequences [AY.3.1.csv.txt]

Evidence

Identification method: The GISAID database was searched on 13 July 2021 for the ORF1a:I3731V mutation that identifies the AY.3 Delta variant, which returned 1118 sequences. Sequences with >5% N's and sequences outside the Delta lineages (freshly determined with the latest Pangolin build) were removed. The latest NextStrain build was used for alignment and phylogenetic inference. The tree below shows AY.3 and its new sub-lineage from this analysis.
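The ">5% N's" quality filter described above could be sketched roughly as follows. This is not the script used for the analysis; the function names, the inclusive threshold comparison, and the toy sequences are all assumptions made for illustration.

```python
def n_fraction(sequence: str) -> float:
    """Return the fraction of bases in `sequence` that are N (case-insensitive)."""
    if not sequence:
        return 1.0  # assumption: treat an empty sequence as fully ambiguous
    return sequence.upper().count("N") / len(sequence)


def filter_by_n_content(records, max_n_fraction=0.05):
    """Keep (name, sequence) pairs whose N fraction does not exceed the threshold."""
    return [(name, seq) for name, seq in records if n_fraction(seq) <= max_n_fraction]


# Toy records (hypothetical names, not real genomes).
records = [
    ("toy_good", "ACGT" * 25),             # no N's
    ("toy_bad", "ACGT" * 20 + "N" * 20),   # 20 N's out of 100 bases
]

kept = filter_by_n_content(records)
print([name for name, _ in kept])
```

In practice the (name, sequence) pairs would come from a FASTA parser, and the lineage filter would rely on Pangolin assignments rather than anything shown here.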

image

Defined by: ORF1a:I3731V (NSP6_I162V) and ORF1a:T3646A (NSP6_T77A), which are common to AY.3, plus ORF1a:D1127Y (NSP3_D309Y) and ORF1b:H1550Y (NSP14_H26Y), which define this sub-lineage. Five nucleotide substitutions also define the sub-lineage: G3644T, C18115T, C20199T, T23284C, and C25339T. Both defining amino acid mutations map to charged, solvent-exposed residues (Nsp3 macrodomain; Nsp14 ExoN domain) and could impact protein-protein interactions. Nsp3_D309Y is rarely found elsewhere. Nsp14_H26Y is more common, found in more than 20% of sequences in lineages B.1.177.66, C.3, AY.3, A.2.5, AA.1, and B.1.214.3.
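As a rough illustration of how the amino acid definition above could be used to screen sequences, the sketch below checks a per-sequence mutation list against the two AY.3 mutations and the two sub-lineage mutations. The mutation spellings follow the identification method and the Outbreak.info link; the function name, labels, and example inputs are hypothetical. Actual Pango designation relies on phylogenetic placement, so this is a screening aid only.

```python
# Amino acid mutations named in this proposal.
AY3_SHARED = {"ORF1a:I3731V", "ORF1a:T3646A"}
SUBLINEAGE_SPECIFIC = {"ORF1a:D1127Y", "ORF1b:H1550Y"}


def classify(mutations):
    """Label a sequence by which of the defining mutation sets it carries."""
    muts = set(mutations)
    if not AY3_SHARED <= muts:
        return "not AY.3-like"
    if SUBLINEAGE_SPECIFIC <= muts:
        return "candidate sub-lineage"
    return "AY.3-like"


# Hypothetical mutation lists for three sequences.
print(classify(["ORF1a:I3731V", "ORF1a:T3646A", "ORF1a:D1127Y", "ORF1b:H1550Y"]))
print(classify(["ORF1a:I3731V", "ORF1a:T3646A"]))
print(classify(["ORF1b:H1550Y"]))
```

A sequence carrying only one of the two sub-lineage mutations is reported as AY.3-like here; whether that is the right call for incomplete genomes is a judgment this sketch does not address.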

Figure 1. Nsp14_H26 is solvent-exposed and maps to the Nsp10:Nsp14 interface in PDB 7DIY.

Figure 2. Nsp3_D309 is in a solvent-exposed loop in the macrodomain (PDB 7KQP).

Track the four ORF1ab mutations defining this sub-lineage on Outbreak.info: https://outbreak.info/situation-reports?muts=orf1a%3AT3646A&muts=orf1a%3AI3731V&muts=orf1a%3AD1127Y&muts=orf1b%3AH1550Y&loc=USA&selected=USA
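For anyone who wants to adapt the link above to a different mutation set or location, the query string can be rebuilt with the Python standard library. The parameter names (`muts`, `loc`, `selected`) are taken from the URL above; their exact semantics on Outbreak.info are not documented here, so treat this as a sketch.

```python
from urllib.parse import urlencode

BASE_URL = "https://outbreak.info/situation-reports"

# Repeated "muts" keys, in the same order as the link above.
params = [
    ("muts", "orf1a:T3646A"),
    ("muts", "orf1a:I3731V"),
    ("muts", "orf1a:D1127Y"),
    ("muts", "orf1b:H1550Y"),
    ("loc", "USA"),
    ("selected", "USA"),
]

# urlencode percent-encodes ":" as "%3A" and joins the pairs with "&".
url = f"{BASE_URL}?{urlencode(params)}"
print(url)
```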

Proposed lineage name: AY.3.1 (B.1.617.2.3.1)
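The relationship between the alias and the full name in parentheses can be shown with a small expansion helper. The single-entry alias table below only covers the AY prefix mentioned in this issue, and the function name is hypothetical; it is not the alias machinery used by Pangolin itself.

```python
# Minimal alias table: only the prefix discussed in this issue.
ALIASES = {"AY": "B.1.617.2"}


def expand(lineage: str) -> str:
    """Expand a known alias prefix to its full lineage name."""
    prefix, _, rest = lineage.partition(".")
    base = ALIASES.get(prefix)
    if base is None:
        return lineage  # unknown prefix: return the name unchanged
    return f"{base}.{rest}" if rest else base


print(expand("AY.3.1"))
print(expand("AY.3"))
```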

chrisruis commented 3 years ago

Thanks for submitting @zach-hensel. We've added this as AY.3.1 in v1.2.42. There are 197 sequences designated in this lineage. We've also added 571 new AY.3 sequence designations based on looking at this and your issue #142.

12 of the proposed AY.3.1 sequences seemed to cluster apart in the latest UShER tree (USA/MS-UMMC-M611A8-504986/2021, USA/MS-UMMC-X9G1-210999023629/2021, USA/MS-UMMC-X9F1-210999023628/2021, USA/MS-UMMC-M613G3-504988/2021, USA/MS-UMMC-M610A2-504985/2021, USA/MS-UMMC-M612C4-504987/2021, USA/MS-UMMC-X9A3-210999023730/2021, USA/MS-UMMC-X9H1-210999023630/2021, USA/MS-UMMC-S465A2-505005/2021, USA/MS-UMMC-X9C2-210999023633/2021, USA/MS-UMMC-X9C3-210999023732/2021 and USA/MS-UMMC-X9B2-210999023632/2021). They seem to have a slightly different mutation profile so we didn't include these within AY.3.1.