cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

BA.5.2.1 sublineage with S:A1020S (1722 seqs as of 27.6.22.) #721

Closed agamedilab closed 2 years ago

agamedilab commented 2 years ago

Proposal for a sublineage of BA.5.2.1

Earliest sequence: 04.02.2022 (South Africa)

Countries detected: mainly Israel (67%), Australia, Austria, Belgium, Denmark, India, Portugal, Singapore, South Africa, United Kingdom, USA

Defining mutations: G24620T = S:A1020S
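As a quick sanity check on the nucleotide-to-amino-acid mapping, the sketch below converts a genome coordinate into a spike codon number. It assumes 1-based coordinates on the Wuhan-Hu-1 reference (NC_045512.2), where the spike coding sequence spans 21563..25384; the function name `spike_codon` is made up for illustration.

```python
# Spike (S) coding sequence on the Wuhan-Hu-1 reference (NC_045512.2),
# 1-based inclusive coordinates. These bounds are an assumption of this sketch.
SPIKE_START = 21563
SPIKE_END = 25384


def spike_codon(nuc_pos: int) -> tuple[int, int]:
    """Return (codon number, position within codon), both 1-based."""
    if not SPIKE_START <= nuc_pos <= SPIKE_END:
        raise ValueError("position lies outside the spike coding sequence")
    offset = nuc_pos - SPIKE_START
    return offset // 3 + 1, offset % 3 + 1


codon, pos_in_codon = spike_codon(24620)
print(codon, pos_in_codon)  # 1020 1
```

Position 24620 is the first base of codon 1020. Alanine codons are GCN and TCN codons encode serine, so a G>T change at the first codon position is consistent with A1020S.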

This variant is defined by S:A1020S. It was first detected in South Africa and has since been found in multiple other countries. At present it is most common in Israel, where it currently represents c. 24% (78/325) of BA.5 genomes.

This variant is apparently a sublineage of BA.5.2.1 (but see also #717).

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_33558_efa7c0.json?branchLabel=aa%20mutations&c=gt-nuc_24620&label=nuc%20mutations:C1627T

EPI_ISL_11903064, EPI_ISL_12097392, EPI_ISL_12278983, EPI_ISL_12278990, EPI_ISL_12456712, EPI_ISL_12473610, EPI_ISL_12559423, EPI_ISL_12559434, EPI_ISL_12559446, EPI_ISL_12607998, EPI_ISL_12694997, EPI_ISL_12695010, EPI_ISL_12705881, EPI_ISL_12759185, EPI_ISL_12763724, EPI_ISL_12763757, EPI_ISL_12845543, EPI_ISL_12845547, EPI_ISL_12845583, EPI_ISL_12845589, EPI_ISL_12861930, EPI_ISL_12875231, EPI_ISL_12903709, EPI_ISL_12914704, EPI_ISL_12915528, EPI_ISL_12915774, EPI_ISL_12915820, EPI_ISL_12925237, EPI_ISL_12953251, EPI_ISL_13019037, EPI_ISL_13047773, EPI_ISL_13047917, EPI_ISL_13049080, EPI_ISL_13058644, EPI_ISL_13066294, EPI_ISL_13068948, EPI_ISL_13072443, EPI_ISL_13072483, EPI_ISL_13072572, EPI_ISL_13072583, EPI_ISL_13072591, EPI_ISL_13072672, EPI_ISL_13072690, EPI_ISL_13072736, EPI_ISL_13072778, EPI_ISL_13073647, EPI_ISL_13073662, EPI_ISL_13073667, EPI_ISL_13073708, EPI_ISL_13073876, EPI_ISL_13073930, EPI_ISL_13073941, EPI_ISL_13074938, EPI_ISL_13075123, EPI_ISL_13075192, EPI_ISL_13077156, EPI_ISL_13077159, EPI_ISL_13077162, EPI_ISL_13077164, EPI_ISL_13077166, EPI_ISL_13077167, EPI_ISL_13077168, EPI_ISL_13077169, EPI_ISL_13077173, EPI_ISL_13077175, EPI_ISL_13077177, EPI_ISL_13077180, EPI_ISL_13077287, EPI_ISL_13077297, EPI_ISL_13077308, EPI_ISL_13077330, EPI_ISL_13079959, EPI_ISL_13089823, EPI_ISL_13094123, EPI_ISL_13094128, EPI_ISL_13102634, EPI_ISL_13102658, EPI_ISL_13106520, EPI_ISL_13109939, EPI_ISL_13110800, EPI_ISL_13110809, EPI_ISL_13110815, EPI_ISL_13110820, EPI_ISL_13110849, EPI_ISL_13110855, EPI_ISL_13110857, EPI_ISL_13110869, EPI_ISL_13110871, EPI_ISL_13110881, EPI_ISL_13110886, EPI_ISL_13110908, EPI_ISL_13110914, EPI_ISL_13110920, EPI_ISL_13110926, EPI_ISL_13110953, EPI_ISL_13111068, EPI_ISL_13111098, EPI_ISL_13111117, EPI_ISL_13111167, EPI_ISL_13111193, EPI_ISL_13111197, EPI_ISL_13111213, EPI_ISL_13111302, EPI_ISL_13111365, EPI_ISL_13111439, EPI_ISL_13111484, EPI_ISL_13111531, EPI_ISL_13111554, EPI_ISL_13111625, EPI_ISL_13111967, EPI_ISL_13111977, 
EPI_ISL_13112002, EPI_ISL_13112025, EPI_ISL_13112057, EPI_ISL_13113385, EPI_ISL_13113396, EPI_ISL_13127756

chrisruis commented 2 years ago

Thanks @agamedilab. As outlined here, new Pango lineages need to be associated with both an evolutionary event and an epidemiological event. These epidemiological events can include movement of the virus into a new geographical region, rapid and sustained growth in frequency compared to other co-circulating lineages, a jump into a novel host species, and acquisition of a set of mutations of particular biological interest. Pango lineages therefore track more than just amino acid mutations.

There doesn't seem to be a clear epidemiological event associated with this clade, so we haven't designated it at this point.

agamedilab commented 2 years ago

Currently there are about 660 sequences according to Usher as of 21.06.22 representing c. 2.4 % of global BA.5 (660/27361).
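The shares quoted in this thread can be re-derived from the raw counts. This is only a small arithmetic check using the numbers given above (78/325 for Israel in the opening comment, 660/27361 globally); the helper name `percent` is made up for illustration.

```python
def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, in percent, rounded to one decimal."""
    return round(100 * part / whole, 1)


israel_share = percent(78, 325)      # BA.5 genomes in Israel carrying S:A1020S
global_share = percent(660, 27361)   # global BA.5 genomes on the UShER tree

print(israel_share, global_share)  # 24.0 2.4
```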

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_15b19_1c0b50.json?branchLabel=Spike%20mutations&c=pango_lineage_usher&label=nuc%20mutations:G24620T

Growth advantage versus BA.5.2:

image

corneliusroemer commented 2 years ago

@chrisruis I think it would be good to close issues that are not designated as "not planned" rather than as "completed". GitHub now has two different ways of closing.

chrisruis commented 2 years ago

Thanks Cornelius, agree that sounds good

agamedilab commented 2 years ago

1722 sequences on Usher as of 27.6.22.

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_1664d_97c160.json?branchLabel=aa%20mutations&c=gt-S_1020&label=nuc%20mutations:C241T,T670G,C1627T,C2790T,C3037T,G4184A,C4321T,T6979G,C9344T,A9424G,C9534T,C10029T,C10198T,G10447A,C10449A,G12160A,C12880T,C14408T,C15714T,C17410T,A18163G,C19955T,A20055G,C21618T,T22200G,G22578A,C22674T,T22679C,C22686T,A22688G,G22775A,A22786C,G22813T,T22882G,T22917G,G22992A,C22995A,A23013C,T23018G,A23055G,A23063T,T23075C,A23403G,C23525T,T23599G,C23604A,C23854A,G23948T,A24424T,T24469A,C25000T,C25584T,C26060T,C26270T,G26529A,C26577G,G26709A,A27038G,C27807T,C27889T,A28271T,C28311T,A28330G,G28881A,G28882A,G28883C,A29510C

Advantage versus global BA.5:

https://cov-spectrum.org/explore/World/AllSamples/Past3M/variants?pangoLineage=BA.5*&nucMutations1=G24620T&pangoLineage1=BA.5*&analysisMode=CompareToBaseline&

corneliusroemer commented 2 years ago

I made a case for redesignating BF.3 to align with the proposal here, with arguments given over at #805.

silcn commented 2 years ago

@agamedilab I'm pretty sure that 56% advantage is a result of changes in sequencing coverage and founder effects. Split up by continent and the advantage becomes much smaller or disappears.
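To illustrate how a pooled estimate can show growth that vanishes once the data are stratified, here is a toy sketch. All counts are hypothetical and invented for this example (they are not data from this issue); the region names and the helper `pooled_share` are likewise made up. The variant share stays constant within each region, yet the pooled share rises because the high-prevalence region contributes more genomes in the second period.

```python
# HYPOTHETICAL counts, invented purely for illustration.
# Each entry is (variant genomes, total genomes) for a region in a time period.
counts = {
    "period_1": {"region_A": (20, 100), "region_B": (18, 900)},
    "period_2": {"region_A": (180, 900), "region_B": (2, 100)},
}


def pooled_share(regions: dict[str, tuple[int, int]]) -> float:
    """Variant share after pooling all regions together."""
    variant = sum(v for v, _ in regions.values())
    total = sum(t for _, t in regions.values())
    return variant / total


for period, regions in counts.items():
    per_region = {name: v / t for name, (v, t) in regions.items()}
    print(period, per_region, "pooled:", round(pooled_share(regions), 3))
```

In this toy setup region A stays at 20% and region B at 2% in both periods, while the pooled share moves from 3.8% to 18.2% purely because of the shift in sequencing volume.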

agamedilab commented 2 years ago

| Continent | Advantage versus BA.5* (%) |
| -- | -- |
| Africa | -2 |
| Asia | 14 |
| Europe | 27 |
| North America | 5 |
| Oceania | -12 |
| South America | na |

However, I am not sure whether CoV-Spectrum is calculating or displaying this correctly, because apparently only c. 260 out of 1354 genomes have geographic data:

image

silcn commented 2 years ago

@agamedilab for some reason sequences from Israel aren't showing up in that Asia total, but when you search for sequences from Asia you see a total of 1129 sequences which does include Israel. The growth advantage calculation includes the Israel sequences too.