cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

BA.5 with BA.4 N:P151S mutation, plus BA.5.3.1 and private mutations (A17207G and C17304T) #723

Closed JosetteSchoenma closed 2 years ago

JosetteSchoenma commented 2 years ago

Description: A BA.5 with a BA.4 mutation (N:P151S/C28724T), 2 BA.5.3.1 mutations (ORF1a:Q556K/C1931A and N:E136D/G28681T) and 2 private mutations (A17207G and C17304T)

This 17-sample cluster came up, alongside 2 other smaller clusters, after Raj Rajnarayanan made me aware of a rising number of samples (27) with both the BA.5-defining M:D3N and the BA.4-defining N:P151S mutation. I will keep an eye on the other 2 as well (1 also has C1931A, but different private mutations), but this was the biggest one and it has possible South African roots.

Earliest sequence: 2022/4/7 (South Africa)

Most recent sequence: 2022/5/23 (USA)

Countries circulating: South Africa, USA, England

Likely breakpoint: between 27889 and 28724 (ORF7b, 9b or N).

Cov-spectrum query: C1931A, A17207G, C17304T, G28681T + S:L452R, S:F486V, M:D3N, N:P151S, N:E136D

https://cov-spectrum.org/explore/World/AllSamples/Past3M/variants?aaMutations=S%3AL452R%2CS%3AF486V%2CM%3AD3N%2CN%3AP151S%2CN%3AE136D&nucMutations=C1931A%2CA17207G%2CC17304T%2CG28681T&aaMutations1=S%3AL452R%2CS%3AF486V%2CM%3AD3N%2CN%3AP151S&

Genomes: EPI_ISL_12300640 EPI_ISL_12952671 EPI_ISL_12401049 EPI_ISL_13072836 EPI_ISL_13062966 EPI_ISL_12903587 EPI_ISL_13074450 EPI_ISL_13070487 EPI_ISL_12779656 EPI_ISL_13071274 EPI_ISL_13080440 EPI_ISL_13060978 EPI_ISL_13071206 EPI_ISL_13071719 EPI_ISL_12917323 EPI_ISL_12568930 EPI_ISL_12903544

Evidence:

Sc2rf output (the first 6 lines are random BA.4 and BA.5 samples as a reference): image
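For anyone who wants to reproduce or adapt the cov-spectrum link, here is a minimal Python sketch of how the first part of that URL can be assembled from the mutation lists. The helper name `build_query` is made up for illustration, and only the `aaMutations` and `nucMutations` parameters visible in the link are used; nothing here is an official cov-spectrum client.

```python
from urllib.parse import quote

# Base path copied from the link in the post (World, all samples, past 3 months).
BASE = "https://cov-spectrum.org/explore/World/AllSamples/Past3M/variants"


def build_query(aa_mutations, nuc_mutations):
    """Return an explore URL for samples carrying all listed mutations.

    Hypothetical helper: it only mirrors the query-string layout seen in the post.
    """
    # Mutations are comma-joined; quote(..., safe="") percent-encodes
    # ":" as %3A and "," as %2C, which is how they appear in the link.
    aa = quote(",".join(aa_mutations), safe="")
    nuc = quote(",".join(nuc_mutations), safe="")
    return f"{BASE}?aaMutations={aa}&nucMutations={nuc}"


url = build_query(
    ["S:L452R", "S:F486V", "M:D3N", "N:P151S", "N:E136D"],
    ["C1931A", "A17207G", "C17304T", "G28681T"],
)
print(url)
```

Dropping `M:D3N` from the amino-acid list gives a query that no longer depends on how position 26529 was called.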

I made this table, which hopefully makes it a bit easier to understand how this cluster (new ???, 2nd column from the right) relates to the other variants. image

Nextclade output: (all samples have green QC) image

Usher tree: Notice that the earlier samples are on the right side of the tree. image

agamedilab commented 2 years ago

Based on the table, the new variant looks like a BA.5.3.1 that acquired 17207G, 17304T and N:P151S. I think it is difficult to discern whether N:P151S was acquired spontaneously or through recombination with BA.4.

AngieHinrichs commented 2 years ago

Nucleotide 29868 is masked in the UCSC/UShER tree (first 55 and last 100 bases are masked as part of the Problematic Sites set), but anecdotally G29868A appears in a lot of BA.5 sequences (in a set of 287 early BA.5 sequences used to search for common mutations in BA.5, 251 had N at 29868, 33 had alt allele A and 3 had reference allele G, so A in 91% of non-N; not found in a set of 573 early BA.4 sequences). So it is interesting to see some 29868A's in the sequences in @JosetteSchoenma's sc2rf results.
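The arithmetic behind the "A in 91% of non-N" figure can be checked with a few lines of Python. The counts are the ones quoted above; the variable names are only illustrative.

```python
# Base calls at position 29868 in the 287 early BA.5 sequences quoted above.
counts = {"N": 251, "A": 33, "G": 3}

total = sum(counts.values())          # all sequences in the set
called = total - counts["N"]          # sequences with an unambiguous call
alt_fraction = counts["A"] / called   # share of called bases that are A

print(f"{total} sequences, {called} with a called base")
print(f"A at 29868 in {alt_fraction:.1%} of non-N calls")
```

Only 36 of the 287 sequences have a called base at this position, so the percentage rests on a small sample.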

There are also a couple BA.5.3.1 sequences with 17207G and 17304T, but without C28724T/N:P151S, SouthAfrica/NICD-N42661/2022|EPI_ISL_13048619|2022-05-10 and Northern_Ireland/NIRE-01d378/2022|2022-05-03 as shown in the UShER tree image.

So I'm leaning towards 'BA.5.3.1 that acquired C28724T/N:P151S independently'.

JosetteSchoenma commented 2 years ago

Interesting! I checked the smaller clusters I mentioned above, and neither has 29868A. Screenshot_20220608-082718_Twitter.jpg Screenshot_20220608-082616_Twitter.jpg

JosetteSchoenma commented 2 years ago

At least 48 samples now, instead of 17 22 days ago.

4 samples from last week (1st and 3rd of June) were uploaded from Denmark just now, 2 of which have an R (an ambiguous nucleotide, a mix of A and G, I just learned) instead of A at G26529A. So, M_D3B instead of M_D3N.
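Why an R call turns M:D3N into M:D3B can be sketched in Python. This assumes that codon 3 of M is GAT in the reference (positions 26529 to 26531), so that G26529A gives AAT; the codon table is deliberately cut down to the two codons needed, and `translate_ambiguous` is a made-up helper, not the routine GISAID or Nextclade uses.

```python
from itertools import product

# IUPAC nucleotide codes and the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Deliberately tiny codon table: only the two codons relevant to M:D3.
CODON_TABLE = {"GAT": "D", "AAT": "N"}


def translate_ambiguous(codon):
    """Translate a codon that may contain IUPAC ambiguity codes."""
    # Expand every ambiguous base into its possible concrete codons.
    expansions = ("".join(bases) for bases in product(*(IUPAC[b] for b in codon)))
    amino_acids = {CODON_TABLE.get(c, "?") for c in expansions}
    if len(amino_acids) == 1:
        return amino_acids.pop()
    if amino_acids == {"D", "N"}:
        return "B"  # B is the ambiguity code for "D or N"
    return "X"      # anything else is reported as unknown


for codon in ("GAT", "AAT", "RAT"):
    print(codon, "->", translate_ambiguous(codon))
```

With R at the first codon position the sequence is compatible with both D and N, which is why those samples are labelled M_D3B and are missed by a search for M_D3N.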

JosetteSchoenma commented 2 years ago

There are 63 now, most of them from the USA (55). It is too early to tell whether it has a growth advantage over BA.5*. My guess for now is that it's just growing because BA.5 is growing, but I am keeping an eye on it. Screenshot_20220612-170814_Twitter.jpg

JosetteSchoenma commented 2 years ago

The number of samples has gone up to 115, of which 105 are from the USA. I had CovSpectrum calculate the growth advantage for the USA, using this query. It says 42% over BA.5.

https://cov-spectrum.org/explore/United%20States/AllSamples/Past3M/variants?aaMutations=S%3AF486V%2CM%3AD3N%2CS%3AL452R&aaMutations1=S%3AL452R%2CS%3AF486V%2CM%3AD3N%2CN%3AE136D%2CN%3AP151S&nucMutations1=C1931A%2CA17207G%2CC17304T%2CG28681T&analysisMode=CompareToBaseline&

Screenshot_20220616-112242_Chrome.jpg
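To give a feel for what a figure like "42% over BA.5" means, here is a rough Python sketch of one common way to turn a change in a variant's share into a per-generation growth advantage under a logistic growth model. Whether CovSpectrum computes it exactly this way is an assumption, as are the 4.8-day generation time and the example shares, which are invented and not taken from this issue.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1 - p))


def growth_advantage(p_start, p_end, days, generation_days=4.8):
    """Relative growth advantage per generation under logistic growth.

    p_start, p_end: share of the variant relative to the baseline at two dates.
    generation_days: assumed generation time (hypothetical default).
    """
    # Under logistic growth the log-odds change linearly with time.
    rate_per_day = (logit(p_end) - logit(p_start)) / days
    return math.exp(rate_per_day * generation_days) - 1


# Invented example: the share rises from 1% to 4% over three weeks.
advantage = growth_advantage(0.01, 0.04, days=21)
print(f"growth advantage per generation: {advantage:.0%}")
```

A two-point estimate like this carries no confidence interval; a proper estimate fits all daily counts, which is where the intervals reported later in this thread come from.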

JosetteSchoenma commented 2 years ago

Actually, Denmark has 33 now, many of which have G26529R instead of G26529A, which I mentioned above.

22 of those 33 were uploaded to GISAID yesterday and are not found by the normal query with M:D3N. I use M_D3 to find them all on GISAID; M_D3B finds the ones with R. Most of those are from the 6th-10th of May, so very recent. I think it is growing fast there as well!
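The effect of searching for M_D3 rather than M_D3N can be illustrated with a toy Python example. The sample names and substitution lists are invented, and treating the search term as a simple prefix match is an assumption based on the behaviour described here, not on GISAID documentation.

```python
# Invented records: sample name -> list of amino-acid substitutions.
records = {
    "sample_1": ["Spike_L452R", "M_D3N", "N_P151S"],
    "sample_2": ["Spike_L452R", "M_D3B", "N_P151S"],  # R call at 26529
    "sample_3": ["Spike_L452R", "N_P151S"],           # no change at M:3
}


def search(records, term):
    """Return names of samples with a substitution starting with `term`."""
    return sorted(
        name
        for name, substitutions in records.items()
        if any(s.startswith(term) for s in substitutions)
    )


print(search(records, "M_D3N"))  # exact N call only
print(search(records, "M_D3B"))  # ambiguous call only
print(search(records, "M_D3"))   # both
```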

The screenshot contains all M_D3 plus N_P151S from Denmark, but you can see most resemble the sublineage of this issue.

Screenshot_20220616-120344_Twitter.jpg

FedeGueli commented 2 years ago

Hi @JosetteSchoenma, actually I can confirm this is the fastest sublineage circulating in my long list. @corneliusroemer, could you take a look at this, in case I am wrong?

JosetteSchoenma commented 2 years ago

So, because of the M_D3B and M_D3N situation, and because I am not able to put nucleotide mutations in GISAID, I queried for N_E136D, N_P151S, Spike_L452R, Spike_F486V and M_D3. This gives 181 samples, of which 134 are from the US, 36 from Denmark, 4 from South Africa, 3 from England, 2 from Israel, 1 from Luxembourg and 1 from Northern Ireland. 85 of them have a collection date in June!
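A quick Python check that the per-country counts add up to the 181 total, with the shares they imply. The numbers are the ones given in this comment; the variable names are only illustrative.

```python
# Sample counts per country as listed in the comment above.
by_country = {
    "USA": 134,
    "Denmark": 36,
    "South Africa": 4,
    "England": 3,
    "Israel": 2,
    "Luxembourg": 1,
    "Northern Ireland": 1,
}

total = sum(by_country.values())
us_share = by_country["USA"] / total
june_share = 85 / total  # 85 samples have a collection date in June

print(f"total samples: {total}")
print(f"USA share: {us_share:.0%}, collected in June: {june_share:.0%}")
```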

I made a log-scale graph just for Denmark with the samples from the query, compared to basic BA.5 (queried with just Spike_L452R, Spike_F486V and M_D3). Screenshot_20220616-171623_Twitter.jpg

Here you can see the raw numbers. Denmark has already sequenced 2719 samples for week 23. If only the 10th and 11th of May are counted, the percentage is already 1.7% (13/785). Screenshot_20220616-200316_Twitter.jpg

Unfortunately, somehow, I am not able to make a new Usher tree with these samples.

JosetteSchoenma commented 2 years ago

I have made a query, using a BA.5 query from another issue, to compare BA.5.3.1 + N:P151S against BA.5.3.1. Growth advantage: US 14% (-14% to 41%), Denmark 78% (-21% to 177%). The confidence intervals are still too wide. https://cov-spectrum.org/explore/Denmark/AllSamples/Past2M/variants?variantQuery=C1931A%26N%3AE136D%26%5B1-of%3A+G26529A%2CC27889T%5D%26%5B3-of%3A+G12160A%2CT22917G%2CT23018G%2CG26529A%2CC27889T%5D%26%21%5B2-of%3A+C9866T%2CA23040G%2CC26858T%2CA27259C%2CG27382C%2CA27383T%2CT27384C%5D%26%5B30-of%3A+C10449A%2CG21987A%2CT22882G%2CC26270T%2CA28271T%2CG22578A%2CA23403G%2CC25000T%2CC12880T%2CC28311T%2CC15714T%2CC17410T%2CG22775A%2CA22786C%2CG4184A%2CA9424G%2CA20055G%2CC22995A%2CA23063T%2CG23948T%2CC22686T%2CG22992A%2CC23525T%2CG28881A%2CT670G%2CA24424T%2CC2790T%2CA23055G%2CC9534T%2CC19955T%2CT23075C%2CT23599G%2CC26577G%2CG28883C%2CA29510C%2CC23854A%2CT24469A%2CC26060T%2CG26709A%2CG10447A%2CT22200G%2CC3037T%2CC21618T%2CA22688G%2CG22813T%2CC27807T%2CC241T%2CC10198T%2CA18163G%2CC25584T%2CC4321T%2CC9344T%2CT22679C%2CA23013C%2CG28882A%2CC10029T%2CC14408T%2CC22674T%2CC23604A%2C%5B1-of%3A+T11288-%2CC11289-%2CT11290-%2CG11291-%2CG11292-%2CT11293-%2CT11294-%2CT11295-%2CT11296-%5D%2C%5B1-of%3A+T21633-%2CA21634-%2CC21635-%2CC21636-%2CC21637-%2CC21638-%2CC21639-%2CT21640-%2CG21641-%2CT21765-%2CA21766-%2CC21767-%2CA21768-%2CT21769-%2CG21770-%5D%2C%5B1-of%3AG28362-%2CA28363-%2CG28364-%2CA28365-%2CA28366-%2CC28367-%2CG28368-%2CC28369-%2CA28370-%5D%5D&variantQuery1=N%3AP151S%26C1931A%26N%3AE136D%26%5B1-of%3A+G26529A%2CC27889T%5D%26%5B3-of%3A+G12160A%2CT22917G%2CT23018G%2CG26529A%2CC27889T%5D%26%21%5B2-of%3A+C9866T%2CA23040G%2CC26858T%2CA27259C%2CG27382C%2CA27383T%2CT27384C%5D%26%5B30-of%3A+C10449A%2CG21987A%2CT22882G%2CC26270T%2CA28271T%2CG22578A%2CA23403G%2CC25000T%2CC12880T%2CC28311T%2CC15714T%2CC17410T%2CG22775A%2CA22786C%2CG4184A%2CA9424G%2CA20055G%2CC22995A%2CA23063T%2CG23948T%2CC22686T%2CG22992A%2CC23525T%2CG28881A%2CT670G%2CA24424T%2CC2790T%2CA23055G%2CC9534T%2CC19955T%2CT23075C%2CT23599G%2CC26577G%2CG28883C%2CA29510C%2CC23854A%2CT24469A%2CC26060T%2CG26709A%2CG10447A%2CT22200G%2CC3037T%2CC21618T%2CA22688G%2CG22813T%2CC27807T%2CC241T%2CC10198T%2CA18163G%2CC25584T%2CC4321T%2CC9344T%2CT22679C%2CA23013C%2CG28882A%2CC10029T%2CC14408T%2CC22674T%2CC23604A%2C%5B1-of%3A+T11288-%2CC11289-%2CT11290-%2CG11291-%2CG11292-%2CT11293-%2CT11294-%2CT11295-%2CT11296-%5D%2C%5B1-of%3A+T21633-%2CA21634-%2CC21635-%2CC21636-%2CC21637-%2CC21638-%2CC21639-%2CT21640-%2CG21641-%2CT21765-%2CA21766-%2CC21767-%2CA21768-%2CT21769-%2CG21770-%5D%2C%5B1-of%3AG28362-%2CA28363-%2CG28364-%2CA28365-%2CA28366-%2CC28367-%2CG28368-%2CC28369-%2CA28370-%5D%5D&analysisMode=CompareToBaseline&

JosetteSchoenma commented 2 years ago

There are now 419 samples with S:L452R, S:F486V, M:D3, N:P151S and N:E136D, in 16 countries: 283 in the USA and 12 in Denmark, but also, for example, one in Australia and New Zealand, 2 in Mexico and 3 in Pakistan.

FedeGueli commented 2 years ago

@JosetteSchoenma does every sequence in this lineage have M:D3B?

JosetteSchoenma commented 2 years ago

Sorry, I misread. Most have D3N; only many of the Danish ones have D3B.


JosetteSchoenma commented 2 years ago

I changed the query to include the Danish M:D3B ones. 575 samples in total now. https://cov-spectrum.org/explore/World/AllSamples/Past3M/variants?aaMutations=S%3AL452R%2CS%3AF486V%2CN%3AP151S%2CN%3AE136D&nucMutations=C1931A%2CA17207G%2CC17304T%2CG28681T&aaMutations1=S%3AL452R%2CS%3AF486V%2CM%3AD3N%2CN%3AP151S&

Growth seems to be slowing down in Denmark. Growth advantage over BA.5*: 19% (-3% to 40%) in Denmark and 32% (22% to 41%) in the USA.

FedeGueli commented 2 years ago

737 seqs as of today. It has slowed down a bit, but it is still a very fast one.

InfrPopGen commented 2 years ago

This seems to have been designated by someone as BE.3, as this lineage is listed as rooted at node_1649910 (on the 2022-06-30 tree) and defined by C28724T (N:P151S), and I think that's a match for the proposal here.

FedeGueli commented 2 years ago

Thx very much @InfrPopGen !

JosetteSchoenma commented 2 years ago

No, the Danish ones often have M:D3B instead, like their normal BA.5. Still not sure why that is.
