cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

Highly divergent BE.1 sublineage with 2 Spike mutations (S:P384H, S:T604N) and 8 additional aa substitutions (c. 30 seq, Germany) #764

Closed agamedilab closed 2 years ago

agamedilab commented 2 years ago

Proposal for a sublineage of BE.1
Earliest sequence: 25.05.2022 (Germany)
Countries detected: Germany

Defining mutations: S:P384H, S:T604N, ORF1ab:Q1003K, ORF1ab:Q2510K, ORF1ab:A3436D, ORF1ab:I3758V, ORF1ab:L3829F, ORF6:T10N, ORF7b:I27F, N:A90D

This is apparently a highly divergent sublineage of BE.1 carrying a total of 10 additional aa mutations compared to BE.1:

Two spike mutations (S:P384H, S:T604N), five ORF1ab mutations (ORF1ab:Q1003K, ORF1ab:Q2510K, ORF1ab:A3436D, ORF1ab:I3758V, ORF1ab:L3829F), and one mutation each in ORF6, ORF7b and N (ORF6:T10N, ORF7b:I27F, N:A90D). So far this cluster of c. 30 sequences has been detected only in Germany.

[image] [image]

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_1008d_c38c70.json?branchLabel=aa%20mutations&c=gt-S_384&label=nuc%20mutations:G16935A

The intra-clade diversity is also very high, with almost every sequence carrying additional aa mutations:

[image]

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_1008d_c38c70.json?branchLabel=aa%20mutations&c=country&label=nuc%20mutations:C3272A,C7793A,C10572A,C16611A,C22713A,C27230A,C28542A

Sequence quality is apparently ok:

[image]

EPI_ISL_13241291, EPI_ISL_13241313, EPI_ISL_13241342, EPI_ISL_13241298, EPI_ISL_13241302, EPI_ISL_13241303, EPI_ISL_13241255, EPI_ISL_13241273, EPI_ISL_13241343, EPI_ISL_13241316, EPI_ISL_13241276, EPI_ISL_13241260, EPI_ISL_13241269, EPI_ISL_13241238, EPI_ISL_13241246, EPI_ISL_13241268, EPI_ISL_13241271, EPI_ISL_13241253, EPI_ISL_13241293, EPI_ISL_13241286, EPI_ISL_13241262, EPI_ISL_13241344, EPI_ISL_13241240, EPI_ISL_13241325, EPI_ISL_13241325, EPI_ISL_13241337, EPI_ISL_13241264, EPI_ISL_13241305, EPI_ISL_13241261, EPI_ISL_13241245, EPI_ISL_13241314

corneliusroemer commented 2 years ago

Haven't looked at this in detail, but high intra-clade diversity points to an artefact.

Please always add a covSPECTRUM query to facilitate monitoring and analysis of the proposal.

agamedilab commented 2 years ago

@corneliusroemer many thanks! I usually add covSPECTRUM queries when possible (I admittedly forgot to add the link in #724)...
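For anyone assembling such a query by hand, here is a minimal sketch of building a covSPECTRUM link in Python. The base path and the `aaMutations` / `pangoLineage` parameter names are copied from the covSPECTRUM link posted in this thread. The helper name `covspectrum_url` and the example arguments are hypothetical, and this is not an official client.

```python
from urllib.parse import urlencode

# Base path copied from the covSPECTRUM link posted in this thread.
BASE = "https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants"


def covspectrum_url(aa_mutations, pango_lineage):
    """Build an explore URL filtered by amino-acid mutations and lineage.

    Hypothetical helper: only the two query parameters seen in the
    thread's link are used.
    """
    query = {
        "aaMutations": ",".join(aa_mutations),
        "pangoLineage": pango_lineage,
    }
    # Keep '*' literal so a lineage query such as "BE.1*" stays readable.
    return f"{BASE}?{urlencode(query, safe='*')}"


# Illustrative query: a few of the proposed defining mutations within BE.1*.
url = covspectrum_url(["S:P384H", "S:T604N", "ORF6:T10N", "N:A90D"], "BE.1*")
print(url)
```

The mutation list is joined with commas and percent-encoded, which reproduces the `%3A` / `%2C` escapes visible in the links shared in this issue.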

ryhisner commented 2 years ago

ORF1ab:A3436D is NSP5_A173D, which is adjacent to a residue whose mutation (NSP5_H172Y) is known to confer very high resistance to Paxlovid treatment (233-fold reduced nirmatrelvir activity in cell culture). https://www.fda.gov/media/155050/download
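The ORF1ab-to-NSP conversion used above can be sketched as a simple offset lookup. The nsp boundaries in the table are an assumption: they are the commonly cited Wuhan-Hu-1 polyprotein coordinates for nsp1 to nsp10 and should be checked against your own annotation. The function names are hypothetical.

```python
import re

# Assumed ORF1a polyprotein residue ranges (Wuhan-Hu-1 numbering) for
# nsp1-nsp10; verify against your reference annotation before relying on them.
NSP_RANGES = [
    ("NSP1", 1, 180),
    ("NSP2", 181, 818),
    ("NSP3", 819, 2763),
    ("NSP4", 2764, 3263),
    ("NSP5", 3264, 3569),
    ("NSP6", 3570, 3859),
    ("NSP7", 3860, 3942),
    ("NSP8", 3943, 4140),
    ("NSP9", 4141, 4253),
    ("NSP10", 4254, 4392),
]


def orf1a_to_nsp(position):
    """Map a polyprotein residue number to (nsp name, residue within that nsp)."""
    for name, start, end in NSP_RANGES:
        if start <= position <= end:
            return name, position - start + 1
    raise ValueError(f"position {position} is outside the nsp1-nsp10 table")


def to_nsp_label(mutation):
    """Convert e.g. 'ORF1ab:A3436D' to 'NSP5_A173D'."""
    gene, change = mutation.split(":")
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z*])", change.strip())
    if gene.strip() not in ("ORF1a", "ORF1ab") or match is None:
        raise ValueError(f"cannot parse {mutation!r}")
    ref, pos, alt = match.groups()
    name, local = orf1a_to_nsp(int(pos))
    return f"{name}_{ref}{local}{alt}"


for m in ["ORF1ab:A3436D", "ORF1ab:I3758V", "ORF1ab:L3829F"]:
    print(m, "->", to_nsp_label(m))
```

Under these assumed boundaries, ORF1ab:A3436D maps to NSP5_A173D, matching the label given above, and residue 3435 maps to NSP5 position 172. The table stops at nsp10 because ORF1a and ORF1ab share numbering only up to the ribosomal frameshift.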

silcn commented 2 years ago

@corneliusroemer artefacts in German sequences would be nothing new - reminds me of those ones with a bunch of S mutations around 1007-1014 which still keep showing up

agamedilab commented 2 years ago

Hm... I have now found a batch of 22 seqs from the same RKI upload (13.06.22) carrying very much the same set of mutations but belonging to BA.2:

[image]

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_129e5_ccf750.json?branchLabel=aa%20mutations&c=gt-S_384&label=nuc%20mutations:G20679T

https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants?aaMutations=ORF6%3AT10N%2CN%3AA90D%2CS%3AP384H%2CS%3AT604N%2CORF1a%3AQ2510K%2CORF1a%3AA3436D%2CORF1a%3AI3758V&pangoLineage=BA.2*&

I now also think this is probably too much of a coincidence. Having that set of mutations appear in two lineages uploaded within the same batch is very suspicious...
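The reasoning here (the same unusual mutations turning up in unrelated lineages within one upload batch) can be sketched as a small check. This is a heuristic illustration only. The record layout, the `background` set and all toy values are hypothetical, and real metadata would need its own parsing.

```python
from collections import defaultdict


def shared_mutations_across_lineages(records, background=frozenset(), min_lineages=2):
    """Return {batch: {mutation: [lineages]}} for mutations that occur in at
    least `min_lineages` different lineages within the same upload batch.

    `background` holds mutations expected in all lineages under study
    (for example ancestral Omicron mutations) and is ignored.
    """
    seen = defaultdict(lambda: defaultdict(set))
    for rec in records:
        for mut in rec["mutations"]:
            if mut not in background:
                seen[rec["batch"]][mut].add(rec["lineage"])

    flagged = {}
    for batch, by_mut in seen.items():
        hits = {
            mut: sorted(lineages)
            for mut, lineages in by_mut.items()
            if len(lineages) >= min_lineages
        }
        if hits:
            flagged[batch] = hits
    return flagged


# Toy records (hypothetical IDs and batch labels, not real GISAID metadata).
records = [
    {"id": "seq1", "batch": "batch-A", "lineage": "BE.1",
     "mutations": {"S:P384H", "S:T604N", "S:D614G"}},
    {"id": "seq2", "batch": "batch-A", "lineage": "BA.2",
     "mutations": {"S:P384H", "S:T604N", "S:D614G"}},
    {"id": "seq3", "batch": "batch-B", "lineage": "BE.1",
     "mutations": {"S:D614G"}},
]

flagged = shared_mutations_across_lineages(records, background={"S:D614G"})
print(flagged)
```

A hit is not proof of an artefact, since genuine recombination or convergent evolution can also produce shared mutations. It only marks batches that deserve a closer look at the raw reads.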

silcn commented 2 years ago

As a final nail in the coffin, note that many of the AA mutations within the apparent clade are either reversions of BE.1 mutations or are associated with other BA.5 lineages (N:D136E, ORF1a:K556Q, ORF10:L37F...). It looks like sequences from all across the tree have been grouped together by UShER because of a shared set of artefactual mutations.

agamedilab commented 2 years ago

Many thanks, I think I can close this.

FedeGueli commented 2 years ago

I noticed that the bigger branch with ORF1a:L3829F now appears in UShER as BE.1.1. The UShER tree you posted here is no longer available, so I can't see whether this newly designated lineage includes these probably artefactual sequences or not. Maybe this has to be checked @InfrPopGen

agamedilab commented 2 years ago

@FedeGueli Hi, yes, BE.1.1 now apparently includes those probably artefactual sequences:

[image]

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_d552_84ee40.json?branchLabel=Spike%20mutations&c=pango_lineage_usher&label=nuc%20mutations:C241T,T670G,C1931A,C2790T,C3037T,G4184A,C4321T,C9344T,A9424G,C9534T,C10029T,C10198T,G10447A,C10449A,G12160A,C12880T,C14408T,C15714T,G16935A,C17410T,A18163G,C19955T,A20055G,C21618T,T22200G,G22578A,C22674T,T22679C,C22686T,A22688G,G22775A,A22786C,G22813T,T22882G,T22917G,G22992A,C22995A,A23013C,T23018G,A23055G,A23063T,T23075C,A23403G,C23525T,T23599G,C23604A,C23854A,G23948T,A24424T,T24469A,C25000T,C25584T,C26060T,C26270T,G26529A,C26577G,G26709A,C27807T,C27889T,A28271T,C28311T,G28681T,G28881A,G28882A,G28883C,A29510C

FedeGueli commented 2 years ago

Thx @agamedilab

AngieHinrichs commented 2 years ago

I noticed the other day when looking into sequences whose pangolin/UShER lineage assignment will change in the next release that many sequences from Germany have nt:T11524C and A11537G (ORF1a:I3758V) together despite appearing in very different branches of BA.2 / BA.5. Since that only seems to be the case for sequences from Germany I suspect some kind of systematic error. [In case A11537G (ORF1a:I3758V) sounds familiar, it is found in BA.1, but not T11524C.]
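The link between A11537G and ORF1a:I3758V is plain codon arithmetic, sketched below. It assumes ORF1a spans nucleotides 266 to 13468 in the Wuhan-Hu-1 reference (NC_045512.2). The function name is hypothetical.

```python
# Assumed ORF1a coordinates in the Wuhan-Hu-1 reference (NC_045512.2).
ORF1A_START = 266
ORF1A_END = 13468


def nuc_to_orf1a_codon(nuc_pos):
    """Return (codon number, position within codon), both 1-based,
    for a genome coordinate inside ORF1a."""
    if not ORF1A_START <= nuc_pos <= ORF1A_END:
        raise ValueError(f"{nuc_pos} is outside ORF1a")
    offset = nuc_pos - ORF1A_START
    return offset // 3 + 1, offset % 3 + 1


for pos in (11524, 11537):
    codon, frame = nuc_to_orf1a_codon(pos)
    print(f"nt {pos}: ORF1a codon {codon}, codon position {frame}")
```

With these coordinates, nt 11537 is the first base of codon 3758, consistent with ORF1a:I3758V, while nt 11524 is the third base of codon 3753.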