cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

BA.5.2.1 Saltation lineage, with 5 spike mutations (S:H245Y, R346T, L368I, K440R, A942T) and two reversions S:F486F,Orf9b:D16D (China , Liaoning/Guangdong 9 sequences) #1919

Closed FedeGueli closed 1 year ago

FedeGueli commented 1 year ago

Here I want to propose a first big saltation lineage emerging in China, after briefly discussing it with @ryhisner and @c19850727.

Looking at the basal sequences in the tree, I suspect this could have arisen in a chronic infection before the big wave. In fact, it descends from an old BA.5.2.1 branch with Orf6:P57L (which likely emerged in Africa) that circulated back in summer 2022 and is defined by the S:T747I mutation.

[Screenshot 2023-04-20 at 01 00 44]

I found this long saltation branch while looking for the S:N440R mutation, which lately is becoming a good marker of chronic-infection-derived lineages.

Defining mutations: BA.5.2.1 > T6979G > C26625T > ORF6:P57L (C27371T) > ORF1a:I616T (T2112C) > T6394C > S:T747I (C23802T) > ORF1a:P971L (C3177T) (start of the saltation) >> C1594T, ORF1a:N615K (C2110A), ORF1a:K1529R (A4851G), T6673C, C6968T, ORF1a:A2355G (C7329G), C8290T, A8992G, ORF1a:T4175I (C12789T), ORF1b:G662S (G15451A), ORF1b:S2658F (C21440T), C22120T, S:H245Y (C22295T), S:R346T (G22599C), S:L368I (C22664A), S:N440R (A22881G), REV S:F486F (T23018T), S:A942T (G24386A), E:V58I (G26416A), REV ORF6:L57P (T27371C), REV G28330A, T29417C

Mutations: ORF1a: N615K, P971L, K1529R, A2355G, T4175I (EDITED: at first I misassigned a non-existent ORF1a:2909V mutation to the synonymous mutation A8992G, thanks @c19850727); ORF1b: G662S, S2658F; S: H245Y, R346T, L368I, K440R, A942T; E: V58I

Reversions: As @aviczhl2, @ryhisner and @c19850727 pointed out to me, all 5 samples display on Gisaid (but not the original Chinese GenBank ones) the reversion S:F486F (EDITED). It has A28330A, reverting a BA.5.2-defining mutation, so it also has Orf9b:D16D (EDITED, thanks @silcn). It also has Orf6:P57P, reverted from its own root Orf6:57L.

Convergence: Notably, it has S:L368I and Orf1b:G662S, as in XBB*

Tree: [Screenshot 2023-04-20 at 01 41 09] [Screenshot 2023-04-20 at 01 42 16]

Country: China, all of them from Liaoning

Samples: 5 (with 4 different ages, all of them over 60)

Sequences: EPI_ISL_17493027, EPI_ISL_17495186, EPI_ISL_17495358, EPI_ISL_17495368-17495369

Gisaid query: C22664A,A22881G,C22120T (EDITED)

Alternative Gisaid query: 2110A,A4851G
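The queries above amount to a set of nucleotide substitutions that a sequence must carry all at once. A minimal sketch of that matching logic follows, assuming each sample is already available as a set of substitution strings. The sample names and mutation sets are made-up placeholders, not real GISAID records, and this is not a description of how GISAID implements its search.

```python
import re

# A nucleotide substitution such as "C22664A": ref base, 1-based position, alt base.
SUBSTITUTION = re.compile(r"^([ACGT])(\d+)([ACGT])$")

def parse_substitution(token):
    """Split 'C22664A' into ('C', 22664, 'A'); raise ValueError on anything else."""
    match = SUBSTITUTION.match(token.strip().upper())
    if match is None:
        raise ValueError(f"not a nucleotide substitution: {token!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt

def matches_query(sample_mutations, query):
    """True if the sample carries every substitution in the comma-separated query."""
    wanted = {parse_substitution(tok) for tok in query.split(",")}
    carried = {parse_substitution(tok) for tok in sample_mutations}
    return wanted <= carried

# Hypothetical samples, for illustration only.
samples = {
    "sample_A": {"C22664A", "A22881G", "C22120T", "G24386A"},
    "sample_B": {"C22664A", "G22599C"},
}
hits = [name for name, muts in samples.items()
        if matches_query(muts, "C22664A,A22881G,C22120T")]
print(hits)
```

The parser is deliberately strict: a term without a reference base, like the first term of the alternative query, would be rejected and would need a looser pattern.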

xz-keg commented 1 year ago

There weren't many patients before the big wave. However, Liaoning is a province next to North Korea, where some virus has likely been circulating continuously since March 2022, and there have never been any sequences from there.

BTW all 5 seqs also have S:V486F reversion.

HynnSpylor commented 1 year ago

Great catch! S:K440R seems a little more frequent in China (it can be observed in some other BA.5.2 seqs, including BF.7.14).

silcn commented 1 year ago

Worth noting given the recent attention on ORF9b mutations that the reversion G28330A is not silent, it encodes ORF9b:G16D.
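The point about overlapping reading frames can be checked with simple coordinate arithmetic. The sketch below assumes 1-based CDS start positions of 28274 for N and 28284 for ORF9b on the Wuhan-Hu-1 reference (NC_045512.2); those coordinates are not given in this thread and should be verified against the reference annotation.

```python
# Assumed 1-based CDS start coordinates on the Wuhan-Hu-1 reference (NC_045512.2).
CDS_START = {"N": 28274, "ORF9b": 28284}

def codon_position(nuc_pos, cds_start):
    """Return (codon number, position within codon), both 1-based."""
    offset = nuc_pos - cds_start
    if offset < 0:
        raise ValueError("position lies upstream of the CDS start")
    return offset // 3 + 1, offset % 3 + 1

for gene, start in CDS_START.items():
    codon, within = codon_position(28330, start)
    print(f"{gene}: codon {codon}, codon position {within}")
```

Under these assumed coordinates, position 28330 is the third base of N codon 19 and the second base of ORF9b codon 16. That is consistent with one nucleotide change being synonymous in one frame while changing residue 16 in the other.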

FedeGueli commented 1 year ago

Thanks everyone. @c19850727 also helped a lot in finding some wrong takes (EDITED).

@aviczhl2 @c19850727 @ryhisner @silcn, please give it a second look in case I missed something else.

Here I attach a tree highlighting the S:F486F reversion:

[Screenshot 2023-04-20 at 11 16 30]

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_3fe7f_ffcb0.json?f_userOrOld=uploaded%20sample&label=id:node_8154187

And another by @c19850727 highlighting that they still have S:F486V:

[Screenshot 2023-04-20 at 11 18 13]

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_3fe7f_ffcb0.json?f_userOrOld=uploaded%20sample&label=id:node_8154187

xz-keg commented 1 year ago

I don't know why Usher gives it S:F486V. Looking at the original Genbase data, there's no S:F486V. genbase.2537

[Screen Shot 2023-04-20 at 21 04 37]

thomasppeacock commented 1 year ago

This looks really interesting - I'm just sticking it into monitoring for now due to the low numbers of sequences but if more pop up I think we should prioritise designating this.

AngieHinrichs commented 1 year ago

> I don't know why usher gives it S:F486V, looking at the original Genbase data, there's no S:F486V.

Again it's branch-specific masking -- all of BA.5 has the G23018T reversion masked because it often looked like a false reversion that would mess up the tree:

https://github.com/ucscGenomeBrowser/kent/blob/49029c6209ad86693a552fd4dd8e7a4a55d8777f/src/hg/utils/otto/sarscov2phylo/branchSpecificMask.yml#L99
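A toy illustration of why a masked reversion shows up on the tree as the parent branch's state. This is only a sketch of the idea, assuming that masking a substitution means the sample's own call at that site is ignored. The data structures and function names are hypothetical and do not reflect the actual format of branchSpecificMask.yml or Usher's internals.

```python
# Hypothetical mask: for samples placed under a branch, ignore these exact substitutions.
BRANCH_MASK = {"BA.5": {"G23018T"}}

def apply_branch_mask(branch, sample_subs):
    """Drop substitutions (relative to the branch) that the mask lists for that branch."""
    masked = BRANCH_MASK.get(branch, set())
    return {sub for sub in sample_subs if sub not in masked}

def inferred_state(branch_state, sample_subs, pos):
    """Base shown for the sample at pos: its own substitution if kept, else the branch's base."""
    for sub in sample_subs:
        if int(sub[1:-1]) == pos:
            return sub[-1]
    return branch_state[pos]

# BA.5 carries T23018G (S:F486V), so the branch base at 23018 is G.
branch_state = {23018: "G"}
# A sample that truly reverted to T, expressed relative to the BA.5 branch.
sample = {"G23018T", "C22664A"}

kept = apply_branch_mask("BA.5", sample)
print(inferred_state(branch_state, sample, 23018))  # state before masking
print(inferred_state(branch_state, kept, 23018))    # state after masking
```

In this toy model the sample's real T at 23018 is discarded by the mask, so the tree reports the branch's G, which reads as S:F486V even though the raw sequence has F486.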

corneliusroemer commented 1 year ago

Single province, some mutations could be artefacts, let's see if this pops up anywhere else

FedeGueli commented 1 year ago

No new sequences have come out from China or elsewhere. I'll keep monitoring it privately.

FedeGueli commented 1 year ago

Reopening to let @ryhisner add some observations on it.

ryhisner commented 1 year ago

Just wanted to add that there's a new sequence in this lineage that has the S:V486F reversion, which is becoming a convergent mutation in chronic-infection sequences lately. [Just saw that @aviczhl2 noticed none of the other sequences appear to have F486V either, so this is likely a shared mutation.] I also noticed that most of the sequences here also have two consecutive AA mutations in ORF1a with ORF1a:N615K, I616T.

[image]

FedeGueli commented 1 year ago

Thanks @ryhisner! I'll keep it open for a while.

ryhisner commented 1 year ago

New sequence from Guangdong popped up today—the first sequence from outside of Liaoning. Identical to the others except for one (supposed) synonymous nuc reversion (C22981T—likely a result of a slightly mistaken Usher-tree arrangement resulting from the failure to register the S:V486F reversion in several sequences), two new synonymous nuc mutations (C19854T, G28085A), and ORF1b:T1655I (C18431T). Collected April 24, which makes it the most recent sequence in this lineage. EPI_ISL_17697616

Zoomed-out and zoomed-in views of the current Usher tree below.

[image]

ryhisner commented 1 year ago

Liaoning and Guangdong are not neighboring provinces. They're over 1400 miles apart—about the same distance apart as London and Athens.

[image]

ryhisner commented 1 year ago

Liaoning borders North Korea, so perhaps this lineage arose there. In epidemiological terms, North Korea is essentially an island. It seems possible that lineages could arise and circulate there for a long period of time before reaching other countries.

[image]

HynnSpylor commented 1 year ago

@ryhisner The long-distance transmission (more than 1000km) does need further tracking.

I have checked the metadata. The early 5 seqs were collected in Shenyang City, which is not at the border. If it arose in North Korea, some border crossings in Jilin Province might also have detected it. Actually, last spring these China-North Korea border cities circulated a BA.2.3 sublineage, which was not consistent with other parts of China (mainly BA.2.2 from Shanghai). Hence I suggest it is not likely to have originated in North Korea.

xz-keg commented 1 year ago

> I have checked the Metadata. The early 5 seqs are collected in Shenyang City, which is not at the border. If it arose in North Korea, some border cross in Jilin Province may also detect it

Bordering cities of Jilin and Liaoning have very few sequences uploaded to GISAID.

Number of seqs uploaded to GISAID since 2020: Dandong, 12; Tonghua, 0; Yanbian, 13

With so few seqs it is really hard to "detect".
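To put a rough number on "hard to detect": if sequenced samples were independent random draws and the lineage made up a fraction p of infections, the chance of seeing it at least once in n sequences would be 1 - (1 - p)^n. The sketch below uses the counts quoted above; the 1% prevalence is an arbitrary assumption chosen only for illustration, and real sampling is neither random nor independent.

```python
def detection_probability(n_sequences, prevalence):
    """Chance that at least one of n independent random samples belongs to the lineage."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    return 1.0 - (1.0 - prevalence) ** n_sequences

# Sequence counts quoted in the comment; the 1% prevalence is an arbitrary assumption.
counts = {"Dandong": 12, "Tonghua": 0, "Yanbian": 13}
for city, n in counts.items():
    print(city, round(detection_probability(n, 0.01), 3))
```

Under that assumption, a dozen sequences give only a roughly 11 to 12% chance of catching the lineage even once, and zero sequences give none.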

FedeGueli commented 1 year ago

Thx @ryhisner and everyone for keeping it monitored.

FedeGueli commented 1 year ago

Is EPI_ISL_17652380 known to belong to this lineage? I found it while looking for other things. Just asking.

Checking: it is not mentioned here. Running it through Usher now.

It seems new to me:

collected 2023-04-20 Liaoning Female 72

[Screenshot 2023-05-26 at 22 32 03]

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_3eee5_1167d0.json?c=userOrOld&label=id:node_9522215

Alternative Gisaid query: 2110A,A4851G finds all 7 sequences now.

cc @ryhisner

FedeGueli commented 1 year ago

[Screenshot 2023-05-26 at 22 54 35]

The Guangdong sample clusters with this most recently discovered Liaoning one, with:

C6224T (ORF1a:H1987Y) + T4579A

Over-There-Is commented 1 year ago

And the Guangdong sequence was detected by Shenzhen Customs, which means it was imported from somewhere outside mainland China.

Over-There-Is commented 1 year ago

According to GenBase, EPI_ISL_17652380 was collected in Panjin, which also does not border the DPRK.

[image]

Over-There-Is commented 1 year ago

> > I have checked the Metadata. The early 5 seqs are collected in Shenyang City, which is not at the border. If it arose in North Korea, some border cross in Jilin Province may also detect it
>
> Bordering cities of Jilin and Liaoning have very few sequences uploaded to GISAID.
>
> Number of seqs uploaded to GISAID since 2020: Dandong, 12; Tonghua, 0; Yanbian, 13
>
> With so few seqs it is really hard to "detect".

[2023-06-01 (1)] [2023-06-01 (3)]

Although still few, Yanbian and Tonghua have actually uploaded more sequences than this.

Over-There-Is commented 1 year ago

I have checked the metadata. The sequences uploaded between 04-01 and 04-12 are all tagged as collected in Shenyang, the capital of Liaoning Province and also where the Liaoning CDC is located, but many of them were actually collected somewhere else. The earliest 5 sequences in this lineage date from that period, and they are all actually from Panjin, which is a harbor city. I still suspect this lineage originated in the DPRK, but somewhere else is also possible.

FedeGueli commented 1 year ago

Thanks very much @Over-There-Is, precious info!

HynnSpylor commented 1 year ago

Panjin is a harbor city but not a China-DPRK border crossing. The Chinese border cities (Dandong, Yanbian, etc.) circulated the BA.2.3 lineage in spring 2022, which clearly came from the DPRK and is highly consistent with lineages in South Korea. So if it originated in the DPRK, some seqs might also be found in South Korea.

xz-keg commented 1 year ago

> Although still so few, Yanbian and Tonghua actually have uploaded sequences more than this.

11 and 14 don't alter the conclusion that it is hard to detect anything.

> So if it originated in DPRK, some seqs may also be found in South Korea.

For obvious reasons, crossing the NK-SK border is much more strictly controlled than crossing the NK-China border.

https://en.wikipedia.org/wiki/Balloon_propaganda_campaigns_in_Korea

> The official stance of both the South and North Korean governments has been against the continuing balloon drops. However, the South Korean government has been hesitant to intervene in the launches by activists due to concerns about freedom of expression.

Considering that SK may fly propaganda balloons to NK, transmission is one-directional: SK variants are more likely to reach NK, but NK variants are unlikely to reach SK.

BA.2.3 was SK-to-NK transmission; likely some BA.5s were also spread from SK to NK via these propaganda balloons and eventually resulted in this variant through NK-local mutations.

But NK-local variants can hardly reach SK, because NK has stopped using balloons.

FedeGueli commented 1 year ago

2 more sequences of this one, apparently from Liaoning. One of the two was sampled in early May (the first sample sequenced in May, pointing to ongoing transmission, although with low numbers: 1 out of 20 samples collected in Liaoning in May).

[Screenshot 2023-06-08 at 10 40 31]

https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_b919_1932a0.json?label=id:node_9567104

FedeGueli commented 1 year ago

No sequences of any lineage have been uploaded from Liaoning. With the next upload we will see if this has died off, as it seems.

FedeGueli commented 1 year ago

Closing this one, luckily.