Closed Sinickle closed 2 years ago
The main reason this genome stands out to me is the S:417T + S:444N combo, but having two additional spike mutations and having spread to 4 continents while likely being from an under-sequenced area makes this more worthwhile to designate than other larger S:417T + S:444N combo lineages, in my opinion.
The Bloom lab considers S:444 one of the most important sites to watch for mutations that achieve immune escape. I claim that S:444N becomes more beneficial after S:417T is acquired.
The following tool does not show all sequences, but is hopefully sufficient for the argument.
- It contains 6100 Omicron+S:417T and 115 Omicron+S:444N. There are 58 sequences that have both.
- Omicron sequences with S:417T appear to have subsequently independently acquired S:444N at least 6 times
- 3 of these 6 times, the child sublineage was sequenced at least 5 times
- S:444N appeared independently on any Omicron 51 times
- Of these 51 times, only 3 times did it have at least 5 child sequences. These are the same 3 times that it also had S:417T.
Since Omicron with S:417T has never made up even 4% of cases for a time period, I believe these numbers argue that S:444N is more likely to succeed when S:417T is present.
This possibility makes it important to monitor a lineage that has this pair of mutations.
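As a rough check on the counts above, one can ask how likely it would be for all 3 successful S:444N emergences to land on the 6 S:417T backgrounds if success had nothing to do with S:417T. The sketch below computes that with a hypergeometric tail (the same quantity as a one-sided Fisher exact test). It assumes the 51 emergence events include the 6 on S:417T and that each event is independent; the function and variable names are illustrative only.

```python
from math import comb


def hypergeom_tail(total, group, successes, observed):
    """P(at least `observed` of the `successes` fall inside a group of size `group`)."""
    ways_total = comb(total, successes)
    ways_extreme = sum(
        comb(group, k) * comb(total - group, successes - k)
        for k in range(observed, min(group, successes) + 1)
    )
    return ways_extreme / ways_total


# Counts taken from the comment above (assumption: the 51 events include the 6).
emergences_total = 51        # independent S:444N emergences on any Omicron
emergences_with_417T = 6     # of those, emergences on an S:417T background
successful = 3               # emergences with at least 5 child sequences
successful_with_417T = 3     # successful emergences that also had S:417T

p_value = hypergeom_tail(
    emergences_total, emergences_with_417T, successful, successful_with_417T
)
print(f"{p_value:.5f}")  # 0.00096
```

Under those assumptions the chance is about 1 in 1000. The tool does not show all sequences and sampling is uneven, so this is a sanity check on the argument rather than a formal result.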
Excellent job. Regarding whether this is a jump or genetic drift: this lineage is 5 mutations from the closest sequence, but 4 mutations from the ancestor shared with the Indian clade in the picture. Those four are all nonsynonymous: one in the NTD, one in the RBD, and one in proximity to the furin cleavage site. Three converging hot spots. So if this is indeed an inter-host accumulation of mutations, there is evidence here for strong selection (we would have expected some synonymous or ORF1ab mutations to accumulate along the way).
Very small so will close for now but can reopen if this grows with an epidemiological event in the future
Hi @chrisruis, this is small but sampled on 4 continents, with a clear link to an undersampled area (India). I suggest applying the monitor label and leaving this open, even if probably no BA.2 sublineage could compete with BA.5.
5 continents now, sampled in Denmark (June)
Found 3 sequences from Telangana, India: EPI_ISL_13307928-13307930. They're all pretty poor quality, but share many of this lineage's defining mutations, in particular S:147E, S:692L and nuc:T7153C. So even though one is missing ORF1b:D51N and all three have S:417N rather than 417T (backfilling? or a genuine reversion?) I feel it's safe to say they're from this lineage. They all have NNNs at S:444.
The long branches in the Usher tree are most likely a reflection of the poor sequence quality rather than genuine diversity.
Hi @silcn, good catch. I have found another sequence from Thailand (earlier in May), EPI_ISL_13065568, that could be related to this lineage: it is missing S:147E but has ORF1b:51N and 7153C, and it should have S:417T and S:444N besides 25416T.
@FedeGueli well spotted. That one is also missing S:692L, so it's probably not part of the proposed lineage, though it's very closely related.
@silcn thank you, yes, I agree: related but not part of it. It is also very low quality, as @ryhisner pointed out to me, which makes it harder to understand how closely related it is. Have you checked whether there has been an uptick in cases or positivity rate in the region of India where your newly found sequences come from?
Edit: found this: https://telangananewspoint.com/telangana-reports-285-new-covid-19-cases-on-thursday/ Generally, India is seeing a moderate uptick, with cases over the 12k threshold after 111 days. It is impossible to relate that to this or other sublineages.
The three sequences I found are from the city of Hyderabad, one sampled 2022-05-26 and two on 2022-05-28. For a sense of scale, 116 sequences have been uploaded from Hyderabad with a collection date since 2022-05-26, of which 31 are either BA.2.12.1, BA.4 or BA.5.
If there is somewhere in India where this is very prevalent, then it likely isn't Hyderabad, that's just where we happened to get the first sequences from.
And now a better quality Indian sequence, EPI_ISL_13342191 from Chennai, Tamil Nadu
West Bengal just uploaded 32 new sequences from this lineage. Total is now over 50.
55 sequences on Usher as of today. 47 of the 55 are from India, confirming what @silcn discovered earlier. https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_30d7a_2fb660.json?branchLabel=aa%20mutations&c=pango_lineage_usher&label=nuc%20mutations:A22001G,A23636C
The tree now shows the recently uploaded Indian sequences mentioned by @silcn, and we can appreciate the diversity within this sublineage, which was first spotted through introductions on different continents.
As I requested in #814, this lineage is worth a designation; it would help Indian authorities track what is circulating there, as stated for BA.2.75.
Just spotted one of these in Austria, but with an additional S:K558N (EPI_ISL_13632738):
This lineage is now up to 122 sequences and is placed next to a new, interesting sublineage with K444N and F157S that has 63 sequences. 157 has been seen to confer a growth advantage on its own. Since the number of sequences continues to grow and this branch has now spawned a second interesting spike profile, I think this should be designated.
Thx @bitbyte2015! The sibling lineage with S:F157S has been proposed in #828.
Hi @chrisruis, following our conclusions in #814, these two sublineages probably deserve a designation, also considering that at least 5 highly transmissible lineages are circulating there (BA.2.75, BF.3, BA.2.38.1, BA.2.74, BA.2.76).
@chrisruis @corneliusroemer this now seems very close to BA.5 in India: https://cov-spectrum.org/explore/India/AllSamples/Past3M/variants?aaMutations=M%3A3N&nucMutations=12160A&aaMutations1=S%3AK147E%2CS%3AK444N%2CS%3AI692L%2CS%3A417T&analysisMode=CompareToBaseline&
@FedeGueli brought it to my attention that this proposed sublineage is captured much better by searching for just S:147E and S:692L. This is because a large number of the sequences have dropout at either S:417T or S:444N. The new query is here. Compared to BA.2.38, this has a [71-129%] growth advantage in India.
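To illustrate why the shorter query captures more sequences, here is a minimal sketch with made-up records (these are not real sequences, and the data layout is hypothetical). Each record maps a spike position to the observed amino acid, with "X" standing for dropout at that site.

```python
# Hypothetical spike calls per sequence; "X" marks dropout (no coverage).
records = {
    "seq_a": {147: "E", 417: "T", 444: "N", 692: "L"},
    "seq_b": {147: "E", 417: "X", 444: "N", 692: "L"},  # dropout at 417
    "seq_c": {147: "E", 417: "T", 444: "X", 692: "L"},  # dropout at 444
    "seq_d": {147: "K", 417: "T", 444: "N", 692: "I"},  # parental profile
}


def matches(calls, query):
    """True if every queried site carries exactly the requested amino acid."""
    return all(calls.get(site) == aa for site, aa in query.items())


strict_query = {147: "E", 417: "T", 444: "N", 692: "L"}
relaxed_query = {147: "E", 692: "L"}

strict_hits = sorted(name for name, calls in records.items() if matches(calls, strict_query))
relaxed_hits = sorted(name for name, calls in records.items() if matches(calls, relaxed_query))

print(strict_hits)   # ['seq_a']
print(relaxed_hits)  # ['seq_a', 'seq_b', 'seq_c']
```

The strict query silently drops every sequence with missing coverage at 417 or 444, while the relaxed one keeps them and still excludes the parental profile.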
Thx @Sinickle for highlighting the new query here. Notably, with the updated query it also shows a slight growth advantage versus BA.5 in India: https://cov-spectrum.org/explore/India/AllSamples/Past3M/variants?aaMutations=M%3A3N&nucMutations=12160A&aaMutations1=S%3AK147E%2CS%3AI692L&analysisMode=CompareToBaseline&
Following this update, I think this should be reopened and monitored along with the other open issues @chrisruis @AngieHinrichs @InfrPopGen @corneliusroemer.
It should also be noted that, together with the sibling lineage with S:444N, 7153C and S:157S, it represents between 1/3 and 1/2 of the BA.2.38 samples with S:444N.
285 sequences as of today. After a while without new samples, a lot came in all at once.
@corneliusroemer @chrisruis @InfrPopGen
I have been monitoring it since it was mentioned in #814.
While its growth is very irregular, probably due to its prevalence in an undersampled area, I think it deserves to be reopened and designated along with the other BA.2.38 +478R.
Now it has been sampled in 11 different countries: https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_39bdb_a5caa0.json?branchLabel=aa%20mutations&c=country&label=nuc%20mutations:G13618A
Big upload of sequences in the last three days: 52 seqs added, mainly by West Bengal but also Sikkim (3), Karnataka (3) and Chhattisgarh (1).
Note that ~10% of them have been collected from 18 July through today, so they are very recent, indicating ongoing circulation.
In the last month this sublineage represented between 2 and 3% of all BA.2.38 sequences, making it the fourth sublineage after .1/.2/.3. I do think it is worth reopening and designating.
@corneliusroemer @thomaspeacock @InfrPopGen @chrisruis
cc @silcn do you have any notes on this lineage?
I guess with Xie's BA.2.75 paper identifying S:147E as a substantial immune escape mutation, and Bloom Lab identifying S:444 mutations also as substantial immune escape, it provides some mechanism for why this proposed lineage is maintaining a growth advantage over BA.2.38.
I'll reopen since numbers have increased by a factor of 50 since the issue was closed
No sequence in the last two weeks.
Well, looks like this one might be basically dead at this point. The 3 spike mutations it gained that surpassed 1% prevalence but stayed below 5% were... S:346T, S:452M, S:460K. (not all on the same samples)
Edit: or I just got tricked by upload schedules
36 sequences popped out just now. 4 from Assam, 1 from Karnataka, and 31 from West Bengal. Dates of collection are between 2022-06-15 and 2022-08.
This was very successful in India for a short period - until BA.2.75 killed it.
Still worth a designation I think - but low priority as it's dead.
Agree with the need for designation. We understand now how important it is to keep track of convergence.
Thanks for submitting. We've added lineage BA.2.38.4 with 293 newly designated sequences, and 5 updated designations from BA.2.38. Defining mutation A23636C (S:I692L) (following A22001G (S:K147E)).
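The nucleotide and amino-acid names in the designation note above can be cross-checked with a little arithmetic. This is a minimal sketch that assumes the spike gene starts at nucleotide 21563 in the Wuhan-Hu-1 reference (NC_045512.2); the helper name is illustrative.

```python
# Assumption: first nucleotide of the S gene in NC_045512.2 (1-based coordinate).
SPIKE_START = 21563


def spike_codon(nuc_pos):
    """Return (codon number, position within codon), both 1-based, for a spike nucleotide."""
    offset = nuc_pos - SPIKE_START
    return offset // 3 + 1, offset % 3 + 1


for label, pos in [("A22001G", 22001), ("A23636C", 23636)]:
    codon, pos_in_codon = spike_codon(pos)
    print(f"{label}: spike codon {codon}, codon position {pos_in_codon}")
# A22001G: spike codon 147, codon position 1
# A23636C: spike codon 692, codon position 1
```

Both changes hit the first base of their codon, which is consistent with K147E (AAA/AAG to GAA/GAG) and I692L (ATx to CTx).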
Rereading @Sinickle's comment on the S:417T + S:444N combo now, it was really the first discussion of convergent evolution in the Omicron RBD. Great thought, @Sinickle.
Credit to @bitbyte2015 for being the original one (to my knowledge) to find these sequences!
This potential sublineage features interesting mutations that make it distinct, and although it has been sequenced just 7 times, it has been found in Brazil, Australia, USA, and Japan (but the Japanese sequences were in travelers from India)
Description
Potential sublineage of: BA.2.38 (the Indian sublineage with spike 417T from a BA.2 root)
Gene | Amino changes
-- | --
ORF1b | D51N
Spike | K147E, K444N, I692L

Earliest sequence: 2022/05/06 (Australia)
Most recent sequence: 2022/05/22 (USA-California)
Mutations on top of BA.2:
nextstrain tree [cov-Spectrum query]
The lineage I'm proposing is the one in the red box. Since the parental lineage, BA.2.38 without 6091T, is most predominant in India, and the Japanese cases in my proposed lineage are confirmed to be from India-related travel, it seems likely that this lineage is more prominent in India despite never being sampled there. Additionally, this lineage is 5 nucleotide mutations away from any possible parental sequence which likely implies lack of sampling of intermediaries rather than a true evolutionary jump. The Indian sequences not in the red box in the screenshot share just 1 synonymous nucleotide mutation, and several unique ones -- I believe that the right thing to do here would be to include the Indian non-red box sequences as regular BA.2.38 in the classifier training set, and the red box ones as the new proposed lineage, which would be defined by T7153C + the previously listed AA mutations.
Proposed sublineage: EPI_ISL_13175338, EPI_ISL_12808436, EPI_ISL_12808434, EPI_ISL_13027402, EPI_ISL_12808520, EPI_ISL_13027403, EPI_ISL_12767836
Regular BA.2.38 that share T7153C (think including these in the training set might improve specificity of the proposed lineage, especially since one of these also obtained S:444N independently.): EPI_ISL_12953126, EPI_ISL_12953226, EPI_ISL_12953163, EPI_ISL_12953161
EDIT: As others point out in the comments below, another notable branch has formed, starting at the S:444N mutation. If my proposed branch is designated, we should be careful that the other branch doesn't get misclassified.