Closed: silcn closed this issue 2 years ago
Man, we must be on the same wavelength, Silcn, because I noticed one sequence of this yesterday. I wasn't quite sure what to make of it, or whether it was real, coming from India, as I don't really have the expertise to judge such things. Excellent spot, mate.
Now found in Germany: EPI_ISL_13378378, EPI_ISL_13378924
And also in Canada: EPI_ISL_13392500
EPI_ISL_13389935 is another Indian sequence, with NNNs covering the locations of all the S mutations, but it shares all the non-S mutations.
More Indian sequences: EPI_ISL_13409385, EPI_ISL_13409444, EPI_ISL_13409465
Hi @silcn, this morning I found a cluster in Germany with similar mutations (2 out of 8), S:R346K, S:H1101Y, S:460K, S:K147E, and the 493 reversion:
EPI_ISL_13380484 EPI_ISL_13344896 EPI_ISL_13393217 EPI_ISL_13387937 EPI_ISL_13387015 EPI_ISL_13385893 EPI_ISL_13382835 EPI_ISL_13269776 EPI_ISL_13269123 EPI_ISL_13382561 EPI_ISL_13382506
I can't find them on Usher, but maybe you can check whether they are related or not.
@FedeGueli good spot! Those aren't related - they look like a BA.2/BA.1/BA.2 double-breakpoint recombinant with the first breakpoint between 12880 and 15240, and the second somewhere between 21641 and 21846, i.e. between S:27 and S:95 (S:69/70del is present but not S:A67V, so it's not clear whether that bit comes from BA.1 or BA.2). The BA.2 bit in the spike has those extra mutations, as you say.
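The step from nucleotide coordinates to spike codons in the comment above can be reproduced with a short sketch. It assumes 1-based positions on the Wuhan-Hu-1 reference (NC_045512.2), where the spike coding sequence spans 21563 to 25384; the function name `spike_codon` is made up for illustration.

```python
# Assumption: 1-based coordinates on Wuhan-Hu-1 (NC_045512.2),
# where the spike (S) coding sequence spans 21563..25384.
S_START = 21563
S_END = 25384


def spike_codon(nuc_pos: int) -> int:
    """Return the 1-based spike codon containing genome position nuc_pos."""
    if not S_START <= nuc_pos <= S_END:
        raise ValueError(f"{nuc_pos} is outside the spike coding sequence")
    return (nuc_pos - S_START) // 3 + 1


if __name__ == "__main__":
    # Second breakpoint window quoted in the comment above.
    for pos in (21641, 21846):
        print(f"{pos} -> S:{spike_codon(pos)}")
```

The same arithmetic is consistent with the defining mutations quoted at designation: A22001G falls in codon 147 and T22016C in codon 152.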
Although this is still fairly limited in absolute sequence number, the divergent mutation profile, the wide geographical spread, and the speed with which new sequences have emerged make me think this should get designated pretty soon to facilitate its monitoring.
@silcn most of these sequences have T5386G, and 2 of them also have A11537G though. I mean those brought up by @FedeGueli
@c19850727 hmmm, so they do. Doesn't seem too implausible for that to be real if a chronic infection was involved, as suggested by the S mutations? Multiple labs are involved so I doubt it's contamination.
I ran sc2rf on the German ones. I think the 2 breakpoints are between 13195 and 15240 and between 21618 and 21762. Same regions as @silcn suggested, just narrowed the window a bit. But loads of private mutations, and all with an extra BA.1 T5386G and 2 with an extra BA.1 A11537G. Maybe something to keep an eye on and to open up its own issue?
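If both sets of breakpoint estimates in this thread are taken at face value, the region they agree on is simply the overlap of the two windows. A minimal sketch, with the helper name `overlap` chosen here for illustration (it is not part of sc2rf or any other tool mentioned in the thread):

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # inclusive (start, end) in reference coordinates


def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the inclusive overlap of two intervals, or None if disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


if __name__ == "__main__":
    # Windows quoted by @silcn and then from the sc2rf run, in that order.
    print("first breakpoint:", overlap((12880, 15240), (13195, 15240)))
    print("second breakpoint:", overlap((21641, 21846), (21618, 21762)))
```

This only combines the numbers as posted; it says nothing about which marker mutations each estimate relied on.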
@JosetteSchoenma please do it! You did the analysis, I just found it. It is worth flagging. I will monitor it; at least then we can close the issue if nothing new pops up.
Thanks for submitting. We've added lineage BA.2.75 with 3 newly designated sequences. Defining mutation(s) A22001G (S:K147E), T22016C (S:W152R).
@InfrPopGen the 4 sequences you've added to lineages.csv contain India/MH-INSACOG-CSIR-NEERI1939/2022 twice, which I presume is an error?
Thank you @silcn I've deleted the duplicate line!
Among the 20 new BA.2.75 uploads from Maharashtra today is an apparent outlier sequence, EPI_ISL_13502528, which has S:147E, 210V, 257S, 339H, 493Q and ORF1a:1221L but is missing the other defining mutations and instead has some more of its own (none in Spike though). It is classified as BA.2 rather than BA.2.75 by Usher. If this spreads then it could potentially deserve a separate designation; I'll keep looking out for more.
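To make the outlier comparison concrete, here is a rough sketch of the "N-of" matching idea used in the cov-spectrum query posted in this issue. The mutation list is copied from that query; the outlier set contains only the lineage mutations named in the comment above (its private mutations are not listed there), and the function names are illustrative.

```python
# The 11 mutations from the "6-of" cov-spectrum query in this issue.
QUERY_MUTATIONS = frozenset({
    "S:147E", "S:152R", "S:157L", "S:210V", "S:257S", "S:339H",
    "S:446S", "S:460K", "ORF1a:1221L", "ORF1a:1640S", "ORF1a:4060S",
})


def count_matches(seq_mutations, reference=QUERY_MUTATIONS) -> int:
    """Count how many reference mutations are present in a sequence."""
    return len(set(seq_mutations) & reference)


def matches_n_of(seq_mutations, n=6, reference=QUERY_MUTATIONS) -> bool:
    """True if the sequence carries at least n of the reference mutations."""
    return count_matches(seq_mutations, reference) >= n


# Lineage mutations reported for EPI_ISL_13502528 in the comment above.
outlier = {"S:147E", "S:210V", "S:257S", "S:339H", "S:493Q", "ORF1a:1221L"}

if __name__ == "__main__":
    print(count_matches(outlier), matches_n_of(outlier))
```

S:493Q is not part of the query list, so on these inputs the outlier carries 5 of the 11 query mutations and would fall just short of a 6-of threshold.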
Another of the new Maharashtra sequences (EPI_ISL_13502546) has S:681R and ORF8:27* - something else to keep an eye on...
@shay671 I count at least 46, 47 if you allow EPI_ISL_13502528. NSP3_P822S, NSP8_N118S in GISAID is good at picking them up for now.
As this lineage is getting quite a bit of attention I've hidden a couple of comments earlier on about an unrelated recombinant to prevent confusion - please do feel free to open a new designation issue if the recombinant continues to grow
> NSP3_P822S, NSP8_N118S in GISAID is good at picking them up for now.
Of course today we get some sequences missing NSP8_N118S... not sure there's a single GISAID query that captures everything anymore, but by my count 13 new BA.2.75 sequences were uploaded from India today.
Here is the case count by country/region for the 53 cases I found yesterday.
| Country | Region | Cases |
| -- | -- | -- |
| India | Haryana | 3 |
| | Himachal Pradesh | 3 |
| | Jammu and Kashmir | 1 |
| | Karnataka | 10 |
| | Maharashtra | 23 |
| Germany | Baden-Wurttemberg | 1 |
| | Rhineland-Palatinate | 1 |
| UK | England | 5 |
| Canada | Alberta | 1 |
| | Ontario | 1 |
| Australia | Victoria | 2 |
| New Zealand | | 2 |
Proposal for a sublineage of BA.2
Earliest sequence: 2022-06-02 (India)
Countries detected: India (5 seq, from 3 states)

Defining mutations:
S:K147E, W152R, F157L, I210V, G257S, D339H (mutated from G339D), G446S, N460K, R493Q (reversion)
ORF1a:S1221L, P1640S, N4060S
ORF1b:G662S
E:T11A
Don't think much needs to be said to explain why I'm proposing this. Very recent, long branch with 9 new spike mutations, detection in multiple states that aren't all close together (Maharashtra, Karnataka, Jammu and Kashmir). I expect quite a few people have been monitoring it :)
Usher tree is a bit messy because of some poor quality sequences, particularly the one from Jammu and Kashmir which has multiple artefactual reversions. As a result this lineage is placed on a branch with a couple of Indian sequences with reversions at S:954 (also probably erroneous - apparent reversions at this site seem to crop up a lot in Indian sequences). Despite what the tree shows, the evidence is currently consistent with all 9 S mutations having appeared on the same long branch. As usual, the 2nt mutation at S:339 is mislabelled.
https://nextstrain.org/fetch/github.com/silcn/subtreeAuspice1/raw/main/auspice/subtreeAuspice1_genome_42b02_161030.json?branchLabel=Spike%20mutations&c=gt-S_446&label=nuc%20mutations:C3796T,C3927T,C4586T,C5183T,A12444G,G22577C,G22898A,T22942G,G23040A,A26275G
Genomes: EPI_ISL_13302209 EPI_ISL_13302252 EPI_ISL_13373059 EPI_ISL_13373170 EPI_ISL_13375776
Edit: cov-spectrum query: https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants?variantQuery=%5B6-of%3A+S%3A147E%2C+S%3A152R%2C+S%3A157L%2C+S%3A210V%2C+S%3A257S%2C+S%3A339H%2C+S%3A446S%2C+S%3A460K%2C+ORF1a%3A1221L%2C+ORF1a%3A1640S%2C+ORF1a%3A4060S%5D&
This misses some sequences whose collection date is given only to the month.
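For anyone wanting to tweak the threshold or the mutation list, the link above can be rebuilt programmatically. This is a sketch that only mirrors the URL layout seen in that link; the base path and the `variantQuery` parameter are copied from it, the helper names are made up here, and whether cov-spectrum still accepts this exact form has not been checked.

```python
from urllib.parse import urlencode

# Base path copied from the link posted in this issue (not verified to be current).
BASE = "https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants"

MUTATIONS = [
    "S:147E", "S:152R", "S:157L", "S:210V", "S:257S", "S:339H",
    "S:446S", "S:460K", "ORF1a:1221L", "ORF1a:1640S", "ORF1a:4060S",
]


def n_of_query(n: int, mutations) -> str:
    """Build an '[n-of: a, b, c]' query string as it appears in the link."""
    return f"[{n}-of: {', '.join(mutations)}]"


def cov_spectrum_url(n: int, mutations) -> str:
    """Percent-encode the query and attach it as the variantQuery parameter."""
    return BASE + "?" + urlencode({"variantQuery": n_of_query(n, mutations)})


if __name__ == "__main__":
    print(cov_spectrum_url(6, MUTATIONS))
```

Lowering `n` makes the query more tolerant of sequences with dropouts at some sites, at the cost of pulling in more unrelated BA.2 sequences.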