Closed corneliusroemer closed 2 weeks ago
The Usher tree for FW.1.1 looks like the one for GW.5.1.1, so I guess it may have a similar deletion.
I'm not sure Usher is helpful here as it completely ignores anything that happens in deletions/Ns. In both cases, there should be plenty of raw reads on SRA in case anyone wants to investigate what's going on - that should be the way to go.
The trick is ins27384C
Search for it in GISAID and you will find all lineages with a similar change that destroys the ORF6 stop codon and the ORF7a start codon.
XBB.1.28 already has a frameshift at the start of ORF6.
@ryhisner found a couple more and is still going. I hope he will update here. Once he does, I will add them to the main text.
About 50% of XBB.1.5.71 have the ORF7a-7b-8 deletion.
GISAID Query: C21691T, A13953G, C281T, -del27796_27797, -G27915T, -C27807T (356 seq)
Usher Tree: https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons2/main/XBB.1.5.71_with_7a-7b-8_Deletion_356_seq.json?c=gt-ORF1ab_6&gmax=21555&gmin=266&label=id:node_3734774
Genomes
JP.1.1 (CH.1.1.31.1.1), which is still circulating in South Africa (3/19 sequences collected in December) and has also shown up in England and Wales this month, also has the ORF7a-7b-8 deletion.
GISAID Query: G7059A, G21668A, C5183T, -G27459T, -C27807T
Usher Tree: https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons2/main/JP.1.1__7a-7b-8_Deletion.json?c=gt-nuc_21668&gmax=22668&gmin=20668&label=id:node_4351062
Genomes
There's a branch of EG.5.1, mostly in Japan, that seems to have the last half of ORF7b and ORF8 deleted. Probably about to go extinct though. The Japan sequences actually may get the deletion exactly right, but Nextclade can't read them. I recorded a bunch of info on that branch a couple months ago, but I'm not sure if I can find it again.
I think there are a couple others as well. I'll see if I can find them.
EG.5.1.12 with something resembling an ORF7b:26-ORF8:122 deletion
GISAID Query: C6543T, C28311T, A16878T, C2334T, C28909T, -G27915, -G28209C, -C28146T
Usher Tree: https://nextstrain.org/fetch/raw.githubusercontent.com/ryhisner/jsons2/main/EG.5.1.12_ORF7b-ORF8_Deletion.json?c=gt-nuc_6543&gmax=7543&gmin=5543&label=id:node_3436821
Okay, so the ORF7b-ORF8 deletion is part of EG.5.1.12. Again, a lot of the Japanese sequences might get the deletion right. GISAID registers those sequences as having T27833A, T27835A, del27837_28261 in some cases, but at other times as del27831_28049, T28053C, C28054G, G28056A, del28059_28256.
When I put one of the Japan sequences in BLAST, it looks like ∆27832-27885, ∆27896-28257. 27832 is the second nucleotide in ORF7b:26, while 28257 is the first nucleotide in the ORF8 stop codon (ORF8:122) and the last nucleotide before the N/ORF9b core TRS (AAACGAAC).
I think all these interpretations are more or less equivalent. ORF7b hits a stop codon pretty quickly and ORF9b/N gain a second TRS to go with the two overlapping ones they already have. In any case, this is a quickly disappearing lineage, but it might be worth documenting for the historical record.
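The codon positions quoted for the BLAST result above can be double-checked with a little arithmetic. This is only a minimal sketch: the ORF start/end coordinates are the Wuhan-Hu-1 (NC_045512.2) annotation, and `codon_position` is a hypothetical helper name, not part of any existing tool.

```python
# ORF coordinates on the Wuhan-Hu-1 reference (NC_045512.2), 1-based inclusive.
# The stop codon is included in each range.
ORFS = {
    "ORF7b": (27756, 27887),
    "ORF8": (27894, 28259),
}


def codon_position(orf, nuc):
    """Return (codon_number, position_within_codon), both 1-based,
    for a reference nucleotide coordinate inside the given ORF."""
    start, end = ORFS[orf]
    if not start <= nuc <= end:
        raise ValueError(f"{nuc} is outside {orf} ({start}-{end})")
    offset = nuc - start
    return offset // 3 + 1, offset % 3 + 1


print(codon_position("ORF7b", 27832))  # second nucleotide of ORF7b:26
print(codon_position("ORF8", 28257))   # first nucleotide of ORF8:122 (stop)
```

Under these coordinates, 27832 falls at codon 26, position 2 of ORF7b, and 28257 at codon 122, position 1 of ORF8, which matches the description above.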
thank you Ryan!
There's a ~53-sequence branch of FY.1.2 with this deletion, including a 20-sequence branch with S:N481K and N:A267V that includes a recent sequence from a traveler from Colombia sequenced in the US (by Ginkgo Bozoworks, unfortunately, so bad quality).
The S:N481K-N:A267V branch has 20 sequences, but only three of them are on GISAID, mostly from New York, USA. It seems one of the New York labs doesn't upload to GISAID. 13/20 sequences on this branch were collected in November or December.
Earliest sequence: 2023-6-26 - Israel, EPI_ISL_17953535
Most recent sequence: 2023-12-21 - Colombia (Traveler to USA), EPI_ISL_18710009
Continents circulating: Asia (29), Europe (4), North America (2), South America (1)
Countries circulating: Israel (29), USA (2), Colombia (1, traveler to USA), England (1), Finland (1), Germany (1), Wales (1)
Number of Sequences: 36 (subtracting one Ginkgo Bozoworks pooled sample)
GISAID Nucleotide Query: G18811A, C14120T, A13947C, C4795T, A22161G, -del27795, -A21794C
Wider View
S:N481K-N:A267V Branch (85% not on GISAID)
Genomes
@corneliusroemer new one: a sibling of GE.1.2.1 but without 376S and the 478Rev. Likely there was a common ancestor with it: https://github.com/sars-cov-2-variants/lineage-proposals/issues/1340
@corneliusroemer @ryhisner spotted another one: JB.2 undesignated branch in South Africa
Designated GE.1.2.2
No new seqs
ins27384C
Only 12 samples so far with the large deletion in ORF7a/7b/8 on a JN.1 backbone (using the query by @NkRMnZr): EPI_ISL_18818052, EPI_ISL_18836042, EPI_ISL_18902167, EPI_ISL_18910469, EPI_ISL_18931552, EPI_ISL_18932329, EPI_ISL_18932331, EPI_ISL_18951356, EPI_ISL_18981826, EPI_ISL_19095772, EPI_ISL_19136921, EPI_ISL_19175463
Interestingly, two were collected in May, both in fast lineages: LB.1 with S:S31del in Spain (query: ins27384C, t3565C, C19512T, A28993C) and KP.3.2 without S:S31del in Canada (query: ins27384C, t3565C, G3871T).
cc @ryhisner did you spot any others?
No recent lineage with it, apparently. Closing this.
Inspired by @cassiawag's work on large deletions in ORF7/8 often masquerading as stretches of Ns in assemblies rather than as deletions, I had a look through currently submitted sequences to see if I can find more examples.
@ryhisner has already pointed out that GW.5.1.1 seems to have a large ORF8 deletion.
I thought it might be useful to add further examples here. There's a potential for a simple tool to be written that would autodetect such suspicious stretches of Ns and be able to distinguish between amplicon dropout and deletions, e.g. by checking that the same stretch of N is observed in almost all sequences from a lineage and also from a number of labs/countries etc.
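As a rough illustration of the tool idea described above, here is a minimal sketch in Python. It assumes the sequences of one lineage have already been aligned to the reference (so N-run coordinates are comparable between sequences) and that each one comes with a submitting-lab label. The function names, the thresholds and the exact-coordinate matching are all assumptions for illustration, not an existing implementation.

```python
import re
from collections import Counter


def n_runs(seq, min_len=50):
    """Return 1-based inclusive (start, end) coordinates of every run of
    N at least min_len long in a reference-aligned sequence."""
    pattern = r"[Nn]{%d,}" % min_len
    return [(m.start() + 1, m.end()) for m in re.finditer(pattern, seq)]


def shared_n_runs(records, min_len=50, min_fraction=0.9, min_labs=2):
    """records: iterable of (lab, aligned_sequence) pairs for ONE lineage.

    Return the N runs that occur with identical coordinates in at least
    min_fraction of the sequences and come from at least min_labs labs.
    Such runs are candidates for a real deletion rather than sporadic
    amplicon dropout (a heuristic only, not proof)."""
    records = list(records)
    counts = Counter()
    labs = {}
    for lab, seq in records:
        for run in set(n_runs(seq, min_len)):
            counts[run] += 1
            labs.setdefault(run, set()).add(lab)
    total = len(records)
    return sorted(
        run
        for run, count in counts.items()
        if total and count / total >= min_fraction and len(labs[run]) >= min_labs
    )


# Toy example: two labs share the same 60-nt N stretch, one sequence lacks it.
masked = "A" * 10 + "N" * 60 + "C" * 10
toy = [("lab1", masked), ("lab2", masked), ("lab3", "A" * 80)]
print(shared_n_runs(toy, min_fraction=0.6))
```

Exact coordinate matching is deliberately strict here. Real assemblies mask slightly different ranges, so a usable version would probably need to merge overlapping runs, and as noted, a shared primer scheme can produce the same dropout across many samples, so agreement across labs is suggestive rather than conclusive.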
I found:
I didn't expect them to have the same stretch - maybe these macro deletions are due to amplicon dropout in mostly English samples after all. Could be that there is a real short indel that throws primers off in that region.
Edited by mod.: Lineages spotted by the community (see comments below): FW.1.1, JP.1.1, XBB.1.5.71 (50%), EG.5.1.12
Edited: suggested query by @NkRMnZr: