corneliusroemer / pango-sequences

Consensus sequences for each Pango lineage

B.1.617 new mutations in pango-consensus-sequences_summary.json #10

Open ryhisner opened 2 weeks ago


I'm trying to get a list of the "new" mutations in B.1.617.1, B.1.617.2, and B.1.617.3, including the new mutations that they share. The pango-consensus-sequences_summary.json file lists some new mutations for each of B.1.617.1, B.1.617.2, and B.1.617.3, but none of the mutations they share appear there. I expected to be able to get their shared mutations from the list of B.1.617 "new" mutations, but the B.1.617 entry is full of artifacts and even frameshifts. Here's the list of B.1.617 new mutations and deletions:

    "nucSubstitutionsNew": [ "C5700A", "A19321C", "C19322A", "G19327A", "T19332C", "C20384T", "T20949C", "A21215G", "G21987A", "T22917G", "G23012C", "C25469T", "C26681T", "T26767G", "C28170T", "G28209C", "G29737T" ],
    "aaSubstitutionsNew": [ "M:I82S", "ORF1a:A1812D", "ORF1b:T1952Q", "ORF1b:A1954T", "ORF1b:A2306V", "ORF1b:H2583R", "ORF3a:S26L", "ORF8:P93S", "ORF8:E106Q", "S:G142D", "S:L452R", "S:E484Q" ],
    "nucDeletionsNew": [ "28881-28883", "28915", "29555", "29692" ],
    "aaDeletionsNew": [ "N:G204-", "N:G214-" ],

Would it be possible to either eliminate B.1.617 and include the mutations shared among B.1.617.1, B.1.617.2, and B.1.617.3 among each lineage's new mutations, or else correctly list the new mutations of B.1.617 as all the mutations shared by B.1.617.1, B.1.617.2, and B.1.617.3?

Thanks, and sorry to complain about this. I just don't know any other way to get an accurate list of the mutations I'm looking for.

If there were an all-lineage version of the 21L tree, that would be incredible and would forever end the troubles I have with these things. But I'm guessing that would be an enormous undertaking and isn't likely to ever exist.