Closed by FedeGueli 1 year ago
Looks like it's worth designating the outer one with 29 sequences as BV.X
Let's designate the outer one with S:444N as BV.3
One more sequence, so the total should now be 30, but I revised the little cluster with S:460K: it has two sequences, not three, so 29 is correct.
As noted in #1089 cc @AngieHinrichs: it seems that this entire sublineage is now placed under BA.5.2.24: https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice1_genome_8344_2f24c0.json?branchLabel=nuc%20mutations&c=gt-nuc_23707&label=nuc%20mutations:G22894T
and it would count 57 sequences as of today.
This may well still be a BA.5.2.20; what makes you think it's BA.5.2.24? This looks like a tree-builder error caused by homoplasy?
I don't think it's necessarily an error, just an ambiguous situation.
BA.5.2.20 is larger than BA.5.2.24 (~12k vs. ~280 excluding the new cluster) so arguably BA.5.2.20 is the more likely parent, assuming even sampling rates. One could also look at dates & geographical locations to try to determine which is more likely, although it's often still ambiguous. I thought usher/matOptimize had a tiebreaker based on number of descendants so I don't understand why matOptimize would move the cluster from BA.5.2.20 to BA.5.2.24.
Whichever you end up choosing between .20 and .24, it shouldn't affect lineage assignment for new sequences because the set of mutations in the new cluster is unambiguous.
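To make the size argument concrete, here is a rough back-of-the-envelope sketch. It assumes, as stated above, even sampling rates, and it treats lineage size as the only evidence, ignoring dates and geography. The counts are the approximate figures quoted in this thread.

```python
# Approximate lineage sizes quoted above (the new cluster is excluded).
ba_5_2_20 = 12_000
ba_5_2_24 = 280

# Crude prior odds that the cluster descends from BA.5.2.20 rather than
# BA.5.2.24, if lineage size were the only evidence (an assumption).
odds = ba_5_2_20 / ba_5_2_24
share = ba_5_2_20 / (ba_5_2_20 + ba_5_2_24)

print(f"odds ~ {odds:.0f}:1, share ~ {share:.1%}")  # odds ~ 43:1, share ~ 97.7%
```

This only illustrates why .20 is "arguably the more likely parent"; it does not resolve the ambiguity in the tree.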
This one is actually already designated as BV.2! For some reason it seems to be missing from Usher? @AngieHinrichs
I had made a mistake and forgotten to add the sequences in the designation commit, adding them as a fix later.
This one is actually already designated as BV.2! For some reason it seems to be missing from Usher?
Well crud, I just missed it. I need to automate checking that I have manually annotated all of the multi-letter lineages.
BV.2 missed the boat for pangolin-data v1.15.1 usher-mode (except the ones that will be caught by the designation hash). Looks like the sequences are assigned BA.5.2.24 (with S:K444N so at least that's flagged) by pangolin in usher mode.
So be it!
You could try to use this csv to verify the latest lineages are all included, should be doable with a simple script?
Something else that may help: is it possible to disable designation from designation hash? It would be a simple test to check whether the designated sequences are correctly assigned with designation hash off :)
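A minimal sketch of such a check, assuming the designation CSV has `taxon` and `lineage` columns and that the set of lineages annotated on the tree is available as a plain set of names. The function name and the toy data are hypothetical and not part of any existing tool.

```python
import csv
import io


def missing_lineages(designation_csv: str, annotated: set[str]) -> list[str]:
    """Return designated lineages that are absent from the annotated set."""
    reader = csv.DictReader(io.StringIO(designation_csv))
    designated = {row["lineage"] for row in reader}
    return sorted(designated - annotated)


# Toy data for illustration only; the taxon names are made up.
designations = """taxon,lineage
seqA,BA.5.2.20
seqB,BA.5.2.24
seqC,BV.2
"""
annotated_in_tree = {"BA.5.2.20", "BA.5.2.24"}

print(missing_lineages(designations, annotated_in_tree))  # ['BV.2']
```

Any lineage printed here would be one that was designated but never annotated, which is the situation BV.2 ended up in.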
Yes, simple script, and yes, pangolin has a --skip-designation-cache option.
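The test suggested above could look roughly like this: run pangolin with --skip-designation-cache on the designated sequences, then compare its report against the designations. This sketch covers only the comparison step and assumes both files are CSVs with `taxon` and `lineage` columns; the helper names and the toy rows are hypothetical.

```python
import csv
import io


def read_lineages(csv_text: str) -> dict[str, str]:
    """Map each taxon to its lineage; extra columns are ignored."""
    return {
        row["taxon"]: row["lineage"]
        for row in csv.DictReader(io.StringIO(csv_text))
    }


def mismatches(designated_csv: str, report_csv: str) -> dict[str, tuple]:
    """Return {taxon: (designated, assigned)} wherever the two disagree."""
    designated = read_lineages(designated_csv)
    assigned = read_lineages(report_csv)
    return {
        taxon: (lineage, assigned.get(taxon))
        for taxon, lineage in designated.items()
        if assigned.get(taxon) != lineage
    }


# Toy data for illustration only; the taxon names are made up.
designated_csv = "taxon,lineage\nseqA,BA.5.2.20\nseqC,BV.2\n"
report_csv = "taxon,lineage\nseqA,BA.5.2.20\nseqC,BA.5.2.24\n"

print(mismatches(designated_csv, report_csv))
```

A taxon missing from the report shows up with `None` as its assigned lineage, so dropped sequences are flagged as well as misassigned ones.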
SEE LAST COMMENT: THE USHER TREE HAS CHANGED NOW (9/10/22)
Here I want to propose a sublineage of the recently designated BA.5.2.20 lineage (which has Orf1b:1050N).
It is defined by the S:K444N mutation and stems directly from the branch carrying the BA.5.2.20-defining nucleotide mutation C23707T.
It counts 29 sequences as of today, from 11 countries and 5 continents: https://nextstrain.org/fetch/genome.ucsc.edu/trash/ct/subtreeAuspice7_genome_14417_1ac6e0.json?branchLabel=aa%20mutations&c=pango_lineage_usher&label=nuc%20mutations:G22894T
Covspectrum query: BA.5.2 (Nextclade) + G12310A, C23707T, C14649C, C11704C + S:K444N https://cov-spectrum.org/explore/World/AllSamples/Past6M/variants?aaMutations=S%3AK444N&nucMutations=G12310A%2CC23707T%2C14649C%2C11704C&nextcladePangoLineage=BA.5.2&aaMutations1=S%3A153I%2CS%3A1258Q%2CN%3A151L&
Sequence list: contributors.csv
A little branch with 3 sequences has acquired S:N460K (T22942A), N:S327L, and Orf1a:K669N.
GISAID query for this last little cluster: Spike_N460K, N_S327L, Spike_K444N