AngieHinrichs closed 3 years ago
When I run scorpio on the 253 pango-designation representative sequences for B.1.526 as described in #22 with the modified cB.1.526.json in this branch, there is still one representative sequence that is not identified as cB.1.526:
grep -v True B.1.526.reps.scorpio.report.txt.Iota__B.1.526-like__counts.csv
query,ref_count,alt_count,ambig_count,other_count,rule_count,support,conflict,call
USA/CT-JAX-JAX000283/2021,8,15,0,0,1,0.652200,0.347800,False
Increasing max_ref to 8 instead of 7 would cover that one as well, but CT-JAX-JAX000283 is a little odd... doesn't fall neatly into either branch, and I kind of wonder if it might have a bit of contamination but I don't know how to evaluate that properly.
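For reference, a minimal sketch of how the reported row relates to the `max_ref` threshold. It assumes a sequence passes only when `ref_count <= max_ref`, and that `support` and `conflict` are `alt_count` and `ref_count` divided by the total site count; both are inferred from the numbers in the row above, not taken from the scorpio source. The helper names are hypothetical, and scorpio's real call also depends on other rules that are not modeled here.

```python
import csv
import io

# The header and the one failing row from the counts CSV quoted above.
REPORT = """query,ref_count,alt_count,ambig_count,other_count,rule_count,support,conflict,call
USA/CT-JAX-JAX000283/2021,8,15,0,0,1,0.652200,0.347800,False
"""


def passes_max_ref(row, max_ref):
    """Assumed rule: at most max_ref constellation sites may carry the reference allele."""
    return int(row["ref_count"]) <= max_ref


def support_and_conflict(row):
    """Assumed definitions: alt and ref counts as fractions of all counted sites."""
    keys = ("ref_count", "alt_count", "ambig_count", "other_count")
    total = sum(int(row[k]) for k in keys)
    return int(row["alt_count"]) / total, int(row["ref_count"]) / total


row = next(csv.DictReader(io.StringIO(REPORT)))
support, conflict = support_and_conflict(row)

print(round(support, 4), round(conflict, 4))           # 0.6522 0.3478
print(passes_max_ref(row, 7), passes_max_ref(row, 8))  # False True
```

With 8 reference alleles out of 23 sites, the sequence misses `max_ref=7` by exactly one site, which is why raising the threshold to 8 would pick it up.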
Looks very sensible. I will work on a fix to allow multiple constellation files for the same VOC/VUI, since in the longer term it would be better to have a precise definition for each, just in case.
Hi @AngieHinrichs . Thanks for bringing this up. I was looking at similar issues and opened a related issue on the pangolin issue tracker. https://github.com/cov-lineages/pangolin/issues/305
I think the issue is more widespread than B.1.526, but that lineage is affected a lot.
Thanks - I'll investigate!
B.1.526/Iota is split into two major branches, one with {S:E484K, S:A701V, N:P199L, N:M234I} and the other with {S:S477N, S:Q957R, N:P13L, N:S202R}. cB.1.526.json included mutations for the first branch but not the second. Even pango-designation representative sequences for B.1.526 in the second branch were failing the max_ref=3 threshold because they have 4 reference alleles at the first branch's mutations. To cover both branches, this change adds the four mutations for the second branch and increases max_ref to 7.
Closes #22.
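The branch arithmetic in the description can be illustrated with a small sketch. It is a simplified model rather than scorpio's actual logic: it assumes a sequence passes when its reference-allele count is at most `max_ref`, it models only the eight branch-specific mutations listed above (the real constellation contains other sites as well), and it uses idealized sequences that carry exactly one branch's mutations. The function names are hypothetical.

```python
# Branch-defining mutations as listed in the description above.
BRANCH_1 = {"S:E484K", "S:A701V", "N:P199L", "N:M234I"}
BRANCH_2 = {"S:S477N", "S:Q957R", "N:P13L", "N:S202R"}


def ref_count(defined, observed):
    """Count defined mutations that the sequence lacks (reference allele at that site)."""
    return len(defined - observed)


def passes(defined, observed, max_ref):
    """Assumed rule: at most max_ref defined sites may carry the reference allele."""
    return ref_count(defined, observed) <= max_ref


# Idealized sequences carrying exactly one branch's mutations (an assumption).
branch_1_seq = set(BRANCH_1)
branch_2_seq = set(BRANCH_2)

# Old definition: first-branch mutations only, max_ref=3.
print(passes(BRANCH_1, branch_1_seq, 3))  # True
print(passes(BRANCH_1, branch_2_seq, 3))  # False: 4 reference alleles > 3

# New definition: both branches, max_ref=7.
both = BRANCH_1 | BRANCH_2
print(passes(both, branch_1_seq, 7))      # True: 4 reference alleles <= 7
print(passes(both, branch_2_seq, 7))      # True: 4 reference alleles <= 7
```

Once both branches are in the definition, every sequence necessarily shows reference alleles at the other branch's four sites, so `max_ref` has to be at least 4; the value of 7 chosen here leaves room for a few additional reference calls.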