yatisht / usher

Ultrafast Sample Placement on Existing Trees
MIT License
120 stars, 40 forks

Ignore #360

Closed corneliusroemer closed 7 months ago

corneliusroemer commented 7 months ago

I take it back: the problem is that both 478 and 572I appear super homoplastic in XBB.1.22.1. There doesn't seem to be an obvious truth.

What I wrote but now don't think is true:

I noticed a lot of apparent homoplasy of S:T572I in FY.5 in the UShER SC2 tree and thought I'd investigate:

[image]

In the real sequences (below), there's almost no evidence of 6091T occurring together with 572I (colored red in the tree above). Almost all of the red sequences above should go into the yellow subtree. Below is a tree made by Nextclade, using simple greedy parsimony-based addition without any regrafting, which shows a clean separation: sequences with 6091T don't have 572I and vice versa:

[image]

I'm not sure why UShER places things the way it does. Maybe the early sequences had a lot of Ns, and by the time better ones came around, imputation had dug itself into the wrong valley and couldn't get out. Maybe a remove-and-replace could help here.

This particular case isn't a huge issue, but I thought it would be good to collect evidence on these sorts of very obvious errors in the tree, which should be quite easy to detect using simple algorithms and hence, in theory, allow improvement of the software in the future.
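As a rough illustration of the kind of simple check meant here, the sketch below counts how often two mutations occur together or apart across a set of sequences. It is not part of UShER or Nextclade. The function name `cooccurrence`, the input layout (sample name mapped to a set of mutation labels), and the toy data are all assumptions made up for illustration. It also assumes mutations have already been called per sample and ignores missing data (Ns).

```python
from collections import Counter

def cooccurrence(samples, mut_a, mut_b):
    """Return a 2x2 count of (has mut_a, has mut_b) across all samples.

    `samples` maps a sample name to the set of mutation labels called in it
    (hypothetical layout; adapt to however your mutation calls are stored).
    """
    counts = Counter()
    for muts in samples.values():
        counts[(mut_a in muts, mut_b in muts)] += 1
    return counts

# Toy data, invented for illustration only (not real sequences).
samples = {
    "s1": {"6091T"},
    "s2": {"6091T"},
    "s3": {"S:T572I"},
    "s4": {"S:T572I"},
    "s5": {"6091T", "S:T572I"},
    "s6": set(),
}

table = cooccurrence(samples, "6091T", "S:T572I")
for (has_a, has_b), n in sorted(table.items(), reverse=True):
    print(f"6091T={has_a!s:5} S:T572I={has_b!s:5} n={n}")
```

If the "both" cell is close to empty in the raw sequences while the tree places many samples carrying one mutation inside the clade defined by the other, that mismatch is a candidate for a closer look. As the retraction above shows, such a count alone cannot settle which placement is right when both sites are homoplastic.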