@aineniamh I've been thinking more about usher and scorpio -- I think it's still valuable to know whether scorpio agrees with the usher call, and in case there's some future situation in which we find scorpio is doing better than usher, it would be nice to have something to grep for in the `note` column for a workaround.
So how about this: in usher mode, we still run scorpio (unless `--skip-scorpio` is given), and still report conflicts between scorpio and usher in the `note` column, but don't override the usher call?
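To make the proposal concrete, here is a rough sketch of the decision logic. This is not pangolin's actual code: the function name, its arguments, the way compatibility is checked, and the exact wording of the note are all hypothetical, chosen only to illustrate "report the conflict, keep the usher call".

```python
from typing import Optional, Set, Tuple


def resolve_usher_call(
    usher_lineage: str,
    scorpio_call: Optional[str],
    compatible_lineages: Set[str],
    skip_scorpio: bool = False,
) -> Tuple[str, str]:
    """Return (lineage, note) for one sequence in usher mode.

    Hypothetical helper: the usher lineage is always returned unchanged;
    scorpio only contributes text to the note.
    """
    # With --skip-scorpio (or no scorpio call at all) there is nothing to compare.
    if skip_scorpio or not scorpio_call:
        return usher_lineage, ""

    # Assumption: scorpio agrees if the usher lineage is among the lineages
    # considered compatible with the scorpio call.
    if usher_lineage in compatible_lineages:
        return usher_lineage, ""

    # Disagreement: record it in the note so it can be grepped later,
    # but do not override the usher call.
    note = f"scorpio call {scorpio_call} conflicts with usher lineage {usher_lineage}"
    return usher_lineage, note
```

With something like this, a sequence that usher places in BA.5 but scorpio calls BA.2-like would keep its BA.5 assignment and pick up a "conflicts with" note.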
To test this, I ran the modified `pangolin` on GISAID sequences with EPI_ISL IDs 7000000-7009999 (Delta, to get some examples of incompatible lineage calls) and 13000000-13009999 (recent Omicron, to get some examples of BA.4, BA.5 and recombinant calls that were overridden due to being called Omicron/Unassigned or BA.2 by scorpio). Then I looked at the placement in the big UShER tree of sequences that had disagreements (all of the "conflicts with" cases, but not all of the many Omicron/Unassigned ones), to make sure that the mutations in the sequences looked reasonable for their usher assignments. Some example disagreements: