Open jeromekelleher opened 3 years ago
> Which mutation do we keep?
Any mutation where derived_state != previous state.
But I'm not sure we even want to provide this method. The only semi-legit use case I can think of is someone simulating from some strange model with lots and lots of silent mutations and wanting to remove them for efficiency. For sure people might think they want it in other situations, but I'm not convinced.
> Which mutation do we keep? Any mutation where derived_state != previous state.
Suppose we have a chain of mutations A -> A -> .... -> A. Do we keep the first or the last one?
The method is pretty low-priority for me too; I'm just opening this issue as a way to track the discussion.
> Suppose we have a chain of mutations A -> A -> .... -> A. Do we keep the first or the last one?
If the first A is the ancestral state, then we keep none of them! If it's T -> A -> A -> ... then we keep only the first one.
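To make the rule concrete, here is a minimal sketch of the "keep any mutation where derived_state != previous state" check applied to the two chains above. It uses plain Python stand-ins rather than the tskit API: the `Mutation` tuple, the use of `None` for "no parent", and the function name `non_silent_mutations` are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Optional


class Mutation(NamedTuple):
    # Hypothetical stand-in for one mutation at a single site (not the tskit class).
    parent: Optional[int]  # index of the parent mutation at this site, or None
    derived_state: str


def non_silent_mutations(ancestral_state: str, mutations: List[Mutation]) -> List[int]:
    """Return the indexes of mutations whose derived state differs from the
    state they replace (the parent's derived state, or the site's ancestral
    state when there is no parent mutation)."""
    keep = []
    for index, mutation in enumerate(mutations):
        if mutation.parent is None:
            previous_state = ancestral_state
        else:
            previous_state = mutations[mutation.parent].derived_state
        if mutation.derived_state != previous_state:
            keep.append(index)
    return keep


# Three stacked mutations, each the parent of the next, all with derived state A.
chain = [Mutation(None, "A"), Mutation(0, "A"), Mutation(1, "A")]

# Ancestral state A: the chain is A -> A -> A -> A, so every mutation is silent.
kept_when_ancestral_a = non_silent_mutations("A", chain)

# Ancestral state T: the chain is T -> A -> A -> A, so only the first is kept.
kept_when_ancestral_t = non_silent_mutations("T", chain)
```

The sketch only identifies which mutations to keep. A real implementation would also have to decide what happens to the parent pointers of kept mutations whose parent is removed, and to the metadata of the removed mutations.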
As discussed in an msprime issue (https://github.com/tskit-dev/msprime/pull/1548#issuecomment-801165185), it would be useful to have a method to remove silent mutations.
Snags:
There is also the possibility of adding this to the canonicalise operation, but on reflection I think maybe not (mostly because of the metadata question).