Open astheeggeggs opened 11 months ago
This is scaling the mutation rates per site according to the number of distinct alleles, right? I think this is done now: the reference and query haplotypes are examined together, and the number of unique alleles at each site is counted.
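For illustration, the counting step could be sketched as below. This is a minimal pure-Python sketch, not the code in lshmm or tskit: the function name `count_alleles_per_site`, the list-of-lists haplotype layout, and the use of `-1` as a missing-data code are all assumptions made for the example.

```python
def count_alleles_per_site(ref_panel, query, missing=-1):
    """Count distinct alleles per site across reference and query haplotypes.

    ref_panel: list of haplotypes, each a list of integer allele codes.
    query: a single haplotype of the same length.
    missing: allele code treated as missing data and not counted (assumed).
    """
    counts = []
    for site in range(len(query)):
        # Pool the reference alleles and the query allele at this site.
        alleles = {hap[site] for hap in ref_panel}
        alleles.add(query[site])
        # Missing data is not an allele in its own right.
        alleles.discard(missing)
        counts.append(len(alleles))
    return counts


# Three reference haplotypes over three sites, plus one query haplotype.
ref_panel = [
    [0, 0, 1],
    [0, 1, 2],
    [0, 1, 0],
]
query = [0, 1, 3]
print(count_alleles_per_site(ref_panel, query))  # [1, 2, 4]
```

Note that the query contributes an allele (`3`) at the last site that is absent from the reference panel, which is why the query has to be included in the count.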
As it stands, tskit's implementation does not allow for differential emission probabilities conditional on the number of alleles.
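To make "emission probabilities conditional on the number of alleles" concrete, here is a hedged sketch of two possible conventions for a haploid site. The function name `emission_probs`, the `scale_mutation_rate` flag, and the handling of invariant sites are assumptions for this example, and neither branch is claimed to match what tskit or lshmm actually does.

```python
def emission_probs(mu, num_alleles, scale_mutation_rate=True):
    """Return (p_mismatch, p_match) for one site under one of two conventions.

    mu: mutation probability at the site.
    num_alleles: number of distinct alleles observed at the site.
    scale_mutation_rate: if True, mu is the total probability of mutating
        and is split evenly over the (num_alleles - 1) other alleles;
        if False, mu is the probability of mutating to each other allele.
    """
    if num_alleles < 2:
        # Assumed handling of invariant sites: no alternative allele exists.
        return (0.0, 1.0)
    if scale_mutation_rate:
        return (mu / (num_alleles - 1), 1.0 - mu)
    return (mu, 1.0 - (num_alleles - 1) * mu)


# With four alleles, a total rate of 0.03 gives 0.01 per alternative allele.
p_mismatch, p_match = emission_probs(0.03, 4)
print(p_mismatch, p_match)
```

In both branches the probabilities over all `num_alleles` possible emitted alleles sum to one: `(num_alleles - 1) * p_mismatch + p_match == 1`.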
There are choices here, all of which should be incorporated.
This is encoded in lshmm here:
https://github.com/astheeggeggs/lshmm/blob/792c74bd9474deef55418354fcb4b86ab9c19338/lshmm/api.py#L168C8-L168C8
with warnings thrown if the user doesn't conform to the defaults.