astheeggeggs / lshmm

code to run Li and Stephens
MIT License

Refactor how mutation probability is set when `prob_mutation` is set to `None` #103

Closed szhan closed 2 weeks ago

szhan commented 2 weeks ago

The logic that sets the mutation probability when `prob_mutation` is `None` should be a separate function in `core.py` rather than being embedded in `set_emission_probabilities` in `api.py`. That would make it easier to test.
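A minimal sketch of what such a standalone function could look like. The name `estimate_mutation_probability` and its signature are hypothetical, not existing lshmm API, and the formula is my reading of the Li and Stephens estimate (the reciprocal of a harmonic sum, then half of theta over the haplotype count plus theta), so it should be checked against the paper before use.

```python
def estimate_mutation_probability(num_haps: int) -> float:
    """Hypothetical helper (not existing lshmm API).

    Estimates a per-site mutation probability from the number of
    haplotypes, assuming the form
        theta = 1 / sum_{m=1}^{num_haps - 1} (1 / m)
        prob  = 0.5 * theta / (num_haps + theta)
    """
    if num_haps < 2:
        # With fewer than two haplotypes the harmonic sum is empty,
        # so theta would be undefined.
        raise ValueError("num_haps must be at least 2")
    # range(1, num_haps) covers m = 1, ..., num_haps - 1.
    harmonic_sum = sum(1.0 / m for m in range(1, num_haps))
    theta = 1.0 / harmonic_sum
    return 0.5 * theta / (num_haps + theta)
```

Because the function depends only on an integer, it can be unit-tested without building a reference panel, and the caller decides which haplotype count to pass in.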

szhan commented 2 weeks ago

In the LS paper (equations A2 and A3 in the appendix), k is defined as the number of sample haplotypes. What happens when we include partial ancestral haplotypes in the reference panel? Should k be the number of sample haplotypes or the number of all haplotypes? I think k should be the number of sample haplotypes, because it is related to a measure of genetic diversity. What do you think, @astheeggeggs?
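For reference, this is the form of the estimate as I recall it, writing k for the haplotype count as above. Using the same k in both expressions is a simplifying assumption, and the exact notation and equation numbering should be checked against the appendix.

$$
\tilde{\theta} = \left( \sum_{m=1}^{k-1} \frac{1}{m} \right)^{-1},
\qquad
\mu = \frac{1}{2} \cdot \frac{\tilde{\theta}}{k + \tilde{\theta}}
$$

Under this form, a larger k gives a smaller $\tilde{\theta}$ and a smaller $\mu$, so counting ancestral haplotypes in k lowers the estimated mutation probability relative to counting sample haplotypes only.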

szhan commented 2 weeks ago

For now, I'll just set k to be the number of all haplotypes (including ancestors) when estimating the mutation probability. Otherwise, we would have to make more changes to the code.