Closed szhan closed 2 weeks ago
In the LS paper (equations A2 and A3 in the appendix), k is defined as the number of sample haplotypes. What happens when we include partial ancestral haplotypes in the ref. panel? Should k be the number of sample haplotypes or all haplotypes? I think k should be the number of sample haplotypes, because it is related to a measure of genetic diversity. What do you think, @astheeggeggs?
For now, I'll just set k to be the number of all haplotypes (including ancestors) when getting estimates of the mutation probability. Otherwise, we have to make more changes to the code.
It should be a separate function in core.py rather than being embedded in set_emission_probabilities in api.py. It makes it easier to do testing with it.
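A rough sketch of what such a standalone function might look like, so the two choices of k can be compared directly. The function name `estimate_mutation_probability` and its signature are hypothetical, not the existing API. It also assumes the estimator takes the Li and Stephens-style form theta = 1 / sum_{i=1}^{k-1} (1/i) and mu = 0.5 * theta / (k + theta); that form should be checked against equations A2 and A3 before relying on it.

```python
def estimate_mutation_probability(num_haps: int) -> float:
    """Estimate a per-site mutation probability from the number of haplotypes.

    Hypothetical helper: the name, signature, and exact formula are
    assumptions for illustration, not the library's confirmed API.
    """
    if num_haps < 2:
        # With fewer than two haplotypes the harmonic sum below is empty.
        raise ValueError("Need at least 2 haplotypes.")
    # Harmonic sum over i = 1, ..., num_haps - 1.
    harmonic_sum = sum(1.0 / i for i in range(1, num_haps))
    theta_tilde = 1.0 / harmonic_sum
    return 0.5 * theta_tilde / (num_haps + theta_tilde)


# Compare the two candidate definitions of k (the counts are made up).
num_samples = 10
num_ancestors = 5
mu_samples_only = estimate_mutation_probability(num_samples)
mu_all_haps = estimate_mutation_probability(num_samples + num_ancestors)
```

Under these assumptions, counting ancestors in k gives a smaller estimate than counting samples only, since the estimate decreases as k grows. Having the calculation in its own function means this can be unit tested without building a full reference panel.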