Open hyanwong opened 2 years ago
@szhan - you might like to have a go to see if `build_simulated_ancestors()` even works for you. Until we figure out https://github.com/tskit-dev/tsinfer/issues/11, you'll need to test it on pure SMC tree sequences, though (run msprime with `model="smc"`). If you do try it, please report any problems in this issue.
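
In case it helps, here is a minimal sketch of generating a pure SMC tree sequence to test on, assuming msprime 1.x. The helper name `simulate_smc` and all parameter values (sample count, sequence length, rates, population size) are illustrative assumptions, not settings taken from this issue.

```python
def simulate_smc(num_samples=10, seed=42):
    """Simulate a tree sequence under the SMC model and add mutations.

    All numeric values here are placeholders chosen for illustration.
    """
    import msprime  # imported lazily so the helper can be defined without msprime

    ts = msprime.sim_ancestry(
        samples=num_samples,       # diploid individuals by default (ploidy=2)
        sequence_length=100_000,
        recombination_rate=1e-8,
        population_size=10_000,
        model="smc",               # pure SMC, as suggested above
        random_seed=seed,
    )
    # Overlay mutations so there are variable sites to run inference on.
    return msprime.sim_mutations(ts, rate=1e-8, random_seed=seed)
```

Calling `simulate_smc()` should return a tree sequence with `2 * num_samples` sample nodes, which can then be converted to sample data for `tsinfer`.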
For testing purposes, it can be useful to run an inference with "perfect ancestors". For example, I suggested this to @szhan as a way to check whether the ancestor generation step in `tsinfer` is the main cause of the problems with imputation accuracy.

We can build perfect ancestors using `build_simulated_ancestors` in `eval_util.py`. It might be useful to document and possibly expose this function, and to give an example use case.

For testing purposes, it would also be nice to extend the function to create slightly longer ancestors, filling in some of the flanking regions using the normal ancestor builder. This would be a way to ensure that we aren't unintentionally giving the algorithm any extra clues as to the position of breakpoints.
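
To make the flank-extension idea concrete, here is a rough pure-Python sketch. It does not use the `tsinfer` API: the function `extend_ancestor`, the `filler` array (standing in for whatever states the normal ancestor builder would produce), and the `flank` parameter are all hypothetical names introduced for illustration.

```python
MISSING = -1  # marks sites where the ancestor is not defined


def extend_ancestor(haplotype, start, end, filler, flank):
    """Extend an ancestor by up to `flank` sites on each side.

    haplotype: full-length list of allelic states, MISSING outside [start, end)
    filler:    full-length list of states to copy into the new flanks
               (hypothetical stand-in for the normal ancestor builder's output)
    Returns (new_start, new_end, new_haplotype).
    """
    num_sites = len(haplotype)
    if len(filler) != num_sites:
        raise ValueError("filler must cover the same sites as haplotype")
    if not 0 <= start < end <= num_sites:
        raise ValueError("need 0 <= start < end <= number of sites")

    # Clamp the extended interval to the ends of the sequence.
    new_start = max(0, start - flank)
    new_end = min(num_sites, end + flank)

    extended = list(haplotype)
    # Fill only the newly added flanks; the simulated core is left untouched.
    for site in list(range(new_start, start)) + list(range(end, new_end)):
        extended[site] = filler[site]
    return new_start, new_end, extended


# Example: a perfect ancestor covering sites 2..4, extended by 2 sites per side.
perfect = [MISSING, MISSING, 1, 0, 1, MISSING, MISSING, MISSING]
builder_guess = [0, 1, 0, 0, 0, 1, 1, 0]
new_start, new_end, extended = extend_ancestor(perfect, 2, 5, builder_guess, 2)
```

With the flanks filled in this way, the ancestor's ends no longer coincide exactly with the true simulated breakpoints, which is the point of the proposal.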