recapitation root time auto-adjust fails with root times > 1e5, resulting in an `msprime.InputError`

tskit-dev / pyslim

Tools for dealing with tree sequences coming to and from SLiM.

MIT License

26 stars 23 forks source link

recapitation root time auto-adjust fails with root times > 1e5, resulting in an `msprime.InputError` #321

Closed petrelharp closed 11 months ago

petrelharp commented 11 months ago

The fix to the last bug has introduced a new one: at this line we check whether the time all the roots are at is "equal" to the value recorded in metadata for the total simulation time minus 0, minus 1, or minus 2. However, we check "equal" with np.allclose( ), which has by default rtol=1e-5, meaning that, for instance, np.allclose(1e5, 1e5+1) is True. =( =( =(

This can produce the error (as reported by Meaghan Clark on the SLiM list):

msprime._msprime.InputError: Input error in initialise: Attempt to sample a lineage from an inactive population

Happily, this would not cause an unwanted bottleneck; it only produces the error if things go wrong (and if there's no error, everything is fine).

petrelharp commented 11 months ago

As a workaround, users can use the following code modified from the source for pyslim.recapitate:

ancestral_Ne = 1000
root_times = list(set([ts.node(n).time for t in ts.trees() for n in t.roots]))
assert len(root_times) == 1
recap_time = root_times[0]
demography = msprime.Demography.from_tree_sequence(ts)
for pop in demography.populations:
    pop.initial_size=1.0
ancestral_name = "ancestral"
derived_names = [pop.name for pop in demography.populations]
demography.add_population(
    name=ancestral_name,
    description="ancestral population simulated by msprime",
    initial_size=ancestral_Ne,
)
demography.add_population_split(
    np.nextafter( recap_time, 2 * recap_time),
    derived=derived_names,
    ancestral=ancestral_name,
)

recap = msprime.sim_ancestry(
    initial_state = ts,
    demography=demography,
    recombination_rate=1e-8 # AND OTHER ARGUMENTS HERE
)

petrelharp commented 11 months ago

Note that msprime will only notice the discrepancy (and thus throw the error) if the tree sequence isn't totally coalesced.