ivan-krukov / aligning-genealogies

The genealogy-coalescent alignment project
3 stars 0 forks source link

Link with msprime simulations, so that we can directly simulate+ evaluate #3

Closed sgravel closed 4 years ago

shz9 commented 4 years ago

I added a couple of methods to the Pedigree class to allow us to generate coalescent simulations on top of a particular pedigree. The method generate_msprime_simulations now provides an interface to generate simulations with the following parameters:

def generate_msprime_simulations(self,
                                 Ne=100,
                                 model='wf_ped',
                                 mu=1e-8,
                                 length=1e6,
                                 rho=1e-8):

Implementation follows example in Dom's repo. However, I keep getting the following error when trying to simulate with the genealogies we have:

LibraryError: The simulation model supplied resulted in a parent node having 
a time value <= to its child. This can occur either as a result of 
multiple bottlenecks happening at the same time or because of 
numerical imprecision with very small population sizes.

Any ideas what could be the issue? @DomNelson, @ivan-krukov

ivan-krukov commented 4 years ago

I think the issue is that my simulation and dom's code understand time in different directions. Shouldn't be too hard to fix.

shz9 commented 4 years ago

I fixed the direction of time in the from_founders method. But now I either get the error above or this error message:

~/opt/miniconda3/lib/python3.7/site-packages/msprime/simulations.py in simulate(sample_size, Ne, length, recombination_rate, recombination_map, mutation_rate, population_configurations, pedigree, migration_matrix, demographic_events, samples, model, record_migrations, random_seed, mutation_generator, num_replicates, replicate_index, from_ts, start_time, end_time, record_full_arg, num_labels, record_provenance, gene_conversion_rate, gene_conversion_track_length)
    570             sim, mutation_generator, replicate_index + 1, provenance_dict, end_time)
    571         # Return the last element of the iterator
--> 572         ts = next(iterator)
    573         for ts in iterator:
    574             continue

~/opt/miniconda3/lib/python3.7/site-packages/msprime/simulations.py in _replicate_generator(sim, mutation_generator, num_replicates, provenance_dict, end_time)
    172 
    173     for j in range(num_replicates):
--> 174         sim.run(end_time)
    175         replicate_provenance = None
    176         if encoded_provenance is not None:

~/opt/miniconda3/lib/python3.7/site-packages/msprime/simulations.py in run(self, end_time)
    869         end_time = np.inf if end_time is None else end_time
    870         logger.debug("Running simulation until maximum: %f", end_time)
--> 871         self.ll_sim.run(end_time)
    872         self.ll_sim.finalise_tables()
    873 

LibraryError: Bad simulator state. Initialise or reset must be called.

I got this error with Dom's pedigree file as well (data/wf_ped_1000probands_0_2_growthrate.txt). Any ideas what could be going on?

ivan-krukov commented 4 years ago

This should be resolved in bc1fc0cec. Simulating with Pedigree.simulate_from_founders should work. Looking into getting the balsac input and Dom's wf sim input working

ivan-krukov commented 4 years ago

Round-trip to and from BALSAC tables implemented in 2731971