ashander / ftprime

Forward-time simulation of the msprime data structure (for development)

ftprime

An earlier version of this code was mentioned during the Evolution 2017 talk:

Ashander, McCartney-Melstad, Ralph, Shaffer (2017) "Using Genomic Data to Inform Population Viability in a Long-Lived Endangered Vertebrate".

Note that the code is pre-production, but if you're building on the ideas here, please cite this repository using the DOI: DOI

The current version of this package (on HEAD) works with msprime version 0.6.1 (current as of Oct 17 2018). For the earlier version, used in the paper, get tag 0.0.6.

Contents

The purpose of this package is to provide Python code that makes it easy to store ancestry information from a forward-time, individual-based simulation in msprime's "tree sequence" data format, so that after the simulation we can

  1. have the entire tree sequence (equivalent to the ARG) of the final generation,
  2. put down neutral mutations on the tree sequence afterwards without carrying them along in the simulation, and
  3. use msprime to efficiently store results and quickly compute statistics.

We also provide a class to facilitate doing this with simuPOP.
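To make the three goals above concrete, here is a minimal, standard-library-only sketch of the idea. It is not ftprime's or msprime's API: the `Edge` record, the haploid toy model with one crossover per birth, and every function name below are assumptions made for illustration. The sketch records who inherited which genome interval from whom during a forward-time run, reads off the genealogy of the final generation at one position, and only afterwards places neutral mutations on the recorded edges.

```python
import random
from collections import namedtuple

# Hypothetical record: `child` inherited the half-open genome interval
# [left, right) from `parent`. Not ftprime's or msprime's actual types.
Edge = namedtuple("Edge", ["left", "right", "parent", "child"])


def simulate(num_individuals, num_generations, seed=1):
    """Haploid Wright-Fisher toy model on the genome [0, 1), one crossover per birth."""
    rng = random.Random(seed)
    current = list(range(num_individuals))            # founder node IDs
    time_ago = {node: num_generations for node in current}
    edges = []
    next_id = num_individuals
    for generation in range(1, num_generations + 1):
        offspring = []
        for _ in range(num_individuals):
            child, next_id = next_id, next_id + 1
            time_ago[child] = num_generations - generation
            mom, dad = rng.choice(current), rng.choice(current)
            crossover = rng.random()
            # Left of the crossover comes from mom, the rest from dad.
            for left, right, parent in ((0.0, crossover, mom), (crossover, 1.0, dad)):
                if left < right:                      # skip empty intervals
                    edges.append(Edge(left, right, parent, child))
            offspring.append(child)
        current = offspring
    return edges, time_ago, current


def local_tree(edges, position):
    """Map each child to its parent in the genealogy at one genome position."""
    return {e.child: e.parent for e in edges if e.left <= position < e.right}


def lineage(tree, node):
    """Follow parent pointers from `node` back to a founder."""
    path = [node]
    while path[-1] in tree:
        path.append(tree[path[-1]])
    return path


def drop_mutations(edges, rate, seed=2):
    """Place neutral mutations after the fact, as a Poisson process along each edge.

    Every edge here spans exactly one generation, so `rate` is the expected
    number of mutations per unit of genome per generation (must be > 0).
    """
    rng = random.Random(seed)
    mutations = []                                    # (position, node) pairs
    for e in edges:
        position = e.left + rng.expovariate(rate)
        while position < e.right:
            mutations.append((position, e.child))
            position += rng.expovariate(rate)
    return mutations


edges, time_ago, samples = simulate(num_individuals=10, num_generations=20)
tree = local_tree(edges, position=0.5)
roots = {lineage(tree, sample)[-1] for sample in samples}
mutations = drop_mutations(edges, rate=0.5)
print(len(edges), "edges;", len(roots), "founder lineage(s) at 0.5;", len(mutations), "mutations")
```

The sketch keeps every edge, including those leading to individuals with no descendants in the final generation, and it places mutations on all of them. Discarding that unneeded history and storing the rest compactly is the part that the tree sequence format is meant to handle.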

Core module:

Tests:

Development


Clone:

git clone https://github.com/ashander/ftprime.git
cd ftprime

For best results, use miniconda, which provides the command line dependency manager conda. Once you have it installed, make a new environment to do development:

conda config --add channels conda-forge
conda env create -f environment.yml -n ftprime python=3.5
source activate ftprime  # Enter the development environment

Install ftprime in locally editable (-e) mode and run the tests. After the pip command you should see a bunch of messages about requirements already satisfied (because you installed them with conda, above):

pip install -e .[dev]  # Don't need the [dev] if you used conda above
pytest

Earlier work

Legacy interface:

Documentation of the problem and the methods:

Since there are many different ways to store an ARG as a set of coalescence records, a good deal of this documentation is devoted to describing and verifying msprime's requirements for such a set, and to thinking about different ways of storing one.
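As a rough illustration of both points, the sketch below stores the same small genealogy in two different ways and runs a few sanity checks on each. The `Record` layout, the function names, and the three checks are assumptions chosen for the example. They are conditions any such set plausibly has to satisfy (non-empty intervals, parents older than their children, no child inheriting the same stretch of genome twice), not a statement of msprime's full requirements.

```python
from collections import namedtuple

# Hypothetical record layout, loosely modelled on a coalescence record:
# `children` (a tuple of node IDs) inherit [left, right) from `parent`.
Record = namedtuple("Record", ["left", "right", "parent", "children"])


def check_records(records, node_time):
    """Return a list of problems found; an empty list means all checks passed.

    The three conditions are illustrative assumptions, not msprime's full list.
    """
    problems = []
    inherited = {}                                    # child -> [(left, right), ...]
    for r in records:
        if not r.left < r.right:
            problems.append("empty interval: %r" % (r,))
        for child in r.children:
            if not node_time[r.parent] > node_time[child]:
                problems.append("parent %r is not older than child %r" % (r.parent, child))
            inherited.setdefault(child, []).append((r.left, r.right))
    for child, intervals in inherited.items():
        intervals.sort()
        # After sorting by left end, any overlap shows up between neighbours.
        for (_, right), (left, _) in zip(intervals, intervals[1:]):
            if left < right:
                problems.append("child %r inherits overlapping intervals" % (child,))
    return problems


def squash(records):
    """Merge records that share parent and children and abut on the genome."""
    merged = []
    for r in sorted(records, key=lambda r: (r.parent, r.children, r.left)):
        last = merged[-1] if merged else None
        if last and (last.parent, last.children) == (r.parent, r.children) and last.right == r.left:
            merged[-1] = last._replace(right=r.right)
        else:
            merged.append(r)
    return merged


node_time = {0: 0.0, 1: 0.0, 2: 1.0}                  # node ID -> time ago
whole = [Record(0.0, 1.0, 2, (0, 1))]
split = [Record(0.0, 0.4, 2, (0, 1)), Record(0.4, 1.0, 2, (0, 1))]
bad = [Record(0.0, 0.6, 2, (0, 1)), Record(0.4, 1.0, 2, (0,))]

for name, records in (("whole", whole), ("split", split), ("bad", bad)):
    print(name, check_records(records, node_time))
```

Here `whole` and `split` describe the same ancestry, and `squash(split)` recovers `whole`, which is one way to see that a record set is not unique. The `bad` set fails because node 0 is assigned the interval [0.4, 0.6) twice.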