ivan-krukov / aligning-genealogies

The genealogy-coalescent alignment project

Simulate with example BALSAC data #8

Closed ivan-krukov closed 4 years ago

ivan-krukov commented 4 years ago

We want to know how much relatedness information a genealogy contains. We have a sample BALSAC genealogy with 140 probands. One way to answer this is to count the number of coalescent events within a genealogy. This can either be done exhaustively or through simulation.

We can simulate a gene genealogy on a pedigree using Dom's fork of msprime. A simple example is in msprime_example.py, and a more complete one, pedigree_simulate.py, is in Dom's repository.

The output of the simulation is a tskit tree sequence. For every tree in the sequence, we want to know the number of coalescent events: the more coalescent events, the better. We might also be interested in the distribution of coalescent events per generation.
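As a rough sketch of the counting step, the snippet below treats every node with two or more children in a local tree as one coalescent event. That definition is an assumption, as are the function names `coalescent_event_times` and `summarize_coalescences`. It relies only on the tskit methods `ts.trees()`, `tree.nodes()`, `tree.num_children(u)` and `tree.time(u)`, and it assumes node times are measured in generations, which should hold for a pedigree simulation but is worth checking.

```python
from collections import Counter


def coalescent_event_times(tree):
    """Return (node_id, time) for every node with >= 2 children in one local tree.

    Assumption: a node with two or more children is counted as one
    coalescent event, even if more than two lineages merge there.
    """
    return [
        (u, tree.time(u))
        for u in tree.nodes()  # visits all roots if the tree has not fully coalesced
        if tree.num_children(u) >= 2
    ]


def summarize_coalescences(ts):
    """Summarize coalescent events over a tree sequence.

    Returns:
      per_tree: number of coalescent events in each local tree, in order.
      per_generation: Counter mapping node time -> number of distinct
        coalescent nodes at that time (each node counted once, even
        though it may appear in many adjacent trees).
    """
    per_tree = []
    per_generation = Counter()
    seen = set()
    for tree in ts.trees():
        events = coalescent_event_times(tree)
        per_tree.append(len(events))
        for node_id, time in events:
            if node_id not in seen:
                seen.add(node_id)
                per_generation[time] += 1
    return per_tree, per_generation
```

Given a tree sequence `ts` from the simulation, `per_tree, per_generation = summarize_coalescences(ts)` gives one count per tree plus a histogram by generation. Adjacent trees share most of their nodes, so summing `per_tree` over the whole sequence counts the same ancestor many times. The per-generation histogram therefore counts each node only once.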

andjelatodorovic commented 4 years ago

I keep getting an error saying that the msprime module does not have a Pedigree attribute. I suppose it’s just a plain import error. Has anyone else run into this?

shz9 commented 4 years ago

The main branch of msprime doesn't have the Pedigree class defined yet. You need to install the one from Dom's branch (you can do it in a virtual env if you wish):

pip install newick
pip install git+https://github.com/DomNelson/msprime.git#egg=msprime
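After installing, a quick check like the one below can confirm whether the msprime that Python actually imports exposes `Pedigree`. This is only a sketch: `module_has` is a hypothetical helper name, and it only tests that the attribute exists, not that the fork works end to end.

```python
import importlib


def module_has(module_name, attr):
    """Return True if the module can be imported and has the given attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed in this environment at all.
        return False
    return hasattr(module, attr)


# False means either msprime is missing or the installed build lacks Pedigree.
print(module_has("msprime", "Pedigree"))
```

If this prints `False` inside the virtual env where the fork was installed, the interpreter is probably picking up a different msprime installation.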