tskit-dev / pyslim

Tools for dealing with tree sequences coming to and from SLiM.
MIT License
27 stars 23 forks source link

passing a rate map to `recombination_rate` does strange things #249

Closed petrelharp closed 2 years ago

petrelharp commented 2 years ago

In #248 it was pointed out that if we run this SLiM script:

initialize() {
    initializeSLiMModelType("nonWF");
    initializeSLiMOptions(keepPedigrees = T);
    initializeTreeSeq();
    initializeSex("A");

    initializeMutationType("m1", 0.5, "f", 0.0); // codominant, neutral loci.
    initializeGenomicElementType("g1", m1, 1.0); // all genomic elements are of type m1.
    initializeGenomicElement(g1, 0, 199999999); // a moderately large "chromosome" of 0.2 Gb.
    initializeMutationRate(0.0);
    initializeRecombinationRate(c(1e-8, 0.5, 1e-8, 0.5, 1e-8, 0.5, 1e-8, 0.5, 1e-8), 
c(40000000, 40000001, 80000000, 80000001, 120000000, 120000001, 160000000, 160000001, 199999999)); 
}
1 { sim.addSubpop("p1", 500); }
10 late() { sim.treeSeqOutput("test.trees"); }

and then this python script:

import math
import msprime
import pyslim

ts = pyslim.load("test.trees")
r_chrom = 1.0e-8
r_break = math.log(2)
chrom_positions = [0, 40000000, 80000000, 120000000, 160000000, 200000000]

map_positions = [chrom_positions[0], chrom_positions[1], chrom_positions[1] + 1, chrom_positions[2], 
    chrom_positions[2] + 1, chrom_positions[3], chrom_positions[3] + 1, chrom_positions[4], 
    chrom_positions[4] + 1, chrom_positions[5]]

rates = [r_chrom, r_break, r_chrom, r_break, r_chrom, r_break, r_chrom, r_break, r_chrom]
rate_map = msprime.RateMap(position=map_positions, rate=rates)
print('left', rate_map.left)
print('missing', rate_map.missing)

rts = ts.recapitate(recombination_map=rate_map, Ne=22850, random_seed=1) 
rts = pyslim.recapitate(ts, recombination_rate=rate_map, ancestral_Ne=228, random_seed=1) 

we get this error:

/home/peter/.local/lib/python3.9/site-packages/pyslim/slim_tree_sequence.py:30: FutureWarning: The SlimTreeSequence class is being phased out, as most important functionality is provided by tskit. Please see the `documentation <https://tskit.dev/pyslim/latest/previous_versions.html>`_. Please use pyslim.recapitate( ) instead.
  warnings.warn(
out: (0, 2)
Traceback (most recent call last):
  File "/home/peter/projects/tskit-dev/msprime/test.py", line 17, in <module>
    rts = ts.recapitate(recombination_rate=rate_map, Ne=22850, random_seed=1) 
  File "/home/peter/.local/lib/python3.9/site-packages/pyslim/slim_tree_sequence.py", line 451, in recapitate
    recap = msprime.simulate(
  File "/home/peter/projects/tskit-dev/msprime/msprime/ancestry.py", line 609, in simulate
    sim = _parse_simulate(
  File "/home/peter/projects/tskit-dev/msprime/msprime/ancestry.py", line 358, in _parse_simulate
    sim = Simulator(
  File "/home/peter/projects/tskit-dev/msprime/msprime/ancestry.py", line 1307, in __init__
    ll_recomb_map = self._resolve_missing_intervals(recombination_map)
  File "/home/peter/projects/tskit-dev/msprime/msprime/ancestry.py", line 1395, in _resolve_missing_intervals
    self.missing_intervals = recombination_map.missing_intervals()
  File "/home/peter/projects/tskit-dev/msprime/msprime/intervals.py", line 288, in missing_intervals
    out[:, 0] = self.left[self.missing]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

After sticking in some print statements, it turns out this is because the RateMap that msprime gets handed by pyslim has, for some reason, self.missing equal to

[[False False False False False False False False False]]

instead of

[[False False False False False False False False False]]

like it should. This is presumably because in pyslim.SlimTreeSequence.recapitate( ), passing in a RateMap to recombination_rate then uses that passed in RateMap to create a new rate map, as the recombination rate.

bhaller commented 2 years ago

It appears to say [[False False False False False False False False False]] in both places above, so... :->

mufernando commented 2 years ago

I'm not sure what the issue here is. I ran these SLiM and python script with SLiM v4, most recent pyslim and tskit (now loading the tree sequence with tskit). pyslim.recapitate worked just fine. The issue was with the old SlimTreeSequence.recapitate method that does not exist any more.

petrelharp commented 2 years ago

Yep; confirmed - works for me too. I'm going to close this; please re-open if it's still a problem!