tskit-dev / pyslim

Tools for dealing with tree sequences coming to and from SLiM.
MIT License
27 stars 23 forks source link

Recapitation for complex demography with populations that died out #343

Closed Luker121 closed 3 months ago

Luker121 commented 4 months ago

Hi,

I have a slim script, that simulates a complex demography with several splits and new establishment of metapopulations. I have attached the slim script in a .txt file: slim_contModel_example.txt

I create an ancestral metapopulation under an island model with a given migration rate (mMeta). Here I give a short code snippet for this:

1 early() {
    // Create the ancestral population

    //set metapop ancestral    
    for (i in 1:5)
        sim.addSubpop(i, NAnc);

    p1.setMigrationRates(c(p2,p4,p5), c(mMeta,mMeta,mMeta));
    p2.setMigrationRates(c(p1, p3), c(mMeta, mMeta));
    p3.setMigrationRates(c(p2, p4), c(mMeta, mMeta));
    p4.setMigrationRates(c(p3, p5), c(mMeta, mMeta));
    p5.setMigrationRates(c(p3,p4,p1), c(mMeta, mMeta, mMeta));

    // add selfing rate 
    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);
}

2 early() {
        cat("Current generation: " + sim.cycle + "\n");
    community.rescheduleScriptBlock(s1, start=start_splitAnc, end=start_splitAnc);
    community.rescheduleScriptBlock(s2, start=start_splitAnc+1, end=start_splitAnc+1);
    community.rescheduleScriptBlock(s3, start=Split_Sagit_gen, end=Split_Sagit_gen);
    community.rescheduleScriptBlock(s4, start=Split_Sagit_gen+1, end=Split_Sagit_gen+1);
    community.rescheduleScriptBlock(s5, start=Split_Nemo_gen, end=Split_Nemo_gen);
    community.rescheduleScriptBlock(s6, start=Split_Nemo_gen+1, end=Split_Nemo_gen+1);
    community.rescheduleScriptBlock(s7, start=start_recGeneFlow, end=start_recGeneFlow);
    community.rescheduleScriptBlock(s8, start=today_gen, end=today_gen);
}

Later this ancestral metapopulation dies out by setting the population size to 0. And in the end, I will output the tree sequences from the whole demography.

// ---- EXTINCTION ANCESTRAL POP ----

s2 205 late() {
    //remove ancestral pop
    cat("Extinction ancestral pop" + "\n");
    p1.setSubpopulationSize(0);
    p2.setSubpopulationSize(0);
    p3.setSubpopulationSize(0);
    p4.setSubpopulationSize(0);
    p5.setSubpopulationSize(0);
}

.....

s8 1000 late() {

    // Create unique output directory for each replicate and seed
    output_dir = path + "/contMigmodel";
    if (!fileExists(output_dir)) {
        createDirectory(output_dir);
    }

    sim.treeSeqOutput(output_dir + "/param_set_" + param_set_id + "_seed_" + seed + "_replicate_" + replicate + "_contMigmodel" + "overlay.trees");

}

My current output in pyslim contains then these populations (for example), where only my current generation contains individuals, as the ancient ones died out:

Number of individuals in population 0: 0
Number of individuals in population 1: 0
Number of individuals in population 2: 0
Number of individuals in population 3: 0
Number of individuals in population 4: 0
Number of individuals in population 5: 0
Number of individuals in population 6: 0
Number of individuals in population 7: 0
Number of individuals in population 8: 0
Number of individuals in population 9: 0
Number of individuals in population 10: 0
Number of individuals in population 11: 0
Number of individuals in population 12: 0
Number of individuals in population 13: 0
Number of individuals in population 14: 0
Number of individuals in population 15: 0
Number of individuals in population 16: 1438
Number of individuals in population 17: 1438
Number of individuals in population 18: 1438
Number of individuals in population 19: 1438
Number of individuals in population 20: 1438
Number of individuals in population 21: 7683
Number of individuals in population 22: 163
Number of individuals in population 23: 163
Number of individuals in population 24: 163
Number of individuals in population 25: 163
Number of individuals in population 26: 163
Number of individuals in population 27: 6652

Now my question would be how the recapitation step looks like. Currently I have a code like this to simulate the ancient metapopulation (p1-p5) with their given migration rates:

import msprime
import tskit
import pyslim

orig_ts = tskit.load("./param_set_10000_seed_1179967342_replicate_1_contMigmodeloverlay.trees")
demography = msprime.Demography.from_tree_sequence(orig_ts)
for pop in demography.populations:
    # must set their effective population sizes
    pop.initial_size = 3831

demography.add_migration_rate_change(time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source="p1", dest="p2")
demography.add_migration_rate_change(time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source="p2", dest="p1")
demography.add_migration_rate_change(
        time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source="p2", dest="p3"
    )
demography.add_migration_rate_change(
        time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source="p3", dest="p2"
    )
demography.add_migration_rate_change(
        time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source="p3", dest="p4"
    )
demography.add_migration_rate_change(
        time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source="p4", dest="p3"
    )
demography.add_migration_rate_change(
        time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source="p4", dest="p5"
    )
demography.add_migration_rate_change(
        time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source="p5", dest="p4"
    )
demography.add_migration_rate_change(
        time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source="p5", dest="p1"
    )
demography.add_migration_rate_change(
        time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source="p1", dest="p5"
    )

rts = pyslim.recapitate(
        orig_ts, demography=demography,
        recombination_rate=5.56e-8,
        random_seed=4
)

orig_max_roots = max(t.num_roots for t in orig_ts.trees())
recap_max_roots = max(t.num_roots for t in rts.trees())
print(f"Maximum number of roots before recapitation: {orig_max_roots}\n"
      f"After recapitation: {recap_max_roots}")

Error: KeyError: "Population with name 'p1' not found"

But this is giving me an error, as the ancient population is not existing anymore. So my question is how recapitation looks like when the tick 1 generation in slim died out during the SLiM simulation?

bhaller commented 4 months ago

Hi! I'll let @petrelharp tackle the main question, but for my part, I'll just note that the use of community.rescheduleScriptBlock() is probably no longer necessary. For example, if you have a block:

s1 10000 early() {
   ...
}

and then reschedule it with:

community.rescheduleScriptBlock(s1, start=start_splitAnc, end=start_splitAnc);

you can now just do:

start_splitAnc early() {
   ...
}

as long as the constant start_splitAnc exists by the end of initialize(). So that ought to simplify your script a bit; but it has nothing to do with your actual question, I realize. :->

petrelharp commented 4 months ago

Hm - what you write sounds correct. Here's my suggestions:

  1. find out what the population names actually are in the .trees file. You can do this by running in the shell

    tskit populations <name of .trees file>

    or, in python

    for pop in orig_ts.populations():
    print(pop)

    Perhaps that'll tell you what's going on. If not,

  2. Make a minimal working example: a script, as simple as possible (so, just two ancestral populations! etcetera) that does what you want to do here. If that doesn't work and you don't see why, post it here. If it does, try to see what's different to your script.

Luker121 commented 3 months ago

Hi, thanks a lot for the helpful suggestions. When running tskit populations <treefile> I got this result:

id  metadata
0   None
1   None
2   None
3   None
4   None
5   None
6   {'name': 'p6'}
7   {'name': 'p7'}
8   {'name': 'p8'}
9   {'name': 'p9'}
10  {'name': 'p10'}
11  {'name': 'p11'}
12  {'name': 'p12'}
13  {'name': 'p13'}
14  {'name': 'p14'}
15  {'name': 'p15'}
16  {'migration_records': [{'migration_rate': 8.971623161226e-07, 'source_subpop': 17}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 18}, {'migration_rate': 3.58901533768985e-09, 'source_subpop': 21}], 'name': 'p16', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 16}
17  {'migration_records': [{'migration_rate': 8.971623161226e-07, 'source_subpop': 16}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 18}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 19}], 'name': 'p17', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 17}
18  {'migration_records': [{'migration_rate': 8.971623161226e-07, 'source_subpop': 16}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 17}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 19}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 20}], 'name': 'p18', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 18}
19  {'migration_records': [{'migration_rate': 8.971623161226e-07, 'source_subpop': 17}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 18}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 20}], 'name': 'p19', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 19}
20  {'migration_records': [{'migration_rate': 8.971623161226e-07, 'source_subpop': 18}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 19}], 'name': 'p20', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 20}
21  {'migration_records': [{'migration_rate': 3.58901533768985e-09, 'source_subpop': 16}, {'migration_rate': 1.88094866844212e-09, 'source_subpop': 27}], 'name': 'p21', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 21}
22  {'migration_records': [{'migration_rate': 8.971623161226e-07, 'source_subpop': 23}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 24}, {'migration_rate': 3.55621586337778e-08, 'source_subpop': 27}], 'name': 'p22', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 22}
23  {'migration_records': [{'migration_rate': 8.971623161226e-07, 'source_subpop': 22}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 24}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 25}], 'name': 'p23', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 23}
24  {'migration_records': [{'migration_rate': 8.971623161226e-07, 'source_subpop': 22}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 23}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 25}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 26}], 'name': 'p24', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 24}
25  {'migration_records': [{'migration_rate': 8.971623161226e-07, 'source_subpop': 23}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 24}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 26}], 'name': 'p25', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 25}
26  {'migration_records': [{'migration_rate': 8.971623161226e-07, 'source_subpop': 24}, {'migration_rate': 8.971623161226e-07, 'source_subpop': 25}], 'name': 'p26', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 26}
27  {'migration_records': [{'migration_rate': 1.88094866844212e-09, 'source_subpop': 21}, {'migration_rate': 3.55621586337778e-08, 'source_subpop': 22}], 'name': 'p27', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 27}

And actually it is kind of a surprising result as populations 1-5 are apparently not present within the .trees file:

1   None
2   None
3   None
4   None
5   None

Therefore, I believe there is some issue with the SLiM script where only populations 6-27 are outputted to the .trees file, despite populations 1-5 being added at the beginning and confirmed in SLiMgui. Within the SLiM script I let populations 1-15 reach a size of zero, indicating extinction. However, the initial populations (1-5) are not recorded in the output file. I have repeatedly checked the script and the SLiMgui runs but cannot determine the cause of this issue. @bhaller, could you provide some insight into why this might be happening? Below is the SLiM script for reference:

initialize() {

    if(exists("slimgui")){
        seed = getSeed();
            setSeed(seed);
        setwd("/path/");
        path = getwd();
        defineGlobal("path", path);
        defineGlobal("seed", seed);
        defineGlobal("replicate", 1);
        defineConstant("NAnc", 17147);
        defineConstant("mMeta", 6.58977e-10);
        defineConstant("SelfRate", 0.9);
        defineConstant("mAncNS", 0);
    defineConstant("NAncS", 4110);
    defineConstant("NAncN", 2091);
    defineConstant("mS", 0.000172734);
    defineConstant("NsagSym", 5600);
    defineConstant("NsagAllo", 264);
    defineConstant("mN", 2.07402e-11);
    defineConstant("NNemoSym", 3666);
    defineConstant("NNemoAllo", 3305);

        // ----- TIME PARAMETERS -----

    defineConstant("start_splitAnc", 10); 
    defineConstant("today_delay", 581);
    defineConstant("Split_Sagit_delay", 74134);
    defineConstant("Split_Sagit_gen", start_splitAnc + Split_Sagit_delay - today_delay);
    defineConstant("Split_Nemo_delay", 72915);
    defineConstant("Split_Nemo_gen", start_splitAnc + Split_Nemo_delay - today_delay);

    if (Split_Nemo_gen >= Split_Sagit_gen) {
       defineConstant("today_gen", Split_Nemo_gen + today_delay);
    }

    else {
       defineConstant("today_gen", Split_Sagit_gen + today_delay);
        }

    } else {

        //setwd("/Users/luki/Dropbox/phd/SLIM/");
        setwd("/data/home/students/l.metzger/Documents/Arabis/SLIM/simulation_output/");
        path = getwd();
        defineGlobal("path", path);
        setSeed(seed);
        defineConstant("scaling_factor", 10);

        // ----- TIME PARAMETERS -----

    defineConstant("start_splitAnc", 10);
    defineConstant("Split_Sagit_gen", start_splitAnc + Split_Sagit_delay - today_delay);
    defineConstant("Split_Nemo_gen", start_splitAnc + Split_Nemo_delay - today_delay);

    if (Split_Nemo_gen >= Split_Sagit_gen) {
       defineConstant("today_gen", Split_Nemo_gen + today_delay);
    }

    else {
       defineConstant("today_gen", Split_Sagit_gen + today_delay);
        }

    } // end defining constants

    // ----- Ancient meta pops -----
    defineConstant("COUNTAnc", 5);
        defineConstant("start", clock());

    // ----- TREE-SEQUENCE RECORDING ----- 
    initializeTreeSeq(); 

    // set the overall mutation rate map
    initializeMutationRate(0);

    initializeMutationType("m1", 0.5, "f", 0.0); // Neutral mutations
    initializeGenomicElementType("g1", m1, 1.0);
    initializeGenomicElement(g1, 0, 1000000);

    //constant recombination along the chromosome:
    initializeRecombinationRate((5.56*1e-8)*10);
}

1 early() {
    // Create the ancestral population

    //set metapop ancestral    
    for (i in 1:COUNTAnc)
        sim.addSubpop(i, NAnc);

    p1.setMigrationRates(c(p2,p4,p5), c(mMeta,mMeta,mMeta));
    p2.setMigrationRates(c(p1, p3), c(mMeta, mMeta));
    p3.setMigrationRates(c(p2, p4), c(mMeta, mMeta));
    p4.setMigrationRates(c(p3, p5), c(mMeta, mMeta));
    p5.setMigrationRates(c(p3,p4,p1), c(mMeta, mMeta, mMeta));

    // add selfing rate 
    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);
}

start_splitAnc early() {
    //split nemoAnc and sagitAnc from the ancestral pop
    cat("split nemoAnc and sagitAnc" + "\n");
        cat("Current generation: " + sim.cycle + "\n");

    sim.addSubpopSplit("p6", NAncS, p5);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p11", NAncN, p5);

    p6.setMigrationRates(p11, mAncNS);
    p11.setMigrationRates(p6, mAncNS);

    //split one pop from the ancestral pop

    sim.addSubpopSplit("p7", NAncS, p6);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p12", NAncN, p11);

    p7.setMigrationRates(p6, mMeta);
    p6.setMigrationRates(p7, mMeta);

    p12.setMigrationRates(p11, mMeta);
    p11.setMigrationRates(p12, mMeta);

    //split one pop from the ancestral pop

    sim.addSubpopSplit("p8", NAncS, p7);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p13", NAncN, p12);

    p7.setMigrationRates(p8, mMeta);
    p8.setMigrationRates(p7, mMeta);

    p12.setMigrationRates(p13, mMeta);
    p13.setMigrationRates(p12, mMeta);

    //split one pop from the ancestral pop

    sim.addSubpopSplit("p9", NAncS, p8);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p14", NAncN, p13);

    p9.setMigrationRates(p8, mMeta);
    p8.setMigrationRates(p9, mMeta);

    p14.setMigrationRates(p13, mMeta);
    p13.setMigrationRates(p14, mMeta);

    //split one pop from the ancestral pop

    sim.addSubpopSplit("p10", NAncS, p9);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p15", NAncN, p14);

    p9.setMigrationRates(p10, mMeta);
    p10.setMigrationRates(p9, mMeta);

    p14.setMigrationRates(p15, mMeta);
    p15.setMigrationRates(p14, mMeta);

    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);

    p6.setMigrationRates(p8, mMeta);
    p8.setMigrationRates(p6, mMeta);
    p7.setMigrationRates(p9, mMeta);
    p9.setMigrationRates(p7, mMeta);
    p10.setMigrationRates(p8, mMeta);
    p8.setMigrationRates(p10, mMeta);

    p11.setMigrationRates(p13, mMeta);
    p13.setMigrationRates(p11, mMeta);
    p12.setMigrationRates(p14, mMeta);
    p14.setMigrationRates(p12, mMeta);
    p13.setMigrationRates(p15, mMeta);
    p15.setMigrationRates(p13, mMeta);

}

// ---- EXTINCTION ANCESTRAL POP ----

start_splitAnc+1 early() {
    //remove ancestral pop
    cat("Extinction ancestral pop" + "\n");
    p1.setSubpopulationSize(0);
    p2.setSubpopulationSize(0);
    p3.setSubpopulationSize(0);
    p4.setSubpopulationSize(0);
    p5.setSubpopulationSize(0);
}

Split_Sagit_gen early() {

    //split from the ancestral pop
    sim.addSubpopSplit("p16", NsagAllo, p6);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p21", NsagSym, p6);
    p16.setMigrationRates(p21, mS);
    p21.setMigrationRates(p16, mS);

    //split one pop from the ancestral pop
    sim.addSubpopSplit("p17", NsagAllo, p16);
    p17.setMigrationRates(p16, mMeta);
    p16.setMigrationRates(p17, mMeta);

    //split one pop from the ancestral pop
    sim.addSubpopSplit("p18", NsagAllo, p17);

    p17.setMigrationRates(p18, mMeta);
    p18.setMigrationRates(p17, mMeta);

    //split one pop from the ancestral pop

    sim.addSubpopSplit("p19", NsagAllo, p18);
    p19.setMigrationRates(p18, mMeta);
    p18.setMigrationRates(p19, mMeta);

    //split one pop from the ancestral pop
    sim.addSubpopSplit("p20", NsagAllo, p19);

    p19.setMigrationRates(p20, mMeta);
    p20.setMigrationRates(p19, mMeta);

    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);

    p16.setMigrationRates(p18, mMeta);
    p18.setMigrationRates(p16, mMeta);
    p17.setMigrationRates(p19, mMeta);
    p19.setMigrationRates(p17, mMeta);
    p20.setMigrationRates(p18, mMeta);
    p18.setMigrationRates(p20, mMeta);

}

Split_Sagit_gen+1 early() {
    p6.setSubpopulationSize(0);
    p7.setSubpopulationSize(0);
    p8.setSubpopulationSize(0);
    p9.setSubpopulationSize(0);
    p10.setSubpopulationSize(0);

}

Split_Nemo_gen early() {

    //split from the ancestral pop
    sim.addSubpopSplit("p22", NNemoAllo, p11);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p27", NNemoSym, p11);
    p27.setMigrationRates(p22, mN);
    p22.setMigrationRates(p27, mN);

    //split one pop from the ancestral pop

    sim.addSubpopSplit("p23", NNemoAllo, p22);
    p22.setMigrationRates(p23, mMeta);
    p23.setMigrationRates(p22, mMeta);

    //split one pop from the ancestral pop

    sim.addSubpopSplit("p24", NNemoAllo, p23);
    p24.setMigrationRates(p23, mMeta);
    p23.setMigrationRates(p24, mMeta);

    //split one pop from the ancestral pop

    sim.addSubpopSplit("p25", NNemoAllo, p24);
    p25.setMigrationRates(p24, mMeta);
    p24.setMigrationRates(p25, mMeta);

    //split one pop from the ancestral pop

    sim.addSubpopSplit("p26", NNemoAllo, p25);
    p25.setMigrationRates(p26, mMeta);
    p26.setMigrationRates(p25, mMeta);

    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);

    p22.setMigrationRates(p24, mMeta);
    p24.setMigrationRates(p22, mMeta);
    p23.setMigrationRates(p25, mMeta);
    p25.setMigrationRates(p23, mMeta);
    p24.setMigrationRates(p26, mMeta);
    p26.setMigrationRates(p24, mMeta);

}

Split_Nemo_gen+1 early() {
    p11.setSubpopulationSize(0);
    p12.setSubpopulationSize(0);
    p13.setSubpopulationSize(0);
    p14.setSubpopulationSize(0);
    p15.setSubpopulationSize(0);
}

today_gen late() {

    // Create unique output directory for each replicate and seed
    output_dir = path + "/noMigmodel";
    if (!fileExists(output_dir)) {
        createDirectory(output_dir);
    }

    cat("Elapsed: " + (clock() - start));
    sim.treeSeqOutput(output_dir + "/param_set_" + param_set_id + "_seed_" + seed + "_replicate_" + replicate + "_noMigmodel" + "overlay.trees");

}
bhaller commented 3 months ago

I'm not sure offhand, perhaps @petrelharp has a better handle on this stuff?

Luker121 commented 3 months ago

ok, thanks a lot. Even when simplifying the model completely by only using a subset, then, again, the first population (p1) does not get transferred into the .trees file:

1 early() {
    // Create the ancestral population

    //set metapop ancestral    
    sim.addSubpop("p1", NAnc);

    // add selfing rate 
    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);
}

start_splitAnc early() {

    sim.addSubpopSplit("p2", NAncS, p1);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p3", NAncN, p1);

    p2.setMigrationRates(p3, mAncNS);
    p3.setMigrationRates(p2, mAncNS);

    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);

}

// ---- EXTINCTION ANCESTRAL POP ----

start_splitAnc+1 early() {
    //remove ancestral pop
    p1.setSubpopulationSize(0);
}

Split_Sagit_gen early() {

    //split from the ancestral pop
    sim.addSubpopSplit("p4", NsagAllo, p2);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p5", NsagSym, p2);

    p4.setMigrationRates(p5, mS);
    p5.setMigrationRates(p4, mS);

    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);

}

Split_Nemo_gen early() {

    //split from the ancestral pop
    sim.addSubpopSplit("p6", NNemoAllo, p3);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p7", NNemoSym, p3);

    p6.setMigrationRates(p7, mN);
    p7.setMigrationRates(p6, mN);

    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);
}

start_recGeneFlow early() {
    // Set migration rates for sympatric
    p2.setSubpopulationSize(0);
    p3.setSubpopulationSize(0);

}

today_gen late() {

    // Create unique output directory for each replicate and seed
    output_dir = path + "/recentMigmodel_onepop";
    if (!fileExists(output_dir)) {
        createDirectory(output_dir);
    }

    cat("Elapsed: " + (clock() - start));
    sim.treeSeqOutput(output_dir + "/param_set_" + param_set_id + "_seed_" + seed + "_replicate_" + replicate + "_onepop" + "_recentMigmodel" + "overlay.trees");

}
tskit populations file.trees
id  metadata
0   None
1   None
2   {'name': 'p2'}
3   {'name': 'p3'}
4   {'migration_records': [{'migration_rate': 5.67404280755485e-05, 'source_subpop': 5}], 'name': 'p4', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 4}
5   {'migration_records': [{'migration_rate': 5.67404280755485e-05, 'source_subpop': 4}, {'migration_rate': 6.8730267065557e-06, 'source_subpop': 7}], 'name': 'p5', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 5}
6   {'migration_records': [{'migration_rate': 5.00202864110265e-07, 'source_subpop': 7}], 'name': 'p6', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 6}
7   {'migration_records': [{'migration_rate': 6.8730267065557e-06, 'source_subpop': 5}, {'migration_rate': 5.00202864110265e-07, 'source_subpop': 6}], 'name': 'p7', 'selfing_fraction': 0.9, 'sex_ratio': 0.0, 'slim_id': 7}
petrelharp commented 3 months ago

Well, a population won't be put into the .trees file if there's no individuals in it, so my guess is that you've got no individuals in those populations at the end of the simulation. If that's not true, could you post a version of your script that I can run, so I can investigate? (The script should be self-contained, so I can just hit "run"; the script above is missing the initialize() block; pasting in the block from above runs into errors with setwd(); removing those turns up some other errors - I stopped fixing them after the fifth one.)

bhaller commented 3 months ago

Well, a population won't be put into the .trees file if there's no individuals in it, so my guess is that you've got no individuals in those populations at the end of the simulation. If that's not true, could you post a version of your script that I can run, so I can investigate? (The script should be self-contained, so I can just hit "run"; the script above is missing the initialize() block; pasting in the block from above runs into errors with setwd(); removing those turns up some other errors - I stopped fixing them after the fifth one.)

@petrelharp Given that SLiM keeps the first-generation individuals, if the subpop used to have individuals in it, should it then be present in the population table? Or are the first-gen individuals not kept by SLiM if the subpop doesn't exist at the end of the run? Or what? I'm not sure of this myself, which is why I haven't been replying very usefully to this thread. :->

Luker121 commented 3 months ago

Hi thanks for the fast reply. And I am sorry for not providing a useful script. Actually, yes, the p1 population does not have individuals at the end of the simulation, but population p2 and p3 neither have. But they are transferred and metnioned within the .trees file. In any case, my question would be, how to recapitate my script when I don't have individuals for my first generation? Should I just leave them "alive"? This script should do it now:

initialize() {

    if(exists("slimgui")){
        seed = getSeed();
       setSeed(seed);
        setwd("./");
        path = getwd();
        defineGlobal("path", path);
        defineConstant("seed", seed);
        defineConstant("replicate", 1);
                defineConstant("param_set_id", 1);
        defineConstant("NAnc", 17147);
        defineConstant("mMeta", 6.58977e-10);
        defineConstant("SelfRate", 0.9);
        defineConstant("mAncNS", 0);
    defineConstant("NAncS", 4110);
    defineConstant("NAncN", 2091);
    defineConstant("mS", 0.000172734);
    defineConstant("NsagSym", 5600);
    defineConstant("NsagAllo", 264);
    defineConstant("mN", 2.07402e-11);
    defineConstant("NNemoSym", 3666);
    defineConstant("NNemoAllo", 3305);

        // ----- TIME PARAMETERS -----

    defineConstant("start_splitAnc", 10); 
    defineConstant("today_delay", 581);
    defineConstant("Split_Sagit_delay", 74134);
    defineConstant("Split_Sagit_gen", start_splitAnc + Split_Sagit_delay - today_delay);
    defineConstant("Split_Nemo_delay", 72915);
    defineConstant("Split_Nemo_gen", start_splitAnc + Split_Nemo_delay - today_delay);

    if (Split_Nemo_gen >= Split_Sagit_gen) {
       defineConstant("today_gen", Split_Nemo_gen + today_delay);
       defineConstant("start_recGeneFlow", Split_Nemo_gen + 1);
    }

    else {
       defineConstant("today_gen", Split_Sagit_gen + today_delay);
       defineConstant("start_recGeneFlow", Split_Sagit_gen + 1);
        }

    } else {

        path = getwd();
        defineGlobal("path", path);
        setSeed(seed);
        defineConstant("scaling_factor", 10);

        // ----- TIME PARAMETERS -----

    defineConstant("start_splitAnc", 10);
    defineConstant("Split_Sagit_gen", start_splitAnc + Split_Sagit_delay - today_delay);
    defineConstant("Split_Nemo_gen", start_splitAnc + Split_Nemo_delay - today_delay);

    if (Split_Nemo_gen >= Split_Sagit_gen) {
       defineConstant("today_gen", Split_Nemo_gen + today_delay);
       defineConstant("start_recGeneFlow", Split_Nemo_gen + 1);
    }

    else {
       defineConstant("today_gen", Split_Sagit_gen + today_delay);
       defineConstant("start_recGeneFlow", Split_Sagit_gen + 1);
        }

    } // end defining constants

    // ----- Ancient meta pops -----
    defineConstant("COUNTAnc", 5);
        defineConstant("start", clock());

    // ----- TREE-SEQUENCE RECORDING ----- 
    initializeTreeSeq(); 

    // set the overall mutation rate map
    initializeMutationRate(0);

    initializeMutationType("m1", 0.5, "f", 0.0); // Neutral mutations
    initializeGenomicElementType("g1", m1, 1.0);
    initializeGenomicElement(g1, 0, 1000000);

    //constant recombination along the chromosome:
    initializeRecombinationRate((5.56*1e-8)*10);
}

1 early() {
    // Create the ancestral population

    //set metapop ancestral    
    sim.addSubpop("p1", NAnc);

    // add selfing rate 
    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);
}

start_splitAnc early() {

    sim.addSubpopSplit("p2", NAncS, p1);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p3", NAncN, p1);

    p2.setMigrationRates(p3, mAncNS);
    p3.setMigrationRates(p2, mAncNS);

    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);

}

// ---- EXTINCTION ANCESTRAL POP ----

start_splitAnc+1 early() {
    //remove ancestral pop
    p1.setSubpopulationSize(0);
}

Split_Sagit_gen early() {

    //split from the ancestral pop
    sim.addSubpopSplit("p4", NsagAllo, p2);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p5", NsagSym, p2);

    p4.setMigrationRates(p5, mS);
    p5.setMigrationRates(p4, mS);

    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);

}

Split_Nemo_gen early() {

    //split from the ancestral pop
    sim.addSubpopSplit("p6", NNemoAllo, p3);

    //split another pop from the ancestral pop
    sim.addSubpopSplit("p7", NNemoSym, p3);

    p6.setMigrationRates(p7, mN);
    p7.setMigrationRates(p6, mN);

    for (sp in sim.subpopulations)
        sp.setSelfingRate(SelfRate);
}

start_recGeneFlow early() {
    // Set migration rates for sympatric
    p2.setSubpopulationSize(0);
    p3.setSubpopulationSize(0);

}

today_gen late() {

    // Create unique output directory for each replicate and seed
    output_dir = path + "/recentMigmodel_onepop";
    if (!fileExists(output_dir)) {
        createDirectory(output_dir);
    }

    cat("Elapsed: " + (clock() - start));
    sim.treeSeqOutput(output_dir + "/param_set_" + param_set_id + "_seed_" + seed + "_replicate_" + replicate + "_onepop" + "_recentMigmodel" + "overlay.trees");

}
petrelharp commented 3 months ago

Okay - something odd is going on that makes the name of that population dissappear from population metadata. However, there's good news for you, @Luker121 - although there is more than one population in the tree sequence, since you began your SLiM simulation with only a single population (p1), you don't need to mess around with any of that (if I understand what you're doing, anyhow).

So - from your SLiM script I'm guessing that you actually only want a history-before-SLiM to be just a single population, which is p1 - is that right? If so, your original recapitation script should work like this:

demography = msprime.Demography.from_tree_sequence(orig_ts)
for pop in demography.populations:
    # must set their effective population sizes
    pop.initial_size = 3831

rts = pyslim.recapitate(
        orig_ts, demography=demography,
        recombination_rate=5.56e-8,
        random_seed=4
)

Or, if you actually do want migration between some populations back then, you can do it - just use poulation indexes instead of names - so, for instance,

demography.add_migration_rate_change(
        time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source=1, dest=2
    )

instead of

demography.add_migration_rate_change(
        time=orig_ts.metadata['SLiM']['tick'], rate=1.21635e-07, source="p1", dest="p2"
    )

However, note that all these other populations you've added didn't exist at the start of the SLiM simulation, so it's a bit weird to recapitate using them.

I'll follow up in another issue about the weirdness here.

Luker121 commented 3 months ago

Ok, thanks a lot for the investigation. Indeed it seems a little bit weird. And your suggestion by using indices instead of names is exactly what I was looking for, as I have several populations in my original code. Thanks a lot :).

petrelharp commented 3 months ago

Great! I'll close this, but please re-open if it's not done. (And, the underlying issue here will be dealt with at https://github.com/MesserLab/SLiM/issues/447 .)