Closed bodkan closed 6 months ago
My first instinct was that I made a mistake in rescheduling script blocks in cases of models which have gene flow running all the way to the very end of the simulation. But this minimal model does not have the issue despite featuring gene flow active until "the present" (time 0, because it uses backward time unit specification to match Isabel's Gutenkunst model):
library(slendr)
init_env()
p1 <- population("p1", time = 1000, N = 100)
p2 <- population("p2", time = 500, N = 100, parent = p1)
gf <- gene_flow(from = p1, to = p2, start = 300, end = 0, rate = 0.1)
model <- compile_model(list(p1, p2), gene_flow = gf, generation_time = 1)
plot_model(model)
... followed by this:
> ts <- slim(model, sequence_length = 1000, recombination_rate = 0, verbose = TRUE)
Tree-sequence recording is on but no sampling schedule was given. Generating one for all individuals surviving to the end of the simulation.
--------------------------------------------------
SLiM command to be executed:
slim \
-d 'SAMPLES="/var/folders/d_/hblb15pd3b94rg0v35920wd80000gn/T//RtmpfmDznJ/file7f096f7ddcd7"' \
-d 'MODEL="/var/folders/d_/hblb15pd3b94rg0v35920wd80000gn/T//RtmpfmDznJ/file7f0942c321c"' \
-d 'OUTPUT_TS="/var/folders/d_/hblb15pd3b94rg0v35920wd80000gn/T//RtmpfmDznJ/file7f092bd4105b.trees"' \
-d SPATIAL=F \
-d SEQUENCE_LENGTH=1000 \
-d RECOMB_RATE=0 \
-d BURNIN_LENGTH=0 \
-d SIMULATION_LENGTH=1000 \
-d 'OUTPUT_LOCATIONS=""' \
-d COALESCENT_ONLY=T \
-d MAX_ATTEMPTS=1 \
/var/folders/d_/hblb15pd3b94rg0v35920wd80000gn/T//RtmpfmDznJ/file7f0942c321c/script.slim 2>&1
--------------------------------------------------
// Initial random seed:
2008322296169
// RunInitializeCallbacks():
SEED: 2008322296169
initializeSLiMOptions(keepPedigrees = T);
initializeTreeSeq();
initializeMutationType(0, 0.5, "f", 0);
initializeGenomicElementType(1, m0, 1);
initializeGenomicElement(g1, 0, 999);
initializeMutationRate(0);
initializeRecombinationRate(0);
// Starting run at tick <start>:
1
Generation 1: starting the simulation
Generation 1: creating p1(p0)
Generation 501: split of p2(p1) from p1(p0)
Generation 701: geneflow p1(p0) -> p2(p1) (0.000333333% over 300 generations)
Generation 1001: geneflow p1(p0) -> p2(p1) set to 0%
Generation 1001: sampling all (100) individuals of p1(p0)
Generation 1001: sampling all (100) individuals of p2(p1)
Generation 1001: saving the tree sequence output to '/var/folders/d_/hblb15pd3b94rg0v35920wd80000gn/T//RtmpfmDznJ/file7f092bd4105b.trees'
Generation 1001: simulation finished
Tree sequence was saved to:
/var/folders/d_/hblb15pd3b94rg0v35920wd80000gn/T//RtmpfmDznJ/file7f092bd4105b.trees
Loading the tree-sequence file...
[Note that the "Generation 1001: geneflow p1(p0) -> p2(p1) set to 0%" event now happens before saving the tree sequence and executing sim.simulationFinished().]
So there's something else I'm missing.
For the time being (and for the purpose of her Master's thesis), @IsabelMarleen implemented the fix described above, grepl()-ing for "simulation finished" anywhere in the log output fetched by the slim() function from the SLiM binary running in the background.
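A minimal sketch of the difference between the two checks, using base R only. The log lines and the function names finished_last() and finished_anywhere() are hypothetical stand-ins for illustration; this is not the actual slendr code.
```r
# Hypothetical log lines, shaped like the verbose output shown in this thread
log_in_order <- c(
  "Generation 1001: geneflow p1(p0) -> p2(p1) set to 0%",
  "Generation 1001: simulation finished"
)
# Same lines, but with the gene-flow message printed after termination
log_out_of_order <- rev(log_in_order)

# Strict check: the final log line must contain the marker
finished_last <- function(log) {
  length(log) > 0 &&
    grepl("simulation finished", log[length(log)], fixed = TRUE)
}

# Relaxed check: the marker may appear on any line of the log
finished_anywhere <- function(log) {
  any(grepl("simulation finished", log, fixed = TRUE))
}
```
With these inputs the strict check accepts only log_in_order, while the relaxed check accepts both orderings, which is why it tolerates the gene-flow messages being printed last.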
Still, it would be good to see exactly where my assumption about script block execution/scheduling is off, by nailing down the smallest possible reproducible example for this.
> Still, it would be good to see exactly where my assumption about script block execution/scheduling is off, by nailing down the smallest possible reproducible example for this.
IIRC, when you reschedule a script block the order that it runs in, within a tick, goes to the end (as if that script block were defined at the end of the file). See section 25.11 for details, I think. So the order in which you reschedule things will determine the order in which they run. Probably that is what is biting you.
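A toy illustration of the ordering rule as recalled in the comment above, written in R. It only models the stated rule (a rescheduled block moves to the end of the within-tick order); it has not been checked against the SLiM manual and is not how SLiM or slendr is implemented. The block ids s2, s3 and s12 are borrowed from the issue description, and reschedule() is a hypothetical helper.
```r
# Toy model only: a character vector stands in for the within-tick execution
# order. Blocks start out in file order.
execution_order <- c("s2", "s3", "s12")

# Assumed rule: rescheduling a block moves it to the end of the order
reschedule <- function(order, id) c(setdiff(order, id), id)

# Termination rescheduled first, gene flow afterwards:
# the gene-flow blocks now run after s12
order_a <- Reduce(reschedule, c("s12", "s2", "s3"), execution_order)

# Gene flow rescheduled first, termination last:
# s12 is again the final block of the tick
order_b <- Reduce(reschedule, c("s2", "s3", "s12"), execution_order)
```
Under this assumed rule, the order of the rescheduling calls alone decides whether the termination block runs before or after the gene-flow blocks.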
Thanks @bhaller! I didn't want to ping you before I found the smallest possible reproducible example, to simplify digging for the root of the problem in my SLiM code... I appreciate the hint in advance!
> when you reschedule a script block the order that it runs in, within a tick, goes to the end (as if that script block were defined at the end of the file). See section 25.11 for details, I think. So the order in which you reschedule things will determine the order in which they run. Probably that is what is biting you.
I suspected something like this could be an issue. I will check the manual carefully later this week (I opened the issue so that @IsabelMarleen is not afraid to post more code in case she finds other, smaller, breaking examples :)).
As I mentioned above, because her models do simulate the correct tree sequence, I don't think this is a breaking bug. It seems to be more of an aesthetics thing (I "like" having the final log output be the "simulation finished"). Then again, when something does look weird or ugly, it can often mean that there is actually a real issue of some kind which could manifest in some other situation...
Always best to understand a bug, even if you then decide not to expend the effort to fix it.
@IsabelMarleen has encountered an interesting curiosity while implementing a slendr importer from Demes. When she imports the Gutenkunst et al. example from the Demes repository, the import successfully generates a slendr model object, but she runs into a problem when she simulates it with our built-in slim() function.
It turned out to be a challenge to produce a minimal reproducible example, but she provided a serialized, compiled slendr model that reproduces this: gutenkunst_slendr_model.zip. Briefly, loading her serialized model first:
It seems that Isabel's demes-to-slendr importing code works as it should (let's ignore the poor plotting of slendr for now):
However, running the model with slim() gives us:
The reason for the error is that I'm currently checking for the success of the simulation by making sure the last line of the slendr/SLiM log output contains "simulation finished". Not the most elaborate way to do this, but it's served me for a year now...
Now, the above is not really a problem per se: after all, the simulation does finish running successfully, and it does produce a valid tree sequence. I could simply check for "simulation finished" anywhere in the log output and not just at the very end.
Still, I'm surprised that the gene-flow callback log outputs are printed after the call to sim.simulationFinished() is executed. Although they are all scheduled to happen at the end of the last generation (well, "tick", but you get my meaning), I assumed they would be executed in top-to-bottom order (gene-flow events are executed in callbacks s2 & s3, termination happens in callback s12 below).