Closed petrelharp closed 1 year ago
Might you have a look at this, @bodkan? Warning: understanding what's going on here requires the painfully-acquired info in this section.
Note: if the roots aren't at the expected time in the first tree, it'll do something like this:
/home/peter/projects/tskit-dev/pyslim/pyslim/methods.py:80: RootTimesMismatchWarning: Not all roots of the provided
tree sequence are at the time expected by recapitate(). This could happen if you've simplified before recapitating,
or if you added new individuals without parents in SLiM during the course of the simulation (e.g., with sim.addSubPop()).
If you wish to suppress this warning, you can use, e.g., warnings.simplefilter('ignore', pyslim.RootTimesMismatchWarning)
Expected root time: 10 Observed root times: [8.0]
warnings.warn(message, RootTimesMismatchWarning)
I'm only checking the first tree since (a) that will always catch the sort of error we have here; and (b) checking the roots of all trees could take a while.
Merging #308 (99b3a40) into main (a47c6fe) will increase coverage by
0.21%
. The diff coverage is100.00%
.
@@ Coverage Diff @@
## main #308 +/- ##
==========================================
+ Coverage 95.43% 95.65% +0.21%
==========================================
Files 6 8 +2
Lines 592 621 +29
Branches 127 131 +4
==========================================
+ Hits 565 594 +29
Misses 18 18
Partials 9 9
Impacted Files | Coverage ฮ | |
---|---|---|
pyslim/methods.py | 97.47% <100.00%> (+0.07%) |
:arrow_up: |
... and 2 files with indirect coverage changes
:mega: Weโre building smart automated test selection to slash your CI/CD build times. Learn more
Here's the info on when exactly this bug happens: format is {type of model}_{init stage}_{save stage}
:
nonWF_early_early.trees: metadata=10 root=9.0
nonWF_early_first.trees: metadata=10 root=8.0
nonWF_early_late.trees: metadata=10 root=9.0
nonWF_first_early.trees: metadata=10 root=10.0
nonWF_first_first.trees: metadata=10 root=9.0
nonWF_first_late.trees: metadata=10 root=10.0
nonWF_late_early.trees: metadata=10 root=9.0
nonWF_late_first.trees: metadata=10 root=8.0
nonWF_late_late.trees: metadata=10 root=9.0
WF_early_early.trees: metadata=10 root=9.0
WF_early_first.trees: metadata=10 root=9.0
WF_early_late.trees: metadata=10 root=10.0
WF_first_early.trees: metadata=10 root=9.0
WF_first_first.trees: metadata=10 root=9.0
WF_first_late.trees: metadata=10 root=10.0
WF_late_early.trees: metadata=10 root=8.0
WF_late_first.trees: metadata=10 root=8.0
WF_late_late.trees: metadata=10 root=9.0
The takeaway is that we seem to only not have the bug in:
In particular, it does affect sims where the init and emit stages were the same.
The takeaway is that we seem to only not have the bug in:
- nonWF/first/{early,late}
- WF/{first,early}/late
In particular, it does affect sims where the init and emit stages were the same.
Ouch. Since first()
events are not used very often (since they are a pretty new feature), this means that probably the bug bites almost everyone with nonWF models, and bites everyone with WF models except init in early()
, save in late()
.
Might you have a look at this, @bodkan?
Sure thing! I'm happy to take a look and re-run some of the slendr-based models and tests (the ones which I later chiseled down to the early.slim
and late.slim
equivalents), just to make sure that
a) what worked before still works (no reason it shouldn't, I would think),
b) the models that returned broken results earlier are fixed now.
Warning: understanding what's going on here requires the painfully-acquired info in this section.
I actually think it's about time I finally read what is actually going on at the lower levels. ๐ Thanks for the link.
Also: Wow, that was extremely fast. I guess once we verified that the early
and late
versions do indeed give different results you probably knew quite well where to look, but still -- wow! ๐
Another reason we can't just figure out what the time is supposed to be based on metadata (even if the information was in there) - if it's produced by making and annotatingg an msprime tree sequence, it probably won't have the offset we'd expect.
Notes on when/how this should affect people and how they can tell:
I changed this to an error, since if the time is wrong it will create a bottleneck of size 1, which is certainly not what anyone wants.
This feels a little hacky, but I think it might be the best thing it's possible to do, actually, besides requiring that people always initialize and write out (and Remember) in the same stage.