MesserLab / SLiM

SLiM is a genetically explicit forward simulation software package for population genetics and evolutionary biology. It is highly flexible, with a built-in scripting language, and has a cross-platform graphical modeling environment called SLiMgui.
https://messerlab.org/slim/
GNU General Public License v3.0
161 stars 33 forks source link

(TSK_ERR_MUTATION_TIME_YOUNGER_THAN_NODE) #473

Open dylandebaun opened 1 month ago

dylandebaun commented 1 month ago

Hi Ben! Reposting this from the slim-discuss group. Thanks for helping out! I tried on Slim 3.7 (when it was written) and updated the script for the newest version I believe but I get the bug for both versions. I may not have done the update correctly however. Best, Dylan xiphophorus_birchmanni_10x_12Sep2018_yDAA6-ScyDAA6-1196-HRSCAF-1406_coding_nonoverlapping.tsv.txt schumer.slim.txt SLiM-ready-recmap-ScyDAA6-1196-HRSCAF-1406.tsv.txt

Hello,

I'm trying to run the following code from Schumer et al., 2020 (modified for debugging). It fails at a generation 646 (seemingly randomly) with the following error: tsk_table_collection_deduplicate_sites: A mutation's time must be >= the node time, or be marked as 'unknown'. (TSK_ERR_MUTATION_TIME_YOUNGER_THAN_NODE) schumer.slim.txt

I'm not exactly sure what could be happening. Thank you in advance for any assistance!

Best, Dylan

Output: // Starting run at tick :

1

"done1"

"done2"

600

"Mutation m1 at position 19117295 was added at time 600"

"Mutation m1 at position 23743212 was added at time 600"

"Mutation m1 at position 10785214 was added at time 600"

"Mutation m2 at position 31064592 was added at time 600"

"Mutation m2 at position 31064593 was added at time 600"

"Mutation m2 at position 31064594 was added at time 600"

"Mutation at position 19117295 was added at time 600"

"Mutation at position 23743212 was added at time 600"

"Mutation at position 10785214 was added at time 600"

"Mutation at position 31064592 was added at time 600"

"Mutation at position 31064593 was added at time 600"

"Mutation at position 31064594 was added at time 600"

600

601

602

603

604

605

606

607

608

609

610

611

612

613

614

615

616

617

618

619

620

621

622

623

624

625

626

627

628

629

630

631

632

633

634

635

636

637

638

639

640

641

642

643

644

645

646

bhaller commented 1 month ago

Hi @dylandebaun. Interesting issue. @petrelharp, bringing you in on this as well. In a debug build of SLiM, SLiM notices the problem before tskit does, in a log:

Species::RecordNewDerivedState(): invalid derived state recorded in tick 600 genome 34412 id 160000 with time -600 >= -600

emitted by Species::RecordNewDerivedState(). tskit notices the problem the next time we check the integrity of the tree sequence. The precision printed by that log above is not helpful, but the debugger sheds light on the problem:

(lldb) p time
(double) -600.00000999999997
(lldb) p tables_.nodes.time[genomeTSKID]
(double) -600

The variable time is computed in Species::RecordNewDerivedState() as:

double time = -(double) (community_.tree_seq_tick_ + community_.tree_seq_tick_offset_);

and indeed, here are the values it is based upon:

(lldb) p community_.tree_seq_tick_
(slim_tick_t) 600
(lldb) p community_.tree_seq_tick_offset_
(double) 0.000010000000000000001

So we have a small tick offset, and because of that tick offset, there is a timing error in the derived state being recorded. As documented in Population::AddSubpopulationSplit(), the tick offset is there to make the timing of the tree sequence work when subpops are split off:

        // OK, so, this is gross.  The user can call addSubpopSplit(), and the new individuals created as a side effect need to
        // arrive in the tree-seq tables stamped with a later time than their parents, but as far as SLiM is concerned they
        // were created at the same time as the parents; they exist at the same point in time, and it's WF, therefore they
        // were created at the same time, Q.E.D.  So to get the tree-seq code not to object, we add a small offset to the
        // tick counter.  Since addSubpopSplit() could be called multiple times in sequence, splitting a new subpop off
        // from each previous one in a linear fashion, each call to addSubpopSplit() needs to increase this offset slightly.
        // The offset is reset to zero each time the tree sequence's tick counter advances.  This is similar to the timing
        // problem that made us create tree_seq_tick_ in the first place, due to children arriving in the same SLiM tick as
        // their parents, but at least that only happens once per tick in a predictable fashion.

So there has been a call to addSubpopSplit(), which pushed tree_seq_tick_offset_ out a bit so that the split subpopulation arrives later in time (600.00001) than its source subpopulation (600.0)... and then there is a call to addNewMutation(), which I guess is adding a new mutation (or maybe several) to an individual that is in the source subpopulation. So that's illegal; you can't add a mutation to an individual after the point in time when it's born. Normally, that gives a better error message, like:

(Genome_Class::ExecuteMethod_addNewMutation): addNewDrawnMutation() cannot add mutations to individuals of age > 0 when tree-sequence recording is enabled, to prevent internal inconsistencies.

You can see that error in this simple example model, for example:

initialize() { initializeSLiMModelType("nonWF"); initializeTreeSeq(); defineConstant("K", 500); initializeMutationType("m1", 0.5, "f", 0.0); initializeMutationType("m2", 0.5, "f", 0.001); initializeGenomicElementType("g1", m1, 1.0); initializeGenomicElement(g1, 0, 99999); initializeMutationRate(1e-7); initializeRecombinationRate(1e-8); } reproduction() { subpop.addCrossed(individual, subpop.sampleIndividuals(1)); } 1 early() { sim.addSubpop("p1", 1000); } early() { p1.fitnessScaling = K / p1.individualCount; } 100 late() { target = sample(p1.genomes, 100); target.addNewDrawnMutation(m2, 1000); } 2000 late() { }

In tick 100 it draws a random sample of genomes and tries to add a mutation to them, and boom, you get the error since some of those genomes are not newborn. The reason for this restriction is clear: if an individual is no longer newborn, it could have descendants, and adding a mutation to an individual after it has already had descendants would imply that the descendants would also have that mutation... but they don't. SLiM has already created the descendants, and they did not inherit that mutation from their parent, because the mutation was not there. Rather than attempt to retroactively fix up all the mutational state for everybody (an impossible task, in general case, since a whole ancestry tree descending from the original individual could now exist, with any number of other mutational changes having happened at the same site since, etc.), SLiM throws an error.

The unusual thing here is that this state of affairs happens within a single tick, in a WF model, because of the weirdness of tree_seq_tick_offset_ and addSubpopSplit(). SLiM doesn't catch that error (except for a log when running in a debug build), and so it makes it to tskit, which correctly raises an error but not a very user-informative one.

So, @dylandebaun: the upshot is that the model you're trying to work with appears to have a bug in it, where it tries to change the genetics of an individual after it has already had a descendant (due to splitting off a subpopulation). That leaves the tree sequence in an inconsistent state. Apparently that error was not caught in SLiM 3.7; I guess tskit's internal checks maybe did not test for this at that time. Who knows what problems it might lead to downstream in the analysis; maybe it would be inconsequential, maybe not. To fix the problem, you'd need to either add the mutations in question before the subpop is split off (so that the descendants inherit them correctly), or add the mutations in a later tick (in descendants of the original individual, in other words). You can't split and then add mutations to the parents of the split; it violates the rules.

But there is a SLiM bug here, that the error ought to be caught by SLiM, and a more readable error (the same error as my test model above) ought to be produced.

Thanks for the report @dylandebaun! I hope the above makes sense. :-> Feel free to ask questions if anything remains unclear. @petrelharp do you have any thoughts on this, or shall I proceed as planned?

petrelharp commented 1 month ago

Makes sense to me! Good sleuthing.