tskit-dev / tskit

Population-scale genomics
MIT License

ibd_segments crashes sometimes #1739

Closed (gtsambos closed 3 years ago)

gtsambos commented 3 years ago

I got this error message:

Python(86357,0x1171a15c0) malloc: Incorrect checksum for freed object 0x7fa5a028a800: probably modified after being freed.
Corrupt value: 0x41117b6000000000
Python(86357,0x1171a15c0) malloc: *** set a breakpoint in malloc_error_break to debug
Abort trap: 6

when I ran the following code:

import msprime  # ts, min_length and max_time come from earlier in the script (not shown)

ts = msprime.sim_mutations(ts, rate=1.65e-8)
ibd = ts.find_ibd(min_length=min_length, max_time=max_time)

This code used to work fine, so I assume the recent changes introduced the problem. I'm using the development version of tskit at this commit.

gtsambos commented 3 years ago

Hmm, actually, this doesn't seem to be just a mutation-related issue. I'm seeing it on tree sequences without mutations now too. It appears to happen more frequently in tree sequences with more samples.

gtsambos commented 3 years ago

Bug detected in lib/tskit/tables.c at line 7438. If you are using tskit directly please open an issue on GitHub, ideally with a reproducible example. (https://github.com/tskit-dev/tskit/issues) If you are using software that uses tskit, please report an issue to that software's issue tracker, at least initially.
Abort trap: 6

gtsambos commented 3 years ago

^I'm occasionally getting this bug too. Might be a different problem though.

gtsambos commented 3 years ago

Here's a reproducible example:

>>> import tskit, msprime
>>> tskit.__version__
'0.3.8.dev1'
>>> msprime.__version__
'1.0.1'
>>> ts = msprime.sim_ancestry(samples=100, sequence_length=1e7,
...  discrete_genome=True, recombination_rate=1e-8, population_size=500,
...  model='dtwf', random_seed=2022)
>>> ts.find_ibd(min_length=100000) 
Python(91745,0x11b6f55c0) malloc: Incorrect checksum for freed object 0x7fdc0e1ace00: probably modified after being freed.
Corrupt value: 0x4110677c00000000
Python(91745,0x11b6f55c0) malloc: *** set a breakpoint in malloc_error_break to debug
Abort trap: 6
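Because the failure above aborts the whole interpreter, it can help to run the reproduction in a child process so the calling session survives and can record the outcome. The following is only a minimal sketch, not part of the original report: `REPRO` repeats the example above verbatim, while `run_isolated` and `describe` are hypothetical helper names introduced here. It assumes msprime and the affected tskit development version are installed in the same environment.

```python
import subprocess
import sys

# Reproduction script copied from the example above. It is only a string
# here; nothing from msprime or tskit is imported by this module itself.
REPRO = """
import msprime
ts = msprime.sim_ancestry(samples=100, sequence_length=1e7,
    discrete_genome=True, recombination_rate=1e-8, population_size=500,
    model='dtwf', random_seed=2022)
ts.find_ibd(min_length=100000)
"""


def run_isolated(code, timeout=600):
    """Run `code` in a fresh interpreter and return (returncode, stderr).

    `-X faulthandler` asks the child to dump a Python traceback to stderr
    if it dies from a fatal signal such as SIGABRT.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def describe(returncode):
    """Turn a subprocess return code into a short human-readable label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by a signal.
        return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"
```

Calling `run_isolated(REPRO)` and passing the return code to `describe` should then report the abort without taking down the parent session, and the captured stderr contains the malloc message along with the faulthandler traceback. Since the crash is intermittent, looping over several `random_seed` values in the script string is one way to estimate how often it occurs.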

benjeffery commented 3 years ago

I'll look at this and try to recreate.

jeromekelleher commented 3 years ago

Thanks @benjeffery - I just chatted to @gtsambos and I'm going to look at it now as my top priority.

jeromekelleher commented 3 years ago

Excellent - I can reproduce on Linux - I'll report back once I know what's going on.

gtsambos commented 3 years ago

Phew! @benjeffery, I'm having some strange things happen when I try to update to the most recent dev version of tskit -- I'll message you on Slack about it

gtsambos commented 3 years ago

This may be redundant now, but I can confirm that the same behaviour occurs using the code at the current head of the main branch.

jeromekelleher commented 3 years ago

I think the bug assert must be a different manifestation of the memory corruption (this is where it gets triggered), so I'm going to assume that #1740 fixes that as well.