eaton-lab / tetrad

Phylogenetic inference using phylogenetic invariants and quartet joining
GNU General Public License v3.0

tetrad crashy behavior #3

Open isaacovercast opened 4 years ago

isaacovercast commented 4 years ago

Runs for a while and then crashes:

[                    ]   0% 0:00:00 | full tree *  *** Error in `/home/isaac/miniconda2/envs/momi-py36/bin/python': malloc(): smallbin double linked list corrupted: 0x0000555cd3108bb0 ***

This investigation was spurred by an email from Nathan Weeks:

I have been assisting Austin Garner (CC'ed) with running Tetrad on our HPC cluster. We have noticed repeated crashes using his variant HDF5 file; setting NUMBA_DEVELOPER_MODE=1 and NUMBA_BOUNDSCHECK=1 reveals an out-of-bounds array access error, suggesting a possible tetrad application bug:

export NUMBA_DEVELOPER_MODE=1 NUMBA_BOUNDSCHECK=1
ipcluster start --debug --n=1 &
sleep 10

tetrad -i seq_group_filtered_phylogeny_CT93_MS15.snps.hdf5 -o seq_group_filtered_phylogeny_CT93_MS15.snps.tetrad -q 1000000 -b 10 -f --ipcluster=default

I get the malloc crash shown above, whereas he reports this error:

2020-05-15 20:37:25.188 [IPClusterStart] debug: IndexError: index 550 is out of bounds for axis 1 with size 16
2020-05-15 20:37:25.231 [IPClusterStart] Process '/opt/conda/bin/python' stopped: {'exit_code': 0, 'pid': 350740}

Encountered an Error.
Message: IndexError: index is out of bounds
Parallel connection closed.
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<string> in <module>
/opt/conda/lib/python3.7/site-packages/tetrad/worker.py in nworker(tet, chunk)
    104             # here are the jitted funcs
    105             if nsnps[idx]:
--> 106                 bidx, invar = calculate(seqs, maparr, nmask, TESTS)
    107             else:
    108                 bidx = TESTS[np.random.randint(3)]
IndexError: index is out of bounds

Feels like numba voodoo.
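Numba-compiled functions skip bounds checks unless NUMBA_BOUNDSCHECK=1 is set, so a bad index may silently touch memory outside the array instead of raising. One way to narrow this down is to validate the index values in plain Python before they reach the jitted code. The sketch below is only an illustration: `find_out_of_bounds` and `checked_take` are hypothetical helpers, not part of tetrad, and the 16-wide row and the index 550 are taken from the error message above.

```python
def find_out_of_bounds(indices, axis_size):
    """Return (position, value) pairs for indices outside [-axis_size, axis_size)."""
    return [
        (pos, idx)
        for pos, idx in enumerate(indices)
        if not -axis_size <= idx < axis_size
    ]


def checked_take(row, indices):
    """Index `row` with every value in `indices`, failing loudly on the first bad one."""
    bad = find_out_of_bounds(indices, len(row))
    if bad:
        pos, idx = bad[0]
        raise IndexError(
            f"index {idx} (at position {pos}) is out of bounds for size {len(row)}"
        )
    return [row[i] for i in indices]


# Hypothetical data: a 16-wide row, as in the reported error, and one bad index.
row = list(range(16))
print(find_out_of_bounds([0, 15, 550], len(row)))  # [(2, 550)]
```

Running a check like this on the arrays handed to `calculate` would show whether the bad index is already present in the inputs or is produced inside the jitted function.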

isaacovercast commented 4 years ago

The malloc crash above was with Numba 0.46.0. After updating to Numba 0.49.1 I do see the error Nathan reported:

2020-05-19 08:58:59.995 [IPClusterStart] Process '/home/isaac/miniconda2/envs/momi-py36/bin/python' stopped: {'exit_code': 0, 'pid': 10217}
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<string> in <module>()
~/tmp/tetrad/tetrad/worker.py in nworker(tet, chunk)
    104             # here are the jitted funcs
    105             if nsnps[idx]:
--> 106                 bidx, invar = calculate(seqs, maparr, nmask, TESTS)
    107             else:
    108                 bidx = TESTS[np.random.randint(3)]
IndexError: index is out of bounds
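Since the symptom changed between Numba versions, it may help to record the installed package versions alongside each crash report. A minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`); the helper name `installed_versions` and the list of packages are my own choices, not anything tetrad provides:

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# The package names are assumptions about what matters for this crash.
for name, version in installed_versions(["numba", "numpy", "llvmlite"]).items():
    print(f"{name}: {version or 'not installed'}")
```

Run inside the same environment the ipcluster engines use, so the reported versions match what the workers actually import.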