jmschrei / pomegranate

Fast, flexible and easy to use probabilistic modelling in Python.
http://pomegranate.readthedocs.org/en/latest/
MIT License
3.29k stars 591 forks source link

Fixed a heap overflow in `viterbi` path backtrace calculations #990

Closed jonn-smith closed 1 year ago

jonn-smith commented 1 year ago

In the code for the Viterbi path calculation, there was a heap overflow in the path variable as the final path is being traced back (hmm.pyx:2194). This seems to only happen with some extremely long and repetitive sequences in my model.

Valgrind reports it as follows:

...
==15372== Invalid write of size 4
==15372==    at 0x1FB5844E: __pyx_f_11pomegranate_3hmm_17HiddenMarkovModel__viterbi (hmm.c:25865)
==15372==    by 0x1FBB8F05: __pyx_f_11pomegranate_3hmm_17HiddenMarkovModel_viterbi (hmm.c:24653)
==15372==    by 0x1FBBA254: __pyx_pf_11pomegranate_3hmm_17HiddenMarkovModel_46viterbi (hmm.c:24808)
==15372==    by 0x1FBBA2D4: __pyx_pw_11pomegranate_3hmm_17HiddenMarkovModel_47viterbi (hmm.c:24792)
==15372==    by 0x177023: _PyMethodDef_RawFastCallKeywords (call.c:647)
==15372==    by 0x311F8E: _PyMethodDescr_FastCallKeywords (descrobject.c:288)
==15372==    by 0x23786F: call_function (ceval.c:4593)
==15372==    by 0x23786F: _PyEval_EvalFrameDefault (ceval.c:3110)
==15372==    by 0x22B831: PyEval_EvalFrameEx (ceval.c:547)
==15372==    by 0x175F9B: function_code_fastcall (call.c:283)
==15372==    by 0x176AC6: _PyFunction_FastCallKeywords (call.c:408)
==15372==    by 0x237936: call_function (ceval.c:4616)
==15372==    by 0x237936: _PyEval_EvalFrameDefault (ceval.c:3110)
==15372==    by 0x22B831: PyEval_EvalFrameEx (ceval.c:547)
==15372==  Address 0x30e00894 is 0 bytes after a block of size 44,996 alloc'd
==15372==    at 0x4C33B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15372==    by 0x1FBB8CEA: __pyx_f_11pomegranate_3hmm_17HiddenMarkovModel_viterbi (hmm.c:24565)
==15372==    by 0x1FBBA254: __pyx_pf_11pomegranate_3hmm_17HiddenMarkovModel_46viterbi (hmm.c:24808)
==15372==    by 0x1FBBA2D4: __pyx_pw_11pomegranate_3hmm_17HiddenMarkovModel_47viterbi (hmm.c:24792)
==15372==    by 0x177023: _PyMethodDef_RawFastCallKeywords (call.c:647)
==15372==    by 0x311F8E: _PyMethodDescr_FastCallKeywords (descrobject.c:288)
==15372==    by 0x23786F: call_function (ceval.c:4593)
==15372==    by 0x23786F: _PyEval_EvalFrameDefault (ceval.c:3110)
==15372==    by 0x22B831: PyEval_EvalFrameEx (ceval.c:547)
==15372==    by 0x175F9B: function_code_fastcall (call.c:283)
==15372==    by 0x176AC6: _PyFunction_FastCallKeywords (call.c:408)
==15372==    by 0x237936: call_function (ceval.c:4616)
==15372==    by 0x237936: _PyEval_EvalFrameDefault (ceval.c:3110)
==15372==    by 0x22B831: PyEval_EvalFrameEx (ceval.c:547)
==15372==
...

Context in  hmm.c:25865:

...
25858       /* "pomegranate/hmm.pyx":2194
25859  *          # Put the position in the path, making sure to look up the state
25860  *          # object to use instead of the state index.
25861  *          path[length] = py             # <<<<<<<<<<<<<<
25862  *          length += 1
25863  *
25864  */
25865       (__pyx_v_path[__pyx_v_length]) = __pyx_v_py;
...

This PR introduces a small multiplier in the memory allocation for path. This heuristic fixes the issue and Valgrind no longer reports the error (see attached before and after valgrind logs). valgrind.jts_hmm_patch.log valgrind.v0.14.8.log

This PR also renames a few variables for readability and has an updated the signature for _viterbi to include the length of the path variable (in ints).

I ran valgrind with the following options after recompiling python in debug mode and following the steps here:

PYTHONMALLOC=malloc_debug PYTHONASYNCIODEBUG=1 valgrind --tool=memcheck --leak-check=full --suppressions=valgrind-python.supp --log-file=valgrind.log python3 -W default -X faulthandler -m MY_MODULE ARGS  

Please let me know what you think.

jonn-smith commented 1 year ago

@jmschrei Do you have thoughts on this? If necessary I can post some data to reproduce the issue that this fixes.

jmschrei commented 1 year ago

Hi @jonn-smith

I'm currently rewriting pomegranate, in it's entirety, in PyTorch. Because of this, I'm not going to be able to merge bug fixes like this. Thanks for taking the time to look into this issue and debug it, though!

jmschrei commented 1 year ago

Thank you for your contribution. However, pomegranate has recently been rewritten from the ground up to use PyTorch instead of Cython, and so this PR is unfortunately out of date and will be closed.