infoscout / weighted-levenshtein

Weighted Levenshtein library
MIT License

segmentation fault on Linux #28

Open maxbachmann opened 2 years ago

maxbachmann commented 2 years ago

I tried to run a simple example with weighted-levenshtein on Linux (Fedora 36) with Python 3.9 and Python 3.10, and ran into a segmentation fault on both. I installed the library using:

pip install weighted_levenshtein

and ran the following example:

from weighted_levenshtein import dam_lev
s1 = "aaaa"
s2 = "bbbb"
print(dam_lev(s1, s2))

This leads to the following valgrind output:

==14636== Invalid read of size 8
==14636==    at 0x130665F2: __pyx_f_20weighted_levenshtein_4clev_c_damerau_levenshtein (clev.c:5185)
==14636==    by 0x1307ECFA: __pyx_pf_20weighted_levenshtein_4clev_damerau_levenshtein (clev.c:4740)
==14636==    by 0x1307ECFA: __pyx_pw_20weighted_levenshtein_4clev_1damerau_levenshtein (clev.c:4467)
==14636==    by 0x49832AD: UnknownInlinedFun (abstract.h:114)
==14636==    by 0x49832AD: UnknownInlinedFun (abstract.h:123)
==14636==    by 0x49832AD: UnknownInlinedFun (ceval.c:5891)
==14636==    by 0x49832AD: _PyEval_EvalFrameDefault (ceval.c:4213)
==14636==    by 0x4982092: UnknownInlinedFun (pycore_ceval.h:46)
==14636==    by 0x4982092: _PyEval_Vector (ceval.c:5065)
==14636==    by 0x49FDE83: PyEval_EvalCode (ceval.c:1134)
==14636==    by 0x4A2F2B2: run_eval_code_obj (pythonrun.c:1291)
==14636==    by 0x4A2A7D9: run_mod (pythonrun.c:1312)
==14636==    by 0x48FD1CF: pyrun_file.cold (pythonrun.c:1208)
==14636==    by 0x4A24AD8: _PyRun_SimpleFileObject (pythonrun.c:456)
==14636==    by 0x4A24897: _PyRun_AnyFileObject (pythonrun.c:90)
==14636==    by 0x4A21A4B: UnknownInlinedFun (main.c:353)
==14636==    by 0x4A21A4B: UnknownInlinedFun (main.c:372)
==14636==    by 0x4A21A4B: UnknownInlinedFun (main.c:587)
==14636==    by 0x4A21A4B: Py_RunMain (main.c:666)
==14636==    by 0x49EDF8A: Py_BytesMain (main.c:720)
==14636==  Address 0x12dc8868 is 118,952 bytes inside an unallocated block of size 2,816,032 in arena "client"
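
To compare interpreters without the crash taking down the calling process, the example can be run in a child process. The following is a minimal sketch, not part of the original report: the helper names `run_isolated` and `describe` are made up for illustration, and it assumes a POSIX system, where `subprocess` reports a process killed by signal N as return code -N. The `-X faulthandler` option is a standard CPython flag that prints the Python-level traceback when a fatal signal arrives.

```python
import signal
import subprocess
import sys

# The reproduction from this report, passed to the child via `-c`.
SNIPPET = """
from weighted_levenshtein import dam_lev
print(dam_lev("aaaa", "bbbb"))
"""


def run_isolated(code, python=sys.executable):
    """Run `code` in a separate interpreter with faulthandler enabled."""
    return subprocess.run(
        [python, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )


def describe(proc):
    """Summarize how the child ended (negative return code = killed by signal)."""
    if proc.returncode < 0:
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exited with code {}".format(proc.returncode)


if __name__ == "__main__":
    result = run_isolated(SNIPPET)
    print(describe(result))
    print(result.stdout, end="")
    print(result.stderr, end="")
```

Passing a different interpreter path as `python` (for example one for each installed version) lets the same check run against Python 3.6, 3.9 and 3.10 in turn.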

maxbachmann commented 2 years ago

On Python 3.6 I do not get a segmentation fault:

[max@fedora Workspace]$ python3.6 test.py 
4.0

However, the invalid read is still present. It just does not lead to a segmentation fault:

[max@fedora Workspace]$ valgrind python3.6 test.py 
==15034== Invalid read of size 8
==15034==    at 0x12C154C2: __pyx_f_20weighted_levenshtein_4clev_c_damerau_levenshtein (clev.c:3437)
==15034==    by 0x12C26D3C: __pyx_pf_20weighted_levenshtein_4clev_damerau_levenshtein (clev.c:2993)
==15034==    by 0x12C26D3C: __pyx_pw_20weighted_levenshtein_4clev_1damerau_levenshtein (clev.c:2759)
==15034==    by 0x495F10F: _PyCFunction_FastCallDict (methodobject.c:231)
==15034==    by 0x495ECA7: call_function (ceval.c:4851)
==15034==    by 0x495842B: _PyEval_EvalFrameDefault (ceval.c:3335)
==15034==    by 0x4957264: _PyEval_EvalCodeWithName (ceval.c:4166)
==15034==    by 0x4956EF9: PyEval_EvalCodeEx (ceval.c:4187)
==15034==    by 0x49BB5CE: PyEval_EvalCode (ceval.c:731)
==15034==    by 0x49D1CFD: run_mod (pythonrun.c:1025)
==15034==    by 0x49D8158: PyRun_FileExFlags (pythonrun.c:978)
==15034==    by 0x48FB48C: PyRun_SimpleFileExFlags.cold (pythonrun.c:419)
==15034==    by 0x49B40D8: UnknownInlinedFun (main.c:344)
==15034==    by 0x49B40D8: Py_Main (main.c:814)
==15034==  Address 0x12bcbcc8 is 116,568 bytes inside an unallocated block of size 383,600 in arena "client"
==15034== 
4.0
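
As a cross-check that 4.0 is the expected result for this input, here is a pure-Python reference for the Damerau-Levenshtein distance with unit costs. It is a sketch of the textbook unrestricted algorithm, not the library's implementation, and the function name `reference_dam_lev` is hypothetical. Whether it matches the library's variant for every input is an assumption; for "aaaa" and "bbbb" no transposition is possible, so any variant with unit costs should give 4.

```python
def reference_dam_lev(a, b):
    """Unrestricted Damerau-Levenshtein distance with unit costs."""
    max_dist = len(a) + len(b)
    # Row 0 and column 0 hold a sentinel so a "no earlier match" lookup
    # (k == 0 or l == 0) can never win the min() below.
    d = [[max_dist] * (len(b) + 2) for _ in range(len(a) + 2)]
    for i in range(len(a) + 1):
        d[i + 1][1] = i
    for j in range(len(b) + 1):
        d[1][j + 1] = j

    last_row = {}  # last row of `a` in which each character was seen
    for i in range(1, len(a) + 1):
        last_col = 0  # last column in this row where the characters matched
        for j in range(1, len(b) + 1):
            k = last_row.get(b[j - 1], 0)
            l = last_col
            cost = 0 if a[i - 1] == b[j - 1] else 1
            if cost == 0:
                last_col = j
            d[i + 1][j + 1] = min(
                d[i][j] + cost,                           # match or substitution
                d[i + 1][j] + 1,                          # insertion
                d[i][j + 1] + 1,                          # deletion
                d[k][l] + (i - k - 1) + 1 + (j - l - 1),  # transposition
            )
        last_row[a[i - 1]] = i
    return float(d[len(a) + 1][len(b) + 1])


if __name__ == "__main__":
    print(reference_dam_lev("aaaa", "bbbb"))
```

Because the lookup table here is a dictionary keyed by character, it cannot be indexed out of range, which makes it a convenient baseline to compare against the compiled extension.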