Closed nijel closed 1 year ago
The memory error is expected in this case as it is allocating (roughly) 200000 * 200000 * 4
bytes. I can not reproduce the segfault on my (Debian) system. @scarabeusiv maybe there is something wrong in SUSE Python build?
I don't see anything problematic in related code:
Looking at the code it indeed looks fine. The traceback for the crash:
Program received signal SIGSEGV, Segmentation fault.
damerau_levenshtein_distance (s1=0xf74d5010 L'a' <repeats 200 times>..., s2=0xf7411010 L'b' <repeats 200 times>..., len1=200000, len2=200000) at cjellyfish/damerau_levenshtein.c:35
35 cjellyfish/damerau_levenshtein.c: No such file or directory.
(gdb) bt
#0 damerau_levenshtein_distance (s1=0xf74d5010 L'a' <repeats 200 times>..., s2=0xf7411010 L'b' <repeats 200 times>..., len1=200000, len2=200000) at cjellyfish/damerau_levenshtein.c:35
#1 0xf77037cb in jellyfish_damerau_levenshtein_distance (self=0xf77acacc, args=0xf779cdac) at cjellyfish/jellyfishmodule.c:153
#2 0xf7d19ef5 in _PyMethodDef_RawFastCallKeywords (method=0xf770a160 <jellyfish_methods+64>, self=0xf77acacc, args=0xf77dd478, nargs=2, kwnames=0x0) at Objects/call.c:694
#3 0xf7d19da6 in _PyCFunction_FastCallKeywords (func=0xf77accfc, args=0xf77dd478, nargs=2, kwnames=0x0) at Objects/call.c:730
#4 0xf7daffef in call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>) at Python/ceval.c:4568
#5 _PyEval_EvalFrameDefault (f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3124
#6 0xf7daad9c in _PyEval_EvalCodeWithName (_co=<optimized out>, globals=0xf7867734, locals=0xf7867734, args=0x0, argcount=0, kwnames=0x0, kwargs=0x0, kwcount=<optimized out>, kwstep=2, defs=0x0, defcount=0,
kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0) at Python/ceval.c:3930
#7 0xf7daabbd in PyEval_EvalCodeEx (_co=0xf77ecee8, globals=0xf7867734, locals=0xf7867734, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at Python/ceval.c:3959
#8 0xf7daab50 in PyEval_EvalCode (co=0xf77ecee8, globals=0xf7867734, locals=0xf7867734) at Python/ceval.c:524
#9 0xf7e3daad in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0xf7867734, locals=0xf7867734, flags=0xffffd8c8, arena=0xf7906c80) at Python/pythonrun.c:1035
#10 0xf7cdb8ad in PyRun_InteractiveOneObjectEx (fp=fp@entry=0xf7c2a580 <_IO_2_1_stdin_>, filename=filename@entry=0xf784a7c0, flags=flags@entry=0xffffd8c8) at Python/pythonrun.c:256
#11 0xf7cdb526 in PyRun_InteractiveLoopFlags (fp=0xf7c2a580 <_IO_2_1_stdin_>, filename_str=0xf7e959be "<stdin>", flags=0xffffd8c8) at Python/pythonrun.c:120
#12 0xf7cdb238 in PyRun_AnyFileExFlags (fp=0xf7f66000, filename=0xf7e959be "<stdin>", closeit=-134454456, flags=0xf7e452e8 <pymain_main+888>) at Python/pythonrun.c:78
#13 0x00000001 in ?? ()
#14 0xf7f66000 in ?? () from /usr/lib/libpython3.7m.so.1.0
#15 0xffffd8f4 in ?? ()
#16 0x00000000 in ?? ()
@mcepl any ideas if we do something specific to 32bit on py3? I quickly checked and it does not seem so.
Looking again at the code, I think the problem might be in overflow in the (len1 + 2) * cols * sizeof(size_t)
expression. The allocation then doesn't fail and further code is executed with smaller buffer than it expects leading to crash.
Valgrind output with >1GB allocation attempt:
>>> print(damerau_levenshtein_distance('a' * 200000, 'b' * 200000))
==4532== Warning: set address range perms: large range [0x5a18028, 0x46909438) (undefined)
==4532== Invalid write of size 4
==4532== at 0x590A500: damerau_levenshtein_distance (damerau_levenshtein.c:36)
==4532== by 0x59097CA: jellyfish_damerau_levenshtein_distance (jellyfishmodule.c:153)
==4532== by 0x4915EF4: _PyMethodDef_RawFastCallKeywords (call.c:694)
==4532== by 0x4915DA5: _PyCFunction_FastCallKeywords (call.c:730)
==4532== by 0x49ABFEE: call_function (ceval.c:4568)
==4532== by 0x49ABFEE: _PyEval_EvalFrameDefault (ceval.c:3124)
==4532== by 0x49A6D9B: _PyEval_EvalCodeWithName (ceval.c:3930)
==4532== by 0x49A6BBC: PyEval_EvalCodeEx (ceval.c:3959)
==4532== by 0x49A6B4F: PyEval_EvalCode (ceval.c:524)
==4532== by 0x4A39AAC: run_mod (pythonrun.c:1035)
==4532== by 0x48D78AC: PyRun_InteractiveOneObjectEx (pythonrun.c:256)
==4532== by 0x48D7525: PyRun_InteractiveLoopFlags (pythonrun.c:120)
==4532== by 0x48D7237: PyRun_AnyFileExFlags.cold.10 (pythonrun.c:78)
==4532== Address 0x4693a4bc is not stack'd, malloc'd or (recently) free'd
==4532==
==4532==
==4532== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==4532== Access not within mapped region at address 0x4693A4BC
==4532== at 0x590A500: damerau_levenshtein_distance (damerau_levenshtein.c:36)
==4532== by 0x59097CA: jellyfish_damerau_levenshtein_distance (jellyfishmodule.c:153)
==4532== by 0x4915EF4: _PyMethodDef_RawFastCallKeywords (call.c:694)
==4532== by 0x4915DA5: _PyCFunction_FastCallKeywords (call.c:730)
==4532== by 0x49ABFEE: call_function (ceval.c:4568)
==4532== by 0x49ABFEE: _PyEval_EvalFrameDefault (ceval.c:3124)
==4532== by 0x49A6D9B: _PyEval_EvalCodeWithName (ceval.c:3930)
==4532== by 0x49A6BBC: PyEval_EvalCodeEx (ceval.c:3959)
==4532== by 0x49A6B4F: PyEval_EvalCode (ceval.c:524)
==4532== by 0x4A39AAC: run_mod (pythonrun.c:1035)
==4532== by 0x48D78AC: PyRun_InteractiveOneObjectEx (pythonrun.c:256)
==4532== by 0x48D7525: PyRun_InteractiveLoopFlags (pythonrun.c:120)
==4532== by 0x48D7237: PyRun_AnyFileExFlags.cold.10 (pythonrun.c:78)
==4532== If you believe this happened as a result of a stack
==4532== overflow in your program's main thread (unlikely but
==4532== possible), you can try to increase the size of the
==4532== main thread stack using the --main-stacksize= flag.
==4532== The main thread stack size used in this run was 8388608.
==4532==
==4532== HEAP SUMMARY:
==4532== in use at exit: 18,695,879 bytes in 1,024 blocks
==4532== total heap usage: 13,188 allocs, 12,164 frees, 1,095,605,104 bytes allocated
==4532==
==4532== LEAK SUMMARY:
==4532== definitely lost: 37,008 bytes in 18 blocks
==4532== indirectly lost: 0 bytes in 0 blocks
==4532== possibly lost: 0 bytes in 0 blocks
==4532== still reachable: 18,658,871 bytes in 1,006 blocks
==4532== suppressed: 0 bytes in 0 blocks
==4532== Rerun with --leak-check=full to see details of leaked memory
==4532==
==4532== For counts of detected and suppressed errors, rerun with: -v
==4532== Use --track-origins=yes to see where uninitialised values come from
==4532== ERROR SUMMARY: 558 errors from 73 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
The
damerau_levenshtein_distance
function segfaults with long strings on 32-bit platform. This was originally reported at https://github.com/WeblateOrg/weblate/issues/2673 by @scarabeusiv, but the actual segfault happens in jellyfish:Also if I use the C extension instead it reports memory error: