apertium / lttoolbox

Finite state compiler, processor and helper tools used by apertium
http://wiki.apertium.org/wiki/Lttoolbox
GNU General Public License v2.0

lt-proc -b leaking memory #120

Closed: unhammer closed this issue 3 years ago

unhammer commented 3 years ago

When I run a corpus through nob-nno, the resident memory of lt-proc -b nob-nno.autobil.bin keeps growing with the input: 70M RES after 5k lines, 80M RES after 10k lines, 200M RES after 80k lines, and so on.

From what I can tell, the leak has been there since the beginning of time (i.e. since at least 2016 as far as lttoolbox is concerned), though the constants were lower back then.

unhammer commented 3 years ago

Possibly related to the copy-tag-suffix feature (which is the main thing that makes -b different from -g):

$ echo '^ja<ij>$' | valgrind ~/PREFIX/lttoolbox/bin/lt-proc -b nob-nno.autobil.bin 
==730494== Memcheck, a memory error detector
==730494== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==730494== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==730494== Command: /home/unhammer/PREFIX/lttoolbox/bin/lt-proc -b nob-nno.autobil.bin
==730494== 
^ja<ij>/ja<ij>$
==730494== 
==730494== HEAP SUMMARY:
==730494==     in use at exit: 7,826 bytes in 31 blocks
==730494==   total heap usage: 906,647 allocs, 906,616 frees, 26,955,026 bytes allocated
==730494== 
==730494== LEAK SUMMARY:
==730494==    definitely lost: 0 bytes in 0 blocks
==730494==    indirectly lost: 0 bytes in 0 blocks
==730494==      possibly lost: 0 bytes in 0 blocks
==730494==    still reachable: 7,826 bytes in 31 blocks
==730494==         suppressed: 0 bytes in 0 blocks
==730494== Rerun with --leak-check=full to see details of leaked memory
==730494== 
==730494== For lists of detected and suppressed errors, rerun with: -s
==730494== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

$ echo '^ja<ij><copytags>$' | valgrind ~/PREFIX/lttoolbox/bin/lt-proc -b nob-nno.autobil.bin 
==730546== Memcheck, a memory error detector
==730546== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==730546== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==730546== Command: /home/unhammer/PREFIX/lttoolbox/bin/lt-proc -b nob-nno.autobil.bin
==730546== 
^ja<ij><copytags>/ja<ij><copytags>$
==730546== 
==730546== HEAP SUMMARY:
==730546==     in use at exit: 7,914 bytes in 33 blocks
==730546==   total heap usage: 906,656 allocs, 906,623 frees, 26,955,352 bytes allocated
==730546== 
==730546== LEAK SUMMARY:
==730546==    definitely lost: 24 bytes in 1 blocks
==730546==    indirectly lost: 64 bytes in 1 blocks
==730546==      possibly lost: 0 bytes in 0 blocks
==730546==    still reachable: 7,826 bytes in 31 blocks
==730546==         suppressed: 0 bytes in 0 blocks
==730546== Rerun with --leak-check=full to see details of leaked memory
==730546== 
==730546== For lists of detected and suppressed errors, rerun with: -s
==730546== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
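
Both logs end with Valgrind's hint to rerun with --leak-check=full, which prints the allocation stack for each lost block. Below is a minimal sketch of that rerun, together with a small helper that totals the lost bytes in a Memcheck log so the two inputs can be compared at a glance. The function name lost_bytes and the log file name valgrind.log are hypothetical; the binary and transducer paths are the ones used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of lttoolbox or Valgrind): read a Memcheck
# log on stdin and print the sum of the "definitely lost" and
# "indirectly lost" byte counts from its LEAK SUMMARY.
lost_bytes() {
    awk '/(definitely|indirectly) lost:/ {
             for (i = 1; i <= NF; i++) if ($i == "lost:") {
                 n = $(i + 1)      # the byte count follows "lost:"
                 gsub(/,/, "", n)  # drop thousands separators, e.g. 7,826
                 total += n
             }
         }
         END { print total + 0 }'
}

# Assumed usage. Valgrind writes its log to stderr, so that stream is
# captured in a file while the lt-proc output itself is discarded:
#
#   echo '^ja<ij><copytags>$' \
#     | valgrind --leak-check=full \
#         ~/PREFIX/lttoolbox/bin/lt-proc -b nob-nno.autobil.bin \
#         2>valgrind.log >/dev/null
#   lost_bytes < valgrind.log
```

Applied to the two logs above, the helper gives 0 bytes for the plain ^ja<ij>$ input and 88 bytes (24 + 64) for the input with the extra <copytags> tag.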