rkd77 / elinks

Fork of elinks
Other
348 stars 38 forks source link

sgml-parser issues #235

Open daniloegea opened 1 year ago

daniloegea commented 1 year ago

Hi,

I was investigating why the unit tests were failing on Debian/Ubuntu on the s390x and notice that sgml-parser is segfaulting. Although they appear to run fine on other architectures.

I suspected it could be some memory bug that was only showing up on s390x and tried to run the tests on amd64 with address sanitizer enabled. It, in fact, caught a number of user-after-free (which I confirmed with Valgrind) and heap overflows (which could be false positives I guess).

I'm building it with:

meson setup build -Dtest=true -Dlzma=true -Dopenssl=false -Dx=false -Dfinger=true -Dgnutls=true -Dnntp=true -D88-colors=true -D256-colors=true -Dgettext=true -Dnls=true -Dleds=true -Dhtml-highlight=true -Dfsp=true -Dsmb=false -Dperl=true -Dguile=false -Dpython=false -Druby=false -Dgssapi=true -Dcgi=true -Dexmode=true -Dbittorrent=true -Ddebug=true -Dcombining=true -Dterminfo=true -Dlibev=true -Dreproducible=true -Dbrotli=true -Dgopher=true -Dgemini=true -Dsource-date-epoch=$(date "+%s") -Db_sanitize=address

meson compile -C build

meson test -C build -v

While the tests are running, you'll notice a number of AddressSanitizer errors.

If you replace the sgml-parser call in src/dom/test/test-* with valgrind sgml-parser you will see a number of invalid reads with the tracking of where the memory was allocated and freed.

Build log from Debian: https://buildd.debian.org/status/fetch.php?pkg=elinks&arch=s390x&ver=0.16.1.1-2&stamp=1685912428&raw=0

rkd77 commented 1 year ago

valgrind --exit-on-first-error=yes --error-exitcode=1 shows one place where it fails, but I don't understand this code yet, to be able to modify it and repair this issue

daniloegea commented 1 year ago

Thanks for confirming. I'll add an instance of what I got from Valgrind here just for reference:

==497929== Memcheck, a memory error detector
==497929== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==497929== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==497929== Command: sgml-parser --uri test:parse-elements- --stdin 2
==497929==
==497929== Invalid read of size 8
==497929==    at 0x12D869: init_dom_scanner (scanner.c:167)
==497929==    by 0x130584: sgml_parsing_push (parser.c:509)
==497929==    by 0x12C8E5: call_dom_stack_callbacks (stack.c:162)
==497929==    by 0x12CB26: push_dom_node (stack.c:213)
==497929==    by 0x13039B: parse_sgml (parser.c:444)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==  Address 0x4a86870 is 16 bytes inside a block of size 64 free'd
==497929==    at 0x484620F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==497929==    by 0x13B01B: mem_free (memory.c:85)
==497929==    by 0x129AD2: done_dom_node_data (node.c:463)
==497929==    by 0x129BED: done_dom_node (node.c:494)
==497929==    by 0x12CC10: pop_dom_node (stack.c:235)
==497929==    by 0x130555: sgml_parsing_push (parser.c:505)
==497929==    by 0x12C8E5: call_dom_stack_callbacks (stack.c:162)
==497929==    by 0x12CB26: push_dom_node (stack.c:213)
==497929==    by 0x13039B: parse_sgml (parser.c:444)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==  Block was alloc'd at
==497929==    at 0x4848A13: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==497929==    by 0x13AF9F: mem_calloc (memory.c:71)
==497929==    by 0x129665: init_dom_node_at (node.c:364)
==497929==    by 0x130379: parse_sgml (parser.c:443)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==
==497929== Invalid read of size 8
==497929==    at 0x12D878: init_dom_scanner (scanner.c:168)
==497929==    by 0x130584: sgml_parsing_push (parser.c:509)
==497929==    by 0x12C8E5: call_dom_stack_callbacks (stack.c:162)
==497929==    by 0x12CB26: push_dom_node (stack.c:213)
==497929==    by 0x13039B: parse_sgml (parser.c:444)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==  Address 0x4a86870 is 16 bytes inside a block of size 64 free'd
==497929==    at 0x484620F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==497929==    by 0x13B01B: mem_free (memory.c:85)
==497929==    by 0x129AD2: done_dom_node_data (node.c:463)
==497929==    by 0x129BED: done_dom_node (node.c:494)
==497929==    by 0x12CC10: pop_dom_node (stack.c:235)
==497929==    by 0x130555: sgml_parsing_push (parser.c:505)
==497929==    by 0x12C8E5: call_dom_stack_callbacks (stack.c:162)
==497929==    by 0x12CB26: push_dom_node (stack.c:213)
==497929==    by 0x13039B: parse_sgml (parser.c:444)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==  Block was alloc'd at
==497929==    at 0x4848A13: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==497929==    by 0x13AF9F: mem_calloc (memory.c:71)
==497929==    by 0x129665: init_dom_node_at (node.c:364)
==497929==    by 0x130379: parse_sgml (parser.c:443)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==
==497929== Invalid read of size 8
==497929==    at 0x12D888: init_dom_scanner (scanner.c:169)
==497929==    by 0x130584: sgml_parsing_push (parser.c:509)
==497929==    by 0x12C8E5: call_dom_stack_callbacks (stack.c:162)
==497929==    by 0x12CB26: push_dom_node (stack.c:213)
==497929==    by 0x13039B: parse_sgml (parser.c:444)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==  Address 0x4a86870 is 16 bytes inside a block of size 64 free'd
==497929==    at 0x484620F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==497929==    by 0x13B01B: mem_free (memory.c:85)
==497929==    by 0x129AD2: done_dom_node_data (node.c:463)
==497929==    by 0x129BED: done_dom_node (node.c:494)
==497929==    by 0x12CC10: pop_dom_node (stack.c:235)
==497929==    by 0x130555: sgml_parsing_push (parser.c:505)
==497929==    by 0x12C8E5: call_dom_stack_callbacks (stack.c:162)
==497929==    by 0x12CB26: push_dom_node (stack.c:213)
==497929==    by 0x13039B: parse_sgml (parser.c:444)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==  Block was alloc'd at
==497929==    at 0x4848A13: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==497929==    by 0x13AF9F: mem_calloc (memory.c:71)
==497929==    by 0x129665: init_dom_node_at (node.c:364)
==497929==    by 0x130379: parse_sgml (parser.c:443)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==
==497929== Invalid read of size 4
==497929==    at 0x12D890: init_dom_scanner (scanner.c:169)
==497929==    by 0x130584: sgml_parsing_push (parser.c:509)
==497929==    by 0x12C8E5: call_dom_stack_callbacks (stack.c:162)
==497929==    by 0x12CB26: push_dom_node (stack.c:213)
==497929==    by 0x13039B: parse_sgml (parser.c:444)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==  Address 0x4a86868 is 8 bytes inside a block of size 64 free'd
==497929==    at 0x484620F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==497929==    by 0x13B01B: mem_free (memory.c:85)
==497929==    by 0x129AD2: done_dom_node_data (node.c:463)
==497929==    by 0x129BED: done_dom_node (node.c:494)
==497929==    by 0x12CC10: pop_dom_node (stack.c:235)
==497929==    by 0x130555: sgml_parsing_push (parser.c:505)
==497929==    by 0x12C8E5: call_dom_stack_callbacks (stack.c:162)
==497929==    by 0x12CB26: push_dom_node (stack.c:213)
==497929==    by 0x13039B: parse_sgml (parser.c:444)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==  Block was alloc'd at
==497929==    at 0x4848A13: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==497929==    by 0x13AF9F: mem_calloc (memory.c:71)
==497929==    by 0x129665: init_dom_node_at (node.c:364)
==497929==    by 0x130379: parse_sgml (parser.c:443)
==497929==    by 0x12784C: main (sgml-parser.c:342)
==497929==
==497929==
==497929== HEAP SUMMARY:
==497929==     in use at exit: 0 bytes in 0 blocks
==497929==   total heap usage: 121 allocs, 121 frees, 30,079 bytes allocated
==497929==
==497929== All heap blocks were freed -- no leaks are possible
==497929==
==497929== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)