My understanding is that it's intended that we currently SIGABRT for such cases (here, an input file that is not valid UTF-8). But let's track in this issue whether we want to change that at some point.
With:
libcon4m 31e34c4eafd8e9e916584b2c116c90ba4459ede9
x86_64
Linux 6.9.8
clang 18.1.8
$ printf '\xc0' > /tmp/invalid.c4m
$ ./dev build && cd build
$ valgrind --track-origins=yes --leak-check=no ./c4test /tmp/invalid.c4m
[...]
Process terminating with default action of signal 6 (SIGABRT): dumping core
at 0x4F35E44: __pthread_kill_implementation (pthread_kill.c:44)
by 0x4EDDA2F: raise (raise.c:26)
by 0x4EC54C2: abort (abort.c:79)
by 0x118DAD: c4m_internal_utf8_set_codepoint_count (../src/con4m/string.c:19)
by 0x11D4FF: utf8_init (../src/con4m/string.c:530)
by 0x1242D3: _c4m_new (../src/con4m/object.c:471)
by 0x143DC2: c4m_stream_bytes_to_output (../src/con4m/streams.c:311)
by 0x143D0B: c4m_stream_raw_read (../src/con4m/streams.c:388)
by 0x143FF7: c4m_stream_read_all (../src/con4m/streams.c:415)
by 0x15E8F1: c4m_lex (../src/con4m/compiler/lex.c:1126)
by 0x116CD8: c4m_extract_kat (../src/tests/test.c:118)
by 0x1169AA: build_file_list (../src/tests/test.c:241)
Aside: changing this might be helpful for fuzzing, since a fuzzer would otherwise report every invalid UTF-8 input as a crash.
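To make "changing this" concrete, here is a minimal sketch of a codepoint counter that reports invalid UTF-8 through its return value instead of calling `abort()`. This is not libcon4m's code: `count_utf8_codepoints` is a hypothetical name, and using `-1` as the error value is an assumption made only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT part of libcon4m.
 * Returns the number of codepoints in buf[0..len), or -1 if the bytes
 * are not valid UTF-8 (instead of aborting the process). */
static long count_utf8_codepoints(const uint8_t *buf, size_t len)
{
    /* Smallest codepoint that may legally use 1, 2, 3, or 4 bytes. */
    static const uint32_t min_cp[] = {0x0, 0x80, 0x800, 0x10000};

    long   count = 0;
    size_t i     = 0;

    while (i < len) {
        uint8_t  b = buf[i];
        size_t   extra; /* number of continuation bytes expected */
        uint32_t cp;

        if (b < 0x80) {
            extra = 0;
            cp    = b;
        } else if ((b & 0xE0) == 0xC0) {
            extra = 1;
            cp    = b & 0x1F;
        } else if ((b & 0xF0) == 0xE0) {
            extra = 2;
            cp    = b & 0x0F;
        } else if ((b & 0xF8) == 0xF0) {
            extra = 3;
            cp    = b & 0x07;
        } else {
            return -1; /* stray continuation byte or invalid lead byte */
        }

        if (extra > len - i - 1) {
            return -1; /* sequence is truncated, e.g. a lone 0xC0 */
        }

        for (size_t k = 1; k <= extra; k++) {
            uint8_t c = buf[i + k];
            if ((c & 0xC0) != 0x80) {
                return -1; /* expected a continuation byte */
            }
            cp = (cp << 6) | (c & 0x3F);
        }

        /* Reject overlong encodings, surrogates, and values past U+10FFFF. */
        if (cp < min_cp[extra] || (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF) {
            return -1;
        }

        i += extra + 1;
        count++;
    }

    return count;
}
```

With something shaped like this, the single byte `0xC0` from `/tmp/invalid.c4m` would yield `-1`, and the caller could decide whether to raise an error or exit cleanly.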
Edit: it looks like, 3 hours before this ticket was opened, commit https://github.com/crashappsec/libcon4m/commit/288e79d4c0fecc2219b0ca6c1d804dc0e4155be4 in the jtv/dev branch changed this behavior. I hadn't seen that commit when I opened this ticket. With that commit, valgrind produces the same complaint for invalid.c4m as it does for other files:
$ valgrind --track-origins=yes --leak-check=no ./c4test /tmp/invalid.c4m
[...]
i = 0
An exception was raised before exit:
Error: Invalid utf8 in string when convering to utf32.
Raised from: ../src/con4m/string.c:599
Warning: set address range perms: large range [0x15596000, 0x25598000) (defined)
Invalid read of size 8
at 0x12585E: scan_range_for_allocs (../src/con4m/collect.c:550)
by 0x125461: raw_trace (../src/con4m/collect.c:638)
by 0x124908: c4m_collect_arena (../src/con4m/collect.c:939)
by 0x12561F: c4m_gc_thread_collect (../src/con4m/collect.c:1055)
by 0x1194C4: main (../src/tests/test.c:736)
Address 0x1ffe801000 is not stack'd, malloc'd or (recently) free'd
Process terminating with default action of signal 11 (SIGSEGV): dumping core
Access not within mapped region at address 0x1FFE801000
So it looks like the future PR for that branch can close this ticket.
For context, the relevant line of the reference doc is:
https://github.com/crashappsec/libcon4m/blob/31e34c4eafd8e9e916584b2c116c90ba4459ede9/doc/reference.md#L21