PCRE2Project / pcre2

Segmentation Fault of `pcre2test` #440

Closed: Koukyosyumei closed this issue 3 months ago

Koukyosyumei commented 3 months ago

I found a small test case for pcre2test that raises a segmentation fault.

demo.txt

# This set of tests checks UTF and Unicode property support with the DFA
# matching functionality of pcre2_dfa_match(). A default subject modifier is
# used to force DFA matching for all tests.

#subject dfa
#newline_default LF any anyCRLF

/\X?abc/utf,no_start_optimize
x{100}\x{100}\x{100}\x{100}\x{10    \xff\x7f\x00\x00\x03\x00\x41\xcc\x80\x41\x{300}\x61\x62\x63\x00\=no_utf_check,offset=06

This input causes the following:

$  ./build/pcre2test < crash/demo.txt
PCRE2 version 10.45-DEV 2024-06-09 (8-bit)
# This set of tests checks UTF and Unicode property support with the DFA
# matching functionality of pcre2_dfa_match(). A default subject modifier is
# used to force DFA matching for all tests.

#subject dfa
#newline_default LF any anyCRLF

/\X?abc/utf,no_start_optimize
x{100}\x{100}\x{100}\x{100}\x{10    \xff\x7f\x00\x00\x03\x00\x41\xcc\x80\x41\x{300}\x61\x62\x63\x00\=no_utf_check,offset=06
Segmentation fault

zherczeg commented 3 months ago

Could you try PCRE2_MATCH_INVALID_UTF? Your input does not look like valid UTF, and you set PCRE2_NO_UTF_CHECK as well, so crashes are to be expected.

PhilipHazel commented 3 months ago

Your input is not valid UTF. Without NO_UTF_CHECK you get the error "UTF-8 error: illegal byte (0xfe or 0xff)". If you give invalid UTF to PCRE2 and set PCRE2_NO_UTF_CHECK the result is undefined. This is documented. PCRE2_NO_UTF_CHECK is a dangerous option that should be used only when you know the input is valid - for example, on the second and subsequent times you scan the same subject.
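
To illustrate why the subject above is rejected when the check is left on, here is a minimal sketch of a strict UTF-8 validity scan in C. It is not PCRE2's own validation code, and the function name `utf8_first_invalid` is hypothetical; it only shows the kind of test that flags a byte such as 0xff, which can never appear in well-formed UTF-8.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of PCRE2.
 * Returns the offset of the first byte that starts an ill-formed
 * UTF-8 sequence, or -1 if the whole buffer is well-formed. */
long utf8_first_invalid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need;      /* number of continuation bytes expected */
        uint32_t cp;      /* decoded code point */
        uint32_t min;     /* smallest code point allowed for this length */

        if (c < 0x80) {   /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            /* stray continuation byte, or 0xf8..0xff (includes 0xfe, 0xff) */
            return (long)i;
        }

        if (len - i - 1 < need)
            return (long)i;                 /* sequence is truncated */

        for (size_t k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return (long)i;             /* not a continuation byte */
            cp = (cp << 6) | (uint32_t)(s[i + k] & 0x3F);
        }

        /* reject overlong forms, surrogates, and values above U+10FFFF */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (long)i;

        i += need + 1;
    }
    return -1;
}
```

A caller could use a scan like this to decide whether setting PCRE2_NO_UTF_CHECK is safe for a given subject. In most cases it is simpler to leave the option unset on the first match and let PCRE2 perform the check itself.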