Closed addisoncrump closed 1 year ago
The issue can also be observed with (B)*+A
and without endanchored:
sh-5.2$ xxd no_endanchored_crash
00000000: 0000 0000 0000 0000 2842 292a 2b41 ........(B)*+A
sh-5.2$ ./pcre2_fuzzer no_endanchored_crash
Encountered failure while performing match errorcode comparison; context:
Pattern/sample string (hex encoded): 2842292a2b41
Compile options 00100000 never_backslash_c
Match options 00002000
Non-JIT'd operation did not emit an error.
1 matches discovered by non-JIT'd regex:
Match 0 (hex encoded): 41
JIT'd operation did not emit an error.
2 matches discovered by JIT'd regex:
Match 0 (hex encoded): 41
Match 1 (hex encoded): 42
These are hard to understand. Is the sample sting is the same as pattern? What are the dots do before the pattern? Would it be possible to write the offsets for the match? Could it print the flags for the match options
?
The 0x20 is part of the flag data.
$ ./pcre2test -jit
PCRE2 version 10.43-DEV 2023-04-14 (8-bit)
re> /(A)*+/never_backslash_c,endanchored
data> (A)*+
0:
1: A
data> (A)*+\=no_jit
0:
In the future, I'll submit reproduction cases with pcre2test
instead for clarity :slightly_smiling_face: Apologies for the confusion.
It also appears this way: /(A)*+$/
I suspect reverting the iteration does not work. I will check later.
In the future, I'll submit reproduction cases with
pcre2test
instead for clarity
thanks; it would be also a good idea to remove "never_backslash_c" from it, as that is also likely unrelated and only needed by the fuzzer to avoid crashing in unlucky input AFAIK.
in this case, using the PCRE2_ENDANCHORED flag at match time (which might be the correct way to use this) also seem to "fix" JIT as shown by:
% ./pcre2test -jit
PCRE2 version 10.43-DEV 2023-04-14 (8-bit)
re> /(A)*+/endanchored
data> (A)*+\=endanchored
0:
data> (A)*+\=endanchored,no_jit
0:
data>
This is a bug in jit. I forgot a braposzero check. I will make a patch for it.
This is a bug in jit
it seems that it is broken all the way back up to the no longer maintained 8.x series as shown by:
% pcretest -s++
PCRE version 8.45 2021-06-15
re> /(B)*+A/
data> (B)*+A
0: A (JIT)
1: B (JIT)
Fixed in b43e745298ff80368c1e5c674d5f74c76bc4c13e
Not seeing any unintended side effects from the fix... Closing this :slightly_smiling_face:
Discovered with #322.
The regex
(A)*+
, withA
as any literal, does not have consistent matches between JIT and non-JIT when used with endanchored.Disabling endanchored causes the difference to disappear. I'm not sure what the correct behaviour should be.