Closed addisoncrump closed 9 months ago
diff --git a/src/pcre2_match.c b/src/pcre2_match.c
index b2e1f23..c551a92 100644
--- a/src/pcre2_match.c
+++ b/src/pcre2_match.c
@@ -5441,8 +5441,8 @@ fprintf(stderr, "++ %2ld op=%3d %s\n", Fecode - mb->start_code, *Fecode,
P = (heapframe *)((char *)N - frame_size);
if (N->group_frame_type == (GF_RECURSE | number))
{
- if (Feptr == P->eptr && mb->last_used_ptr == P->recurse_last_used)
- return PCRE2_ERROR_RECURSELOOP;
+// if (Feptr == P->eptr && mb->last_used_ptr == P->recurse_last_used)
+// return PCRE2_ERROR_RECURSELOOP;
break;
}
offset = P->last_group_offset;
You get the same result. Internally you could try to rerun these patterns with disabling those two lines, and if they are the same, this is just the interpreter extra safety line.
For a real infinite pattern (e.g. /(?0)b/
) you get PCRE2_ERROR_MATCHLIMIT / PCRE2_ERROR_JIT_STACKLIMIT when those two lines are disabled (although it is a bit slow for the interpreter).
A suggestion: instead of adding an option, you could use an environment variable (see getenv
) to enable / disable that two lines in the fuzzer. This way you need to apply a minor patch to the codebase, and in these cases, you can rerun the pattern with setting that environment variable.
There are probably always going to be inconsistencies for cases where resource-limiting checks kick in.
Would it be appropriate to surround this check with a ifdef for fuzzing mode enabled?
I have implemented this here and can revert: https://github.com/PCRE2Project/pcre2/pull/322/commits/7e34f7e40360e6a84c6227787660a14bcd4a5819
I'm sort of neutral about this. Without this test, as @zherczeg pointed out, some tests in the interpreter can take a long time to fail. Does the fuzzer reduce the default value for the match limit?
It does, yes.
That's good. However, after a bit more thought, I have another reservation, which may just be over caution. The --enable-fuzz-support config option causes pcre2fuzzcheck to be built. I am not sure that it should also cause changes to the actual library. My normal working build includes fuzz support so that I can test fuzz reports without a rebuild, but I would not want it to have the code alteration. Hmm. A run-time check is definitely a bad idea. Perhaps we also need --disable-interpreter-recurseloop-check? This should probably also apply to pcre2_dfa_match() which has the same logic as pcre2_match().
Whatever we do, infinite loop patterns are not real use cases. They have no practical value. We can add various early checks for them (lot of effort, slow things down), but people will always be able to construct a pattern, which runs forever, and not caught by any checks. So in the end match limit is the only reliable way to stop the matching at some point.
I agree, but I think it is worth including the simple checks that are in the interpreters because they catch the easy cases rather than using a lot of resources before failing.
Okay, so should this be a separate PR to allow the configure to disable the infinite recursion checks?
I'm just wondering if we could get away with a run-time check instead. The current test can remain - but if some new runtime option is set, do not give the error. This would not impact on performance because in normal use the error would happen. The option could be PCRE2_DISABLE_RECURSELOOP_CHECK and would apply only to the interpreters. I think this is also less work than modifying the autotools and CMake configs, and all the associated documentation. And you never know, somebody might find a use for it for real. I'm happy to implement this if you'd rather I did it.
I don't mind implementing it, I'm just not sure what would be most appropriate. That said, if setting a configure option, shouldn't an ifdef/ifndef be used anyway?
I'm suggesting a run-time option, NOT a configure option. That avoids messing with config and saves having to re-compile just to test stuff. It would involve edits to pcre2.h.in, pcre2_match(), pcre2_dfa_match(), and pcre2test.c, all of which should be quite small. Then just a short mention in pcre2api.3, and for completeness, listing the new option in doc/pcre2_match.3 and pcre2_dfa_match.3. I don't think we need an ifdef.
I have just committed the implementation of PCRE2_DISABLE_RECURSELOOP_CHECK as an option for pcre2_match(). The original test above behaves the same as JIT when this option is set. In the end, I did not add it to pcre2_dfa_match() because that function uses recursive function calls to implement regex recursion and turning off the check gives a stack overflow crash.
As this particular issue can now be avoided by setting PCRE2_DISABLE_RECURSELOOP_CHECK, I'm going to close this issue.
auto-callout annotated behaviour
``` re> /a(?0)z||(?0)++/endanchored,auto_callout data> abcd --->abcd +0 ^ a +1 ^^ (?0) +0 ^^ a +7 ^^ | +5 ^^ z +8 ^^ (?0)++ +0 ^^ a +7 ^^ | +14 ^^ End of pattern +5 ^^ z +7 ^ | +8 ^ (?0)++ +0 ^ a +1 ^^ (?0) +0 ^^ a +7 ^^ | +5 ^^ z +8 ^^ (?0)++ +0 ^^ a +7 ^^ | +14 ^^ End of pattern +5 ^^ z +7 ^ | +14 ^ End of pattern +0 ^ a +7 ^ | +8 ^ (?0)++ +0 ^ a +7 ^ | +14 ^ End of pattern +0 ^ a +7 ^ | +8 ^ (?0)++ +0 ^ a +7 ^ | +14 ^ End of pattern +0 ^ a +7 ^ | +8 ^ (?0)++ +0 ^ a +7 ^ | +14 ^ End of pattern +0 ^ a +7 ^ | 0: data> abcd\=no_jit --->abcd +0 ^ a +1 ^^ (?0) +0 ^^ a +7 ^^ | +5 ^^ z +8 ^^ (?0)++ +0 ^^ a +7 ^^ | +14 ^^ End of pattern +5 ^^ z +7 ^ | +8 ^ (?0)++ +0 ^ a +1 ^^ (?0) +0 ^^ a +7 ^^ | +5 ^^ z +8 ^^ (?0)++ Failed: error -52: nested recursion at the same subject position ```Unlike many of the examples in #334, however, this one appears to have inconsistent behaviour for the
(?1)
equivalent:Perl agrees with interpreter that it is "infinite recursion":
I can make an exception for this error code in the fuzzer's check for equivalence, but it would make us unable to detect future inconsistent recursion issues.