Closed siegel closed 3 months ago
I don't remember why the code returns with no memory for unsupported features. It does not make sense. Maybe there was no better error code. Adding new error codes will be a new feature.
Auto callouts are supported in general, but for conditions, we probably need to save/restore several registers. Implementing this would be a new feature as well.
I don't remember why the code returns with no memory for unsupported features. It does not make sense. Maybe there was no better error code. Adding new error codes will be a new feature.
Should I create a new issue for that specific feature? (I'm happy to make an argument that it's a bug, but if you're doing the work and consider it a feature, I will write it up as a feature. :-) )
Auto callouts are supported in general, but for conditions, we probably need to save/restore several registers. Implementing this would be a new feature as well.
Can you elaborate on this a bit? I'm not clear on what exactly is occuring at that point in the pattern compilation.
(Also, if you consider support for autocallouts after conditionals to be a feature to be added, should I likewise write up a new issue for it? How should I title it?)
You can open reports if you wish, but it is not necessary.
The only thing I remember is that the conditionals with callouts was a complex case. Something was needed to do for conditionals before and after the callout happens. The interpreter also does some specialization:
https://github.com/PCRE2Project/pcre2/blob/master/src/pcre2_match.c#L5627
I have checked the callout problem. The main issue is that the callout can escape from conditional before the condition is checked. Supporting cond/scond is already requires a lot of specialization in the code, and supporting iterated conditional with an extra escape is a rather complex case. Introducing bugs in the code is very likely.
I appreciate you taking the time to look into the feasibility.
I ended up resolving this issue by turning off PCRE2_AUTO_CALLOUT
for these patterns and instead generating the pattern with explicit (?C) callouts (which are fine for my needs), and further by reworking the pattern to use (?s).|\z
instead of \C
(the latter which fails to JIT compile when the pattern was compiled with PCRE2_UTF
, as mine are).
So I think this issue can probably be closed leaving #441. :-)
I'll be reviewing #441 later today. Closing this issue now.
Version info from
pcre2.h
:Platform: macOS (Darwin), 14.6, arm64. (The OS version is inconsequential; I've not yet tested on Intel but it was happening there too, as of a long time ago.)
At runtime, I programmatically generate this pattern:
The code which does this is fairly intertwined, but what gets run looks like this (NB: I am using
pcre2_16
):At the time this runs,
options
may vary, but in this case its value is0xc00a0404
. Notably, this includesPCRE2_AUTO_CALLOUT
.When compiling the generated pattern above, the call to
pcre2_jit_compile
returnsPCRE2_ERROR_NOMEMORY
(-48). This isn't a new issue, and most of the time life goes on just fine without JIT. But it's been bothering me that I can't JIT these patterns, so I started to debug it.To make a very long story short, I found that on line 14283 in
pcre2_jit_compile.c
it returnsPCRE2_ERROR_NOMEMORY
ifcheck_opcode_types()
returnsFALSE
:So I set breakpoints in the places in that function where there's a
return FALSE;
. And this one fires, on line 1167:My pattern is indeed compiled with
PCRE2_AUTO_CALLOUT
, so the callout opcodes will be in the compiled pattern.This strikes me as odd, because
pcre2_jit_compile()
does support ordinary callouts:Would it be possible to relax the restriction, so that
pcre2_jit_compile()
allows auto-callouts?Additionally, the documentation isn't quite in agreement:
pcre2_jit_compile
says:It doesn't mention that
pcre2_jit_compile(3)
will also returnPCRE2_ERROR_NOMEMORY
for other specific reasons, such as if the pattern uses auto-callouts.pcre2callout
says that callouts are supported by JIT:This suggests that perhaps the behavior I'm currently encountering is a bug; but if it's intentional the documentation needs to clearly state that
PCRE2_AUTO_CALLOUT
may cause JIT compiles to fail.So:
I think it makes sense for
pcre2_jit_compile()
return specific codes for specific failures, not just "can't allocate memory" (which is a lie in this case; there was no allocation failure, just a bad opcode).It appears that
pcre2_jit_match()
supports callouts generally. So is it possible that the auto-callout test incheck_opcode_types()
is erroneous, and that auto-callouts ought to be allowed at that point?If auto-callouts are not to be supported in JIT-ed patterns, I'd like to have a flag to pass to
pcre2_jit_compile()
that says "I don't care whether this contains auto-callouts; compile it anyway and NOP them." (I don't know how practical that is, though.)Thanks for reading. :-)