Closed carenas closed 1 year ago
I cannot get the interpreter to fail with link size = 3 or 4 though it does fail with JIT. Please can you give details of your interpreter failure?
My bad; I got my test mixed up and you are correct that there are no failures in the automated tests, neither in my synthetic tests that could be attributed to the interpreter. Had update the issue to reflect that, apologies again for the confusion.
There is a typo in the code, but fortunately it revealed a good question: https://github.com/PCRE2Project/pcre2/blob/master/src/pcre2_internal.h#L1782C1-L1782C1
The reverse
use LINK_SIZE
for storing the number of characters, while vreverse
uses IMM2_SIZE
. I would use the same, probably IMM2_SIZE
.
I think I originally used LINK_SIZE for OP_REVERSE because I thought it might just be possible that someone with an enormous subject string would want to look behind more than 65535 characters when LINK_SIZE is greater than 2. For OP_VREVERSE there is a very small limit, which is probably why I used IMM2_SIZE. However, I don't mind changing OP_REVERSE if you think this should be done.
I would prefer consistency here. It is easy to forget about this difference in new code.
Note though that the current small limit that applies to OP_VREVERSE is a constrain coming from Perl, which might change in the future since "variable lookbehind" is still experimental AFAIK, so it might be better to change OP_VREVERSE to use LINK_SIZE
instead of the other way around.
I am ok with any direction. Just use the same size.
I have changed OP_REVERSE to use IMM2_SIZE in b3dbe8d.
src/pcre2_printint.c
This change is not good, because it affects all the other opcodes:
case OP_COND:
case OP_SCOND:
case OP_REVERSE:
if (print_lengths) fprintf(f, "%3d ", GET2(code, 1));
Oops. Now fixed.
Sadly with the latest fixes, Test 6 is now failing if link-size==4, didn't check with 3, and don't recall this being the case before, but they are all lookbehinds so likely related.
I also mentioned it to Philip. Those are dfa tests. I hope we don't overload him.
I have fixed the DFA problem - I overlooked one case when changing OP_REVERSE to IMM2. Thank you for worrying about overloading me. There is no deadline for 10.43, so it will come when it comes. I will continue to work slowly through the issues as time permits - what will take time is #334 because I cannot remember enough about the way recursion works in the interpreter.
I think this was all a trick to force me to learn more about DFA, at least I know now why moving OP_REVERSE to IMM2 was a better path than the alternative I suggested, even if I didn't catch the overlooked change (and I did try, I even think I tried it as a shot in the dark and didn't help, but I probably just got my test mixed up as usual).
working fine as shown by manytests
in CI
as shown by failures on tests 2, 4 and 5 (when configured with
--with-link-size=4
).link-size=3 also shows additional failures in test 1 (with JIT) which might be also related