User4martin closed 10 months ago
@User4martin do we still need:
OP_OPEN = TREOp(50); // Opening of group
OP_CLOSE = TREOp(51); // Closing of group
OP_OPEN_ATOMIC = TREOp(52); // Opening of group
OP_CLOSE_ATOMIC = TREOp(53); // Closing of group
Yes. Only (?: was changed.
A normal ( must still omit an opcode => so the capture can be saved (and it must support backtracking in 99% of cases, so it needs recursion).
(?> still produces OP_OPEN_ATOMIC. And that is also still needed (though there is a limited subset of cases where it could be optimized...)
a(?>bb|b)b on text abb => will NOT match, as the atomic eats both b, and the b after the bracket has nothing to match.
a(?>bb|b){1,3}b: In this very specific case the atomic could (afaik) be provided by the loop a(?:bb|b){1,3}+b. This is only the case if the atomic bracket is exactly the entire loop.
EDIT: scratch the last one. This is only the case for {1,1}+.
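To make the atomic behaviour above concrete, here is a minimal sketch of a toy backtracking matcher in Python. It is not TRegExpr code: the helper names (lit, alt, seq, atomic, repeat, fullmatch) are made up for illustration, and matching is anchored to the whole text to keep it short. It shows why a(?>bb|b)b fails on abb, and why a(?>bb|b){1,3}b and a(?:bb|b){1,3}+b are not interchangeable.

```python
from typing import Callable, Iterator

# A matcher takes (text, pos) and yields every end position it can reach,
# in priority order. Asking for the next value is "backtracking".
Matcher = Callable[[str, int], Iterator[int]]

def lit(s: str) -> Matcher:
    def m(text, pos):
        if text.startswith(s, pos):
            yield pos + len(s)
    return m

def alt(*options: Matcher) -> Matcher:
    def m(text, pos):
        for opt in options:          # leftmost alternative first
            yield from opt(text, pos)
    return m

def seq(*parts: Matcher) -> Matcher:
    def m(text, pos, i=0):
        if i == len(parts):
            yield pos
            return
        for end in parts[i](text, pos):
            yield from m(text, end, i + 1)
    return m

def atomic(inner: Matcher) -> Matcher:
    def m(text, pos):
        for end in inner(text, pos):
            yield end                # keep the first result only,
            return                   # never offer an alternative
    return m

def repeat(inner: Matcher, lo: int, hi: int) -> Matcher:
    def m(text, pos, count=0):
        if count < hi:               # greedy: try one more iteration first
            for end in inner(text, pos):
                if end > pos:        # guard against empty iterations
                    yield from m(text, end, count + 1)
        if count >= lo:
            yield pos
    return m

def fullmatch(matcher: Matcher, text: str) -> bool:
    return any(end == len(text) for end in matcher(text, 0))

group = alt(lit("bb"), lit("b"))
plain = seq(lit("a"), group, lit("b"))                              # a(?:bb|b)b
atomic_grp = seq(lit("a"), atomic(group), lit("b"))                 # a(?>bb|b)b
atomic_in_loop = seq(lit("a"), repeat(atomic(group), 1, 3), lit("b"))   # a(?>bb|b){1,3}b
possessive_loop = seq(lit("a"), atomic(repeat(group, 1, 3)), lit("b"))  # a(?:bb|b){1,3}+b

print(fullmatch(plain, "abb"))             # True: backtracks from bb to b
print(fullmatch(atomic_grp, "abb"))        # False: the atomic group keeps bb
print(fullmatch(atomic_in_loop, "abbb"))   # True: the loop can drop an iteration
print(fullmatch(possessive_loop, "abbb"))  # False: the loop gives nothing back
```

In this toy model the atomic group only freezes the choice inside one iteration, while the possessive loop freezes the iteration count as well, which is consistent with the EDIT above.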
A normal ( must still omit an opcode
Omit vs. emit: you mean 'emit'.
Most savings are from "Don't omit OP_OPEN/CLOSE for (?: " (see the +++ in the table). Yet there is a bit of speed gain through the other commits too.
A (?: only has an effect when it encapsulates either branches (|) or loops => in both cases - once parsed - the opcodes of those correctly represent start and end of the sub-pattern. Of course the output of Dump is less human readable. But that must be expected, if an optimizer makes changes.
(?:a|b)|(?:\d|\w)\s|(?:12)+
The OP_COMMENT is needed if a branch has several sub-expressions. They all need to point to an ender before returning to the parent (removing this is possible, but currently too complicated). The last segment has no OP_COMMENT, as there is no OP_BRANCH either. The (?: is handled by the loop.
In case of 1(?:ab)2 the brackets have no meaning. So the code is just inlined: