Closed User4martin closed 1 year ago
Next := scan + PRENextOff(AlignToPtr(scan + 1))^; // inlined regNext;
that is reading badly. can you add functiion with inline, which can replace regNext?
Next := scan + PRENextOff(AlignToPtr(scan + 1))^; // inlined regNext;
that is reading badly. can you add functiion with inline, which can replace regNext?
I can... But I am doubtful about it.
e.g.
OP_EXACTLY_CI:
begin
opnd := scan + REOpSz + RENextOffSz; // OPERAND
Len := PLongInt(opnd)^;
PLongInt
doesn't even read PReDataLength
as for example PReOpLookBehindOptions
.
All access to the program is done directly. RegNext is special during compile. Because in the first pass it needs to act as dummy.
next
as they need (if they are of fixed len, they still don't need a next field).next
is needed, then an OP_GOTO can do that.IMHO the important bit is the 3rd one: None of the other code is outsourced to functions.
why we cannot make new func
function regNextInlined(scan: ...): ...; {$ifdef InlineFuncs} inline; {$endif}
begin
Result := scan + PRENextOff(AlignToPtr(scan + 1))^;
end;
and use it in proposed places? it is the same as your code but readable.
also, your change (with new func or not) changes the pointer, even in pass-1. before, pointer wasn't changed in pass-1. is it OK?
why we cannot make new func
As I said, we can. I simply stated my reasons for not doing it in first. If you still think it's better to have it => no problem
also, your change (with new func or not) changes the pointer,
Where?
Where?
sorry, my mistake.
However I did miss something
if offset = 0 then
Result := nil
which is used in MatchPrim
while scan <> nil do
Error(reeMatchPrimCorruptedPointers);
But, IMHO pointless: If the mem is corrupted, it has a 1 in 256 chance of being a zero byte.
We can add a new define {$DEFINE WITH_ASSERTS}
And then check for 0, and for the new value being in the Program (regCodeSize).
But IMHO there is no point to run this by default. So I would remove the still existing remains of that check, and put them in ifdef.
sorry about the indentation.
I keep synchronizing 2 copies of the file, and I can't compare space-changes, because there are thinks like trailing spaces that don't exist in both copies.
Merged? I am confused now, I though I was still changing the code to an inlined function?
i merged because I thought it's ready fix. please fix more if needed.
change is needed here?
so
repeat
FillFirstCharSet(scan + REOpSz + RENextOffSz);
scan := scan + PRENextOff(AlignToPtr(scan + 1))^; // inlined regNext(scan);
until (scan = nil) or (PREOp(scan)^ <> OP_BRANCH);
will run longer than before, in pass-1, right? scan = nil
will not be true.
it means the inlined-func, as i suggested, is better?
Martin, I made regNextInlined
. but feel free to propose fix again over current code. but patch should not slowdown the pass-1 as before.
Well, its fine as it is. I was looking at the inlined function, as you preferred it.
About the = nil
check.
It cannot happen in a real run. Only if memory is corrupted, and then it is likely to be <> nil
anyway.
But I could/would place assert code in (with more checks that just =nil
).
Do you prefer
assert()
(toggled by -Sa
){$IFDEF WITH_REGEX_ASSERT}
What is the existing {$DEFINE USE_ASSERTS}
for? It does not seem to be used at all?
repeat FillFirstCharSet
will run longer than before, in pass-1, right? scan = nil will not be true.
The repeat
you quoted is actually inside FillFirstCharSet
.
The call to FillFirstCharSet
is from CompileRegExpr
but only after both passes have finished. (line 3165)
All other invocations of FillFirstCharSet
are recursive.
So yes, that is safe.
Martin, I made regNextInlined
Ok, things moving to fast. I would have done it myself. But I can barely finish one reply before the next :)
I made regNextInlined
Ok, but the entire point was to remove
if p = @regDummy then
The inlined version is never called in the first pass.
It could have an
assert(fSecondPass); // fSecondPass will also be true in MatchPrim.
I'll do a new pull request.
Just let me know
Do you prefer
actual assert() (toggled by -Sa) {$IFDEF WITH_REGEX_ASSERT}
What is the existing {$DEFINE USE_ASSERTS} for? It does not seem to be used at all?
changed regNextInlined. OK now?
What is the existing {$DEFINE USE_ASSERTS} for? It does not seem to be used at all?
IDK, seems the leftover from old authors.
changed regNextInlined. OK now?
Yes.
Though, as I said, I strongly propose to move the if offset = 0 then
into an IFDEF.
For fpc this works
{$IFDEF FPC}{$IFOPT C+} {$DEFINE UseAsserts} {$ENDIF}{$ENDIF} // Only if compile with -Sa
Not sure how to change that for Delphi?
Also, maybe better rename to WITH_REGEX_ASSERT
. Which users are less likely to accidentally set?
okay, so what is proposed change now? I'm puzzled.
Moved to #342
Those errors should never happen. IMHO they should be assert.
And it has a noticeable effect on speed.
Ah, okay..
Various very very small speed improvements.
Off topic to the pull request.
I started an experimental change https://github.com/User4martin/TRegExpr/commit/a09083f90301449bda27ad7195a835440cc3a06e
If you have
/[02468]/
, it may sometimes be faster to first check the lowest and highest, before iterating the entire list.But
/[\n0aAzZ]/
then the bounds include almost everything, and don't add benefit.I can't yet tell if the changes provide real benefit. The benchmark doesn't have much ranges. One of my real apps, which runs for about 1 minute appears to save 0.2 or maybe 0.3 seconds => however that may be wishful thinking, as the jitter in the timing is bigger that that (and given the time it takes, I have only very limited sample runs...)
There may be potential, if...
[AB\dCDEF]
would join the 2 OpKind_Char into one['\d]
where'
may have an ordinal very close to 0-9. Therefore the entire[]
could start with a OpKind_MinMaxGuard, that would give the min/max for entering the group at all. (Again depends on the resulting density)But for now, it's just experimental.