andgineer / TRegExpr

Regular expressions (regex), pascal.
https://regex.sorokin.engineer/en/latest/
MIT License

OP_Star / FindRepeated and group ref \1 #368

Closed by User4martin 10 months ago

User4martin commented 10 months ago

Group refs are marked as simple and as having width:

              ret := EmitGroupRef(Len, fCompModifiers.I);
              FlagParse := FlagParse or FLAG_HASWIDTH or FLAG_SIMPLE;

However, in (\b).\1 the group ref has no width, and (\b)\1*3 on text .. test 2345 causes an access violation.
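To illustrate the zero-width case, here is a minimal sketch using Python's `re` module as a stand-in engine (an assumption made for illustration; it is not TRegExpr and says nothing about its internals). It only shows that a group ref can legitimately match the empty string, so a width flag on it cannot be taken for granted.

```python
import re

# Group 1 captures the empty string at the word boundary before "a".
# The "." then consumes "a", and \1 re-matches the empty capture.
m = re.search(r"(\b).\1", "ab")

captured = m.group(1)   # text captured by group 1
whole = m.group(0)      # text matched by the whole pattern
span = m.span(1)        # start and end offsets of group 1
```

Here `captured` is the empty string and `whole` is just `a`, so the back-reference consumed no input at all.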

Also, ^(123)\1*1..a on text 123123123abcde does not match, yet it should. The reason is that after it has found the 3 instances of 123, it needs to go back one whole instance, but it only goes back a single char (because OP_Star assumes char).
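The backtracking problem described above can be sketched as follows. Python's `re` is used only as a reference for the expected result, and `match_star_backref` is a hypothetical toy model of a greedy star loop, not the TRegExpr source. Its `step` parameter is an assumption that stands for how far the loop backs off per attempt: the length of the group (per instance) or 1 (per char).

```python
import re

PATTERN = r"^(123)\1*1..a"
TEXT = "123123123abcde"

# Reference result from Python's backtracking engine (not TRegExpr).
ref = re.match(PATTERN, TEXT)
reference_match = ref.group(0) if ref else None


def match_star_backref(text, start, group, rest, step):
    """Toy model of a greedy `\\1*` followed by the regex `rest`.

    Returns the offset where `rest` matched, or None.
    `step` is how many chars are given back after each failed attempt.
    """
    # Greedy phase: consume as many whole copies of the group as possible.
    count = 0
    end = start
    while group and text.startswith(group, end):
        end += len(group)
        count += 1
    # Backtracking phase: one attempt per repetition count, count down to 0.
    for _ in range(count + 1):
        if re.match(rest, text[end:]):
            return end
        end -= step
    return None


# Offset 3 is where \1* starts, right after the captured "123".
per_instance = match_star_backref(TEXT, 3, "123", r"1..a", step=3)
per_char = match_star_backref(TEXT, 3, "123", r"1..a", step=1)
```

With `step=3` the model gives back one whole `123` and the remainder `1..a` matches at offset 6. With `step=1` it tries offsets 9, 8 and 7, runs out of repetitions, and never reaches offset 6, which mirrors the failure reported here.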

The only place where OP_BSUBEXP[_CI] in FindRepeated makes sense is for possessive matches, since they cannot go back.


The problem is not whether this belongs in FindRepeated. The problem is whether to emit OP_Star for it or not.

User4martin commented 10 months ago

Actually, I was hoping to save it for the possessive cases. But they break on zero length.

So for now I am simply going to remove those flags.

Alexey-T commented 10 months ago

However (\b).\1 => no width. (\b)\1*3 on text .. test 2345 => access violation

Is it in the tests now?

User4martin commented 10 months ago

No. If they were, the tests would break, because it isn't fixed yet. I only have them in the "test_dlg".