andgineer / TRegExpr

Regular expressions (regex), pascal.
https://regex.sorokin.engineer/en/latest/
MIT License

Crash with O4 #339

Closed: User4martin closed 1 year ago

User4martin commented 1 year ago

Regex writes to incorrect memory (independent of optimization settings).

But O4 enables class field reordering, and that means an important field can end up at the position that gets overwritten.

In the first pass, regCode := @regDummy; and regDummy: REChar; is only 2 bytes (or 1).

For [abcd], EmitRangeChar is called in the first pass:

        Pointer(AddrOfLen) := regCode;
...
      Inc(AddrOfLen^);

AddrOfLen: PLongInt; means that the Inc accesses 4 bytes starting at the address in regCode, which is @regDummy. There are only 2 (or 1) bytes in regDummy. The remainder belongs to the next field in the class.

Normally the next field is programm: PRegExprChar; which isn't used in the first pass.

But when O4 reorders class fields, the next field can be any field. In my case it is fSecondPass, which gets changed to true while nothing is initialized yet. So as soon as something depends on it => boom.
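
The overlap described above can be checked with a small sketch. It is not the TRegExpr code: TFields is a hypothetical packed record standing in for the reordered class fields, REChar is assumed to be the 2-byte WideChar case, and WriteReachesSecondPass is a helper invented for this illustration. It only compares the field offset with the size of the write, so it does not depend on byte order.

```pascal
program DummyOverlap;
{$mode objfpc}

type
  REChar = WideChar; // assumption: the 2-byte case mentioned in the issue

  // Hypothetical stand-in for the reordered class fields; packed so that
  // fSecondPass sits directly after regDummy, as observed in the debugger.
  TFields = packed record
    regDummy: REChar;
    fSecondPass: Boolean;
    Filler: Byte; // keeps a 4-byte access inside this record
  end;

// True when a write of WriteSize bytes starting at regDummy
// also covers the first byte of fSecondPass.
function WriteReachesSecondPass(var F: TFields; WriteSize: PtrUInt): Boolean;
var
  Offset: PtrUInt;
begin
  Offset := PtrUInt(@F.fSecondPass) - PtrUInt(@F.regDummy);
  Result := Offset < WriteSize;
end;

var
  F: TFields;
begin
  FillChar(F, SizeOf(F), 0);
  WriteLn('SizeOf(regDummy) = ', SizeOf(F.regDummy));
  WriteLn('SizeOf(LongInt)  = ', SizeOf(LongInt));
  // Inc(AddrOfLen^) with AddrOfLen: PLongInt is a LongInt-sized access.
  WriteLn('PLongInt access reaches fSecondPass: ',
    WriteReachesSecondPass(F, SizeOf(LongInt)));
end.
```

A write that is no wider than regDummy itself does not reach the neighbouring field, which is why the problem only shows up through the wider PLongInt access.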


Easiest fix: regDummy: array [0..8] of REChar. That way it will cover all accesses to regDummy up to ^QWORD. (Even if sizeOf(char) = 1.)

More advanced, guard the code with if not fSecondPass or similar.

I haven't reproduced this in a small example, but I can confirm in the debugger that fSecondPass is right after regDummy, and that the indicated Inc statement changes fSecondPass.

Alexey-T commented 1 year ago

good finding. thanks. please provide the fix after I apply the biggest PR.

User4martin commented 1 year ago

Ok, I intend to go with

regDummy: array [0..8 div SizeOf(REChar)] of REChar; 

Because that covers all other cases that write to the dummy-pointer (if there are/will be any).
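
As a rough check of that declaration, here is a sketch under the same assumptions as before: TFields is a hypothetical packed record rather than the real class, REChar is assumed to be WideChar, and NeighbourSurvivesQWordInc is a helper made up for this example. With a 2-byte REChar the array has 5 elements (10 bytes); with a 1-byte REChar it would have 9 elements (9 bytes). Either way an 8-byte access fits.

```pascal
program DummyBuffer;
{$mode objfpc}
{$Q-}{$R-} // let the increment wrap around instead of raising an error

type
  REChar = WideChar; // assumption: 2-byte build

  // Hypothetical stand-in for the class fields, packed to fix the layout.
  TFields = packed record
    regDummy: array [0..8 div SizeOf(REChar)] of REChar;
    fSecondPass: Boolean; // placed directly after the dummy buffer
  end;

// Increments 8 bytes at the start of regDummy and reports whether
// fSecondPass kept its initial value (False).
function NeighbourSurvivesQWordInc: Boolean;
var
  F: TFields;
  P: PQWord;
begin
  FillChar(F, SizeOf(F), 0);
  // Worst case for a carry: every byte of the dummy buffer is $FF.
  FillChar(F.regDummy, SizeOf(F.regDummy), $FF);
  P := PQWord(@F.regDummy);
  Inc(P^); // 8-byte access, still inside regDummy
  Result := not F.fSecondPass;
end;

var
  F: TFields;
begin
  WriteLn('SizeOf(regDummy) = ', SizeOf(F.regDummy));
  WriteLn('SizeOf(QWord)    = ', SizeOf(QWord));
  WriteLn('fSecondPass untouched: ', NeighbourSurvivesQWordInc);
end.
```

Sizing the array from SizeOf(REChar) keeps the buffer at least 8 bytes in both character-size configurations without hard-coding either one.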

Alexey-T commented 1 year ago

that is good.