andgineer / TRegExpr

Regular expressions (regex), pascal.
https://regex.sorokin.engineer/en/latest/
MIT License

small bug with OP_ANYML? #199

Closed Alexey-T closed 4 years ago

Alexey-T commented 4 years ago
      OP_ANYML:
        begin // ###0.941
          if (regInput = fInputEnd) or
            ((regInput^ = fLinePairedSeparatorHead) and
            ((regInput + 1)^ = fLinePairedSeparatorTail)) or
            IsCustomLineSeparator(regInput^)
          then
            Exit;
          Inc(regInput);
        end;  

OP_ANYML is always generated for '.' when the /s modifier is off, even when paired line breaks are NOT used!

    '.':
     begin
      if fCompModifiers.S then
      begin
        ret := EmitNode(OP_ANY);
        flagp := flagp or flag_HasWidth or flag_Simple;
      end
      else
      begin // not /s, so emit [^:LineSeparators:]
        ret := EmitNode(OP_ANYML);
        flagp := flagp or flag_HasWidth; // not so simple ;)
      end;
     end;

So when paired line ends are NOT used, we still generate OP_ANYML, which reads fLinePairedSeparatorHead/Tail, and those are NOT initialized. Everything works only because uninitialized chars are #0.

How to fix: test fLinePairedSeparatorAssigned in the OP_ANYML handler.
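A minimal sketch of what that guard could look like, written as a standalone Free Pascal program rather than as the actual patch. Only the names fLinePairedSeparatorAssigned, fLinePairedSeparatorHead, fLinePairedSeparatorTail and IsCustomLineSeparator come from this issue; the MatchAnyML helper, the global variables standing in for class fields, and the choice of #10 as the only custom separator are assumptions made for illustration.

```pascal
program AnyMLSketch;
{$mode objfpc}{$H+}

// Hypothetical stand-ins for the engine's fields (globals here, fields in TRegExpr).
var
  fLinePairedSeparatorAssigned: Boolean = False;
  fLinePairedSeparatorHead: Char = #0;
  fLinePairedSeparatorTail: Char = #0;

// Assumption for this sketch: only #10 counts as a custom line separator.
function IsCustomLineSeparator(AChar: Char): Boolean;
begin
  Result := AChar = #10;
end;

// Hypothetical helper: True if '.' (without /s) may consume the char at regInput.
function MatchAnyML(regInput, fInputEnd: PChar): Boolean;
begin
  Result := False;
  if regInput = fInputEnd then
    Exit;
  // The guard proposed in the issue: compare against Head/Tail
  // only when a paired separator was actually assigned.
  if fLinePairedSeparatorAssigned and
     (regInput^ = fLinePairedSeparatorHead) and
     ((regInput + 1)^ = fLinePairedSeparatorTail) then
    Exit;
  if IsCustomLineSeparator(regInput^) then
    Exit;
  Result := True;
end;

var
  S: string;
  InputEnd: PChar;
begin
  S := 'a'#13#10;
  InputEnd := PChar(S) + Length(S);

  // No paired separator assigned: 'a' and #13 are both consumable.
  WriteLn(MatchAnyML(PChar(S), InputEnd));
  WriteLn(MatchAnyML(PChar(S) + 1, InputEnd));

  // With #13#10 assigned as the pair, #13 followed by #10 is rejected.
  fLinePairedSeparatorAssigned := True;
  fLinePairedSeparatorHead := #13;
  fLinePairedSeparatorTail := #10;
  WriteLn(MatchAnyML(PChar(S) + 1, InputEnd));
end.
```

With the guard first in the condition, short-circuit evaluation skips the two character comparisons whenever no paired separator is set, so the handler no longer depends on the uninitialized fields happening to be #0.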

Alexey-T commented 4 years ago

Made the fix. test_benchmark shows it is now a little faster: 2 runs of the old test, 2 runs of the new test. (Screenshot from 2020-09-01 09-37-04)