DmitrySoshnikov / regexp-tree

Regular expressions processor in JavaScript
MIT License

Incorrect results when quantifier range unsafe integer #222

Open tjenkinson opened 3 years ago

tjenkinson commented 3 years ago

E.g.,

/a{9007199254740993}/

gives

{
  "type": "Quantifier",
  "kind": "Range",
  "from": 9007199254740992,
  "to": 9007199254740992,
  "greedy": true
}

Note that the value is 9007199254740992, not 9007199254740993.

I wonder if the parser should throw if !Number.isSafeInteger(x)?
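For reference, a minimal sketch of where the precision is lost and what such a check could look like. `parseQuantifierBound` is a hypothetical helper written for illustration, not a function from regexp-tree:

```javascript
// 2^53 + 1 has no exact double representation, so Number() rounds it.
const source = '9007199254740993';
const parsed = Number(source);

console.log(parsed);                       // 9007199254740992
console.log(Number.isSafeInteger(parsed)); // false
console.log(String(parsed) === source);    // false: the digits no longer round-trip

// Hypothetical helper (an assumption, not regexp-tree API):
// convert the digits of a quantifier bound and throw if the result is unsafe.
function parseQuantifierBound(digits) {
  const value = Number(digits);
  if (!Number.isSafeInteger(value)) {
    throw new SyntaxError(`Quantifier bound ${digits} is not a safe integer`);
  }
  return value;
}
```

With this check, `parseQuantifierBound('3')` returns `3`, while the bound from the example above throws instead of being silently rounded.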

tjenkinson commented 3 years ago

Or, if numbers of this size are actually handled by engines (?), maybe we should also expose a string version of the number.

DmitrySoshnikov commented 3 years ago

Thanks, yeah, this is a weird edge case. The actual JS regexp is generated with the correct number, /a{9007199254740993,9007199254740993}/, so from the code generator's perspective it might be good to support it.

We can probably check whether the number is safe and, if so, store it as a number; otherwise store it as a string. Or we could add new string fields, as you suggest (though it's not obvious what would then be kept in the number field).
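A rough sketch of the first option, assuming the parser has the bound's digits available as a string. `toQuantifierBound` is a hypothetical name used only for illustration:

```javascript
// Hypothetical helper (an assumption, not regexp-tree API):
// keep a number when it is exact, fall back to the original digits otherwise.
function toQuantifierBound(digits) {
  const value = Number(digits);
  return Number.isSafeInteger(value) ? value : digits;
}

console.log(toQuantifierBound('3'));                // 3
console.log(toQuantifierBound('9007199254740993')); // '9007199254740993'
```

The trade-off is that the field then has a mixed type, so consumers would need a `typeof` check before doing arithmetic on it.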

tjenkinson commented 3 years ago

Maybe BigInt could help here? Although it's complicated by it not being supported everywhere.

Could be something like:

DmitrySoshnikov commented 3 years ago

Yeah, we need something predictable that works in most cases. I think we can just make the from and to fields strings, since we need to support correct code generation back from the AST.
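A minimal sketch of how string bounds would round-trip, under the assumption that an open-ended range such as {2,} simply omits `to`. Both functions are hypothetical and written for illustration; they are not the regexp-tree parser or generator:

```javascript
// Hypothetical parser for a range quantifier such as "{3}", "{2,}" or "{2,5}".
// The bounds are kept as the original digit strings, so no precision is lost.
function parseRangeQuantifier(source) {
  const match = /^\{(\d+)(,(\d*))?\}$/.exec(source);
  if (!match) {
    throw new SyntaxError(`Not a range quantifier: ${source}`);
  }
  const node = {type: 'Quantifier', kind: 'Range', from: match[1], greedy: true};
  if (match[2] === undefined) {
    node.to = match[1];        // "{n}" means exactly n
  } else if (match[3] !== '') {
    node.to = match[3];        // "{n,m}" has both bounds
  }                            // "{n,}" is open-ended: no `to`
  return node;
}

// Hypothetical generator: emit the stored digits verbatim.
function generateRangeQuantifier(node) {
  return node.to === undefined
    ? `{${node.from},}`
    : `{${node.from},${node.to}}`;
}

const node = parseRangeQuantifier('{9007199254740993}');
console.log(node.from);                     // '9007199254740993'
console.log(generateRangeQuantifier(node)); // '{9007199254740993,9007199254740993}'
```

Consumers that need arithmetic on the bounds would have to convert the strings themselves, for example with `Number()` or, where available, `BigInt()`.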