no-context / moo

Optimised tokenizer/lexer generator! 🐄 Uses /y for performance. Moo.
BSD 3-Clause "New" or "Revised" License

Having "-" as a literal match causes invalid escape sequences to be generated when using the unicode flag #144

Open TheGrandmother opened 3 years ago

TheGrandmother commented 3 years ago

If I have the following token defined: ARITHMETIC: ['+', '-'], and all my regular expressions are using the u flag, moo will output an invalid regex escape sequence, as \- is apparently uncool when rolling with the u flag. Changing the token definition to: ARITHMETIC: ['+', /-/u], works but is not pretty.

moo version 0.5.1, Node version 13.14.0
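
The underlying RegExp behaviour can be checked without moo. This is only a minimal sketch: it assumes moo turns the literal '-' into the pattern source \- (as the comments in this thread suggest), and the helper name compiles is made up for illustration.

```javascript
// Hypothetical helper: returns true if `source` compiles with the given flags.
function compiles(source, flags) {
  try {
    new RegExp(source, flags);
    return true;
  } catch (err) {
    // An invalid pattern raises SyntaxError; anything else is unexpected.
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// "\\-" is the two-character pattern source \- (the assumed output for '-').
const withoutU = compiles("\\-", "");        // true: tolerated without the u flag
const withU = compiles("\\-", "u");          // false: invalid escape in Unicode mode
const inClassWithU = compiles("[\\-]", "u"); // true: \- is allowed inside []

console.log({ withoutU, withU, inClassWithU });
```

So the failure should only show up when the escaped hyphen sits outside a character class and the u flag is set, which matches the report above.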

tjvr commented 3 years ago

Ooh, that's surprising!

nathan commented 3 years ago

@tjvr simplest fix is to escape - as \x2d (which works everywhere) instead of \- (which only works inside [] in Unicode mode, even under Annex B).

It's incomprehensible why TC39 decided to make the RegExp grammar this inconsistently nitpicky (e.g., \{ works fine inside a Unicode character class even though the backslash does nothing).
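
For anyone picking this up, the \x2d suggestion could look roughly like the sketch below. It is not moo's actual escaping code: escapeLiteral is a hypothetical stand-in, and the set of characters it escapes is an assumption.

```javascript
// Hypothetical stand-in for moo's escaping helper (not the real implementation).
function escapeLiteral(text) {
  return text
    // Syntax characters and "/": a backslash escape is valid with or without u.
    .replace(/[\\^$*+?.()|[\]{}\/]/g, "\\$&")
    // Hyphen: use a hex escape, which is valid inside and outside [] in any mode.
    .replace(/-/g, "\\x2d");
}

// Build an alternation from string literals, as a lexer generator might.
const pattern = new RegExp(["+", "-"].map(escapeLiteral).join("|"), "u");

console.log(pattern.source);    // \+|\x2d
console.log(pattern.test("-")); // true
console.log(pattern.test("+")); // true
```

The hyphen is handled in a second pass so that the backslash it introduces is not escaped again by the first replace.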

tjvr commented 3 years ago

If someone wants to raise a PR which updates reEscape (or whatever the function’s called), that would be great.

fabiosantoscode commented 2 years ago

I've worked around this by simply changing '-' to /-/u, putting this out here in case it's helpful.