Open frostburn opened 6 months ago
I think the actionable part here is allowing non-BMP characters in a character class to work as expected. The current behavior is useless, so backward-compatibility with older Peggy/peg.js versions is not needed. The rule given above gets generated with:
var peg$r0 = /^[#x\u266F\uD834\uDD2A]/;
var peg$e0 = peg$classExpectation(["#", "x", "\u266F", "\uD834", "\uDD2A"], false, false);
Both of these are wrong. This should generate:
var peg$r0 = /^[#x\u266F\uD834\uDD2A]/u; // JS backward-compat issue!
var peg$e0 = peg$classExpectation(["#", "x", "\u266F", "\uD834\uDD2A"], false, false);
or:
var peg$r0 = /^[#x\u266F\u{1d12a}]/u; // JS backward-compat issue!
var peg$e0 = peg$classExpectation(["#", "x", "\u266F", "\u{1d12a}"], false, false);
(see: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/unicode)
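To make the difference concrete, here is a minimal sketch comparing the currently generated pattern with the proposed /u variant. The variable names are illustrative only and are not what Peggy emits.

```javascript
// Character class from this issue: '#', 'x', U+266F, and U+1D12A.
// Variable names are illustrative, not Peggy's generated identifiers.
const withoutU = /^[#x\u266F\uD834\uDD2A]/;
const withU = /^[#x\u266F\u{1D12A}]/u;

// U+1D12A is stored as the surrogate pair \uD834\uDD2A.
const doubleSharp = "\u{1D12A}";

// Without /u the class matches one UTF-16 code unit at a time,
// so only the high surrogate is consumed.
const legacyMatch = withoutU.exec(doubleSharp);
console.log(legacyMatch[0].length); // 1

// With /u the class matches whole code points.
const unicodeMatch = withU.exec(doubleSharp);
console.log(unicodeMatch[0].length); // 2
console.log(unicodeMatch[0] === doubleSharp); // true

// A lone surrogate is no longer accepted by the /u pattern.
console.log(withU.test("\uD834")); // false
```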
Alternately, this could be automatically turned into the following rule:
SharpAccidental = [#x♯] / "𝄪"
If we are dropping IE11 support in 4.0, the first solution is probably better, but we should only put a /u at the end of regular expressions that need it; there may be performance changes at the least.
I'm not sure that a conditional /u won't lead to unexpected behavior. IMO it should be a generator option, and as such it can already be supported. I'd rather change to /u by default in 4.0.
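For reference, the conditional approach could be sketched roughly as follows. This is only an illustration under assumptions: needsUnicodeFlag and regexpFlagsFor are hypothetical helpers, not existing Peggy functions, and the sketch assumes class parts arrive as single-character strings or [start, end] pairs.

```javascript
// Hypothetical helper, not part of Peggy: returns true when any character
// in the class lies outside the BMP and therefore needs the /u flag.
// Assumes each part is either a string or a [start, end] range pair.
function needsUnicodeFlag(parts) {
  return parts.some((part) => {
    const items = Array.isArray(part) ? part : [part];
    return items.some((ch) => ch.codePointAt(0) > 0xFFFF);
  });
}

// Hypothetical helper: picks the flags for a generated character class.
// forceUnicode stands in for a generator option that always adds /u.
function regexpFlagsFor(parts, forceUnicode = false) {
  return forceUnicode || needsUnicodeFlag(parts) ? "u" : "";
}

console.log(regexpFlagsFor(["#", "x", "\u266F"])); // ""
console.log(regexpFlagsFor(["#", "x", "\u266F", "\u{1D12A}"])); // "u"
console.log(regexpFlagsFor([["a", "z"]], true)); // "u"
```

Whether mixing flagged and unflagged patterns in one parser is safe, and what it costs at run time, is exactly what would still need to be checked.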
Let's talk about browser support in #463. Let's also do some benchmarking before making a final decision on the implementation approach.
The way unicode codepoints are chunked in Peggy patterns doesn't follow visual chunking.
Example grammar:
sharp.peggy
Unexpected result:
Reasonable behavior in node using spread syntax:
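A minimal sketch of the kind of chunking meant here, assuming a string made of the four characters from the character class discussed in this thread (this is not the original snippet from the report):

```javascript
// '#', 'x', U+266F, and U+1D12A, written with escapes so the file
// encoding does not matter.
const sharps = "#x\u266F\u{1D12A}";

// Splitting on the empty string works on UTF-16 code units,
// so the astral character is cut into two surrogates.
const codeUnits = sharps.split("");
console.log(codeUnits.length); // 5

// Spread syntax uses the string iterator, which yields whole code points.
const codePoints = [...sharps];
console.log(codePoints.length); // 4
console.log(codePoints[3] === "\u{1D12A}"); // true
```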