mysticatea / regexpp

The regular expression parser for ECMAScript.

Missing support for js-supported character class #25

Closed: mattbishop closed this issue 2 years ago

mattbishop commented 3 years ago

I want to parse this regex:

/\p{ID_Start}\p{ID_Continue}+/u

Node 16 accepts it:

const idRegex = /^\p{ID_Start}\p{ID_Continue}+$/u
console.log(idRegex.test("anIdentifier"))
// > true
console.log(idRegex.test("not an Identifier"))
// > false

regexpp 3.2.0 does not:


regexpp.validateRegExpLiteral(/^\p{ID_Start}\p{ID_Continue}+$/u)
// > RangeError: Invalid code point -1
at Function.fromCodePoint in ECMAScript
at RegExpValidator.validateLiteral in regexpp/index.js — line 411
at Object.validateRegExpLiteral in regexpp/index.js — line 2084

These two character classes are important for programming language parsing: https://unicode.org/reports/tr31/

ota-meshi commented 3 years ago

Hi @mattbishop. validateRegExpLiteral accepts a string. I think you may need to do the following:

regexpp.validateRegExpLiteral(/^\p{ID_Start}\p{ID_Continue}+$/u.toString())
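
The `.toString()` call matters because a RegExp object is not itself a string, while its string form is the literal text, slashes and flags included. The following is a minimal sketch using only built-in JavaScript (no regexpp calls; the variable name `literalText` is just for illustration):

```javascript
const idRegex = /^\p{ID_Start}\p{ID_Continue}+$/u

// A RegExp object is not a string...
console.log(typeof idRegex)        // "object"

// ...but toString() rebuilds the literal text from its source and flags.
const literalText = idRegex.toString()
console.log(typeof literalText)    // "string"
console.log(literalText === `/${idRegex.source}/${idRegex.flags}`)  // true
```

A string of this shape is what the suggestion above passes to validateRegExpLiteral.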

conartist6 commented 2 years ago

@mattbishop is this issue resolved?

mattbishop commented 2 years ago

I don't think so. No changes have been made to the code in 10 months.

conartist6 commented 2 years ago

@mattbishop I am able to confirm what ota-meshi said (that there is no bug). See: https://runkit.com/conartist6/624da94f206e790009a8a78b

RunDevelopment commented 2 years ago

As documented here (and here), validateRegExpLiteral takes a string as its first argument, not a regex.

mattbishop commented 2 years ago

Ah, I see now. I thought it could take a RegExp like parseRegExpLiteral. I will close this issue, but it seems odd that parse takes both a string and a RegExp while validate takes only a string.

Thanks for the clarification!
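
For anyone who wants validation to accept both input types, one possible workaround is a thin wrapper that converts a RegExp to its literal text before validating. This is only a sketch under assumptions: `validateLiteralFlexible` is a hypothetical name, not part of regexpp, and the validator is passed in as a parameter (in real use it would presumably be `regexpp.validateRegExpLiteral`) so that the snippet runs without any dependency.

```javascript
// Hypothetical wrapper, not part of regexpp. `validate` is any function that
// expects the literal as a string, such as regexpp.validateRegExpLiteral.
function validateLiteralFlexible(validate, source) {
    const text = source instanceof RegExp ? source.toString() : source
    if (typeof text !== "string") {
        throw new TypeError("source must be a string or a RegExp")
    }
    return validate(text)
}

// Stand-in validator for demonstration only: it records what it received.
const received = []
const fakeValidate = (text) => { received.push(text) }

validateLiteralFlexible(fakeValidate, /^\p{ID_Start}\p{ID_Continue}+$/u)
validateLiteralFlexible(fakeValidate, "/[a-z]+/i")
console.log(received.every((text) => typeof text === "string"))  // true
```

Both calls hand the validator a string, so the caller no longer has to remember which form each function accepts.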