jviereck / regjsparser

Parsing the JavaScript's RegExp in JavaScript.
http://www.julianviereck.de/regjsparser/
BSD 2-Clause "Simplified" License
79 stars 20 forks source link

Add `unicodeSet` option for the RegExp set notation proposal #113

Closed nicolo-ribaudo closed 3 years ago

nicolo-ribaudo commented 3 years ago

https://github.com/tc39/proposal-regexp-set-notation

I implemented support for the proposed v flag, which can only be used when the unicodeSet: true option is enabled. I updated the AST in a bakward-compatible way: characterClass nodes now have a kind property that can be "union" (default), "intersection" or "subtraction".

If you have ideas for new tests please let me know!

After that this PR is merged, I will add support in regjsgen and regexpu-core.

cc @mathiasbynens @bnjmnt4n @jviereck (no hurry for reviews, since starting from Wednesday I'm on vacation for two weeks)

jviereck commented 3 years ago

Thanks for your PR! I am on holidays this week and will look at this next week.

jviereck commented 3 years ago

This looks very good. I took a closer look at the PR and there were zero review comments needed (!). Well done @nicolo-ribaudo .

@nicolo-ribaudo do you have any second thoughts in the meantime about this PR or do you want me to merge and create a new release?

nicolo-ribaudo commented 3 years ago

Awesome, thanks!

During the last TC39 meeting (last week) there has been some discussion about this proposal. From what I understood it was only about semantics and not about the syntax, but I'd like to wait for @mathiasbynens to confirm it.

mathiasbynens commented 3 years ago

Awesome, thanks!

During the last TC39 meeting (last week) there has been some discussion about this proposal. From what I understood it was only about semantics and not about the syntax, but I'd like to wait for @mathiasbynens to confirm it.

Confirmed. The main syntax hasn't changed. It might of course still change on the way to Stage 3 (as with all proposals) but high-level changes are unlikely at this point. I would love to see this land!

nicolo-ribaudo commented 3 years ago

This PR considers the u and v flags as independent. Is it still correct?

If the answer is yes, then this is good to land and I'll update the regjsgen PR :+1:

mathiasbynens commented 3 years ago

This PR considers the u and v flags as independent. Is it still correct?

They are independent, indeed. The only "gotcha" is that they cannot be combined, i.e. /./uv throws. Background: https://github.com/tc39/proposal-regexp-set-notation/issues/23#issuecomment-912368611

nicolo-ribaudo commented 3 years ago

Ok, I have to add that error 👍

nicolo-ribaudo commented 3 years ago

@jviereck If the last commit looks good this can be merged.

jviereck commented 3 years ago

I released v0.7.0 including this PR, which was published to npm.

nicolo-ribaudo commented 2 years ago

@jviereck I didn't notice that the proposal was updated, and now uses \q{ab|cd} for string alternatives rather than (ab|cd). I can open a PR to update it tomorrow (or probably on Tuesday), if you want to wait to release the next version until this is fixed.

jviereck commented 2 years ago

Thanks for the heads-up @nicolo-ribaudo . I will wait for your change before drafting a new release then.