mysticatea / regexpp

The regular expression parser for ECMAScript.
MIT License

Visitor: improve usability #20

Open conartist6 opened 3 years ago

conartist6 commented 3 years ago

The current visitor has some problems. The biggest is that there's no way to visit every expression, where an expression might be the content of a group, a lookahead, or a lookbehind. All expressions have common needs: in particular, the alternatives need to be evaluated to know whether the expression matches, after which different actions are appropriate depending on the specific type. The worst offender is the assertion type, because it may or may not be an expression at all: it is one only when its kind is lookahead or lookbehind. This may be fine for a backtracking engine, since it can evaluate the assertion inside its own block scope, but it makes life fabulously difficult for non-backtracking engines, which must keep tracking the states until the input reveals whether the subexpression matches or fails.
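
To make the problem concrete, here is a minimal sketch of what "visit every expression" could look like. The node types below are simplified, hypothetical stand-ins loosely modeled on the shape of regexpp's AST (they are not the library's actual exports), and `isExpression` and `countExpressions` are names invented for this illustration. The point is only that an `Assertion` needs a second check on `kind` before it can be treated like the other nodes that carry alternatives.

```typescript
// Simplified, hypothetical node shapes (assumption: loosely modeled on regexpp's AST).
type CharacterNode = { type: "Character"; value: number };
type AlternativeNode = { type: "Alternative"; elements: ElementNode[] };
type PatternNode = { type: "Pattern"; alternatives: AlternativeNode[] };
type GroupNode = { type: "Group"; alternatives: AlternativeNode[] };
type CapturingGroupNode = {
  type: "CapturingGroup";
  name: string | null;
  alternatives: AlternativeNode[];
};
// Boundary assertions (^, $, \b) have no alternatives.
type BoundaryAssertionNode = { type: "Assertion"; kind: "start" | "end" | "word" };
// Lookaround assertions share the "Assertion" type but do have alternatives.
type LookaroundAssertionNode = {
  type: "Assertion";
  kind: "lookahead" | "lookbehind";
  negate: boolean;
  alternatives: AlternativeNode[];
};
type ElementNode =
  | CharacterNode
  | GroupNode
  | CapturingGroupNode
  | BoundaryAssertionNode
  | LookaroundAssertionNode;
type RegExpNode = PatternNode | AlternativeNode | ElementNode;

// An "expression" here means any node that owns a list of alternatives.
type ExpressionNode =
  | PatternNode
  | GroupNode
  | CapturingGroupNode
  | LookaroundAssertionNode;

function isExpression(node: RegExpNode): node is ExpressionNode {
  switch (node.type) {
    case "Pattern":
    case "Group":
    case "CapturingGroup":
      return true;
    case "Assertion":
      // The node type alone is not enough: only lookarounds are expressions.
      return node.kind === "lookahead" || node.kind === "lookbehind";
    default:
      return false;
  }
}

// Counts every expression in the tree, including the root if it is one.
function countExpressions(node: RegExpNode): number {
  if (isExpression(node)) {
    let count = 1;
    for (const alt of node.alternatives) count += countExpressions(alt);
    return count;
  }
  if (node.type === "Alternative") {
    let count = 0;
    for (const el of node.elements) count += countExpressions(el);
    return count;
  }
  return 0; // characters and boundary assertions contain no expressions
}
```

With per-type enter/leave callbacks alone, the equivalent of `isExpression` has to be repeated in several handlers, and the assertion handler has to branch on `kind` before it knows which situation it is in.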

conartist6 commented 3 years ago

OK I was able to get much better results by writing my own visitor. I tried a couple variations, but what ended up being best was going all the way back to the simplest basics where a visitor for a type is responsible for propagating the visit forwards itself. This had huge advantages:

I do pay a bit of a cost of course as I have to maintain the code that propagates the traversal forwards, but having the return values ensures it's pretty obvious if something is missing or incorrect.

Here is my traversal code.
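
To illustrate the idea independently of that code, the following is a minimal sketch (not the author's actual implementation) of a visitor in which each handler propagates the traversal itself and returns a value. The node types are simplified, hypothetical stand-ins loosely modeled on regexpp's AST, and `Visitors`, `traverse`, and `minLength` are names made up for this example. As a demonstration it computes the minimum number of characters a pattern can consume.

```typescript
// Simplified, hypothetical node shapes (assumption: loosely modeled on regexpp's AST).
type CharacterNode = { type: "Character"; value: number };
type AlternativeNode = { type: "Alternative"; elements: ElementNode[] };
type PatternNode = { type: "Pattern"; alternatives: AlternativeNode[] };
type GroupNode = { type: "Group"; alternatives: AlternativeNode[] };
type AssertionNode =
  | { type: "Assertion"; kind: "start" | "end" | "word" }
  | {
      type: "Assertion";
      kind: "lookahead" | "lookbehind";
      negate: boolean;
      alternatives: AlternativeNode[];
    };
type ElementNode = CharacterNode | GroupNode | AssertionNode;
type RegExpNode = PatternNode | AlternativeNode | ElementNode;

type Visit<R> = (node: RegExpNode) => R;

// One handler per node type. Each handler receives `visit` and decides for
// itself whether, when, and in what order to descend into its children.
type Visitors<R> = {
  [K in RegExpNode["type"]]: (
    node: Extract<RegExpNode, { type: K }>,
    visit: Visit<R>,
  ) => R;
};

function traverse<R>(root: RegExpNode, visitors: Visitors<R>): R {
  const visit: Visit<R> = (node) => {
    // The lookup is keyed by node.type, so the handler matches the node;
    // the cast only tells the compiler what the key already guarantees.
    const handler = visitors[node.type] as (n: RegExpNode, v: Visit<R>) => R;
    return handler(node, visit);
  };
  return visit(root);
}

// Example: minimum number of characters the pattern can consume.
const minLength: Visitors<number> = {
  Pattern: (node, visit) => Math.min(...node.alternatives.map((a) => visit(a))),
  Group: (node, visit) => Math.min(...node.alternatives.map((a) => visit(a))),
  Alternative: (node, visit) =>
    node.elements.reduce((sum, el) => sum + visit(el), 0),
  // Assertions are zero-width, so this handler chooses not to descend at all.
  Assertion: () => 0,
  Character: () => 1,
};

// Hand-built AST for /ab|c/ (character codes 97, 98, 99).
const example: PatternNode = {
  type: "Pattern",
  alternatives: [
    {
      type: "Alternative",
      elements: [
        { type: "Character", value: 97 },
        { type: "Character", value: 98 },
      ],
    },
    { type: "Alternative", elements: [{ type: "Character", value: 99 }] },
  ],
};

const shortest = traverse(example, minLength); // 1, via the "c" alternative
```

Because `Visitors<R>` is a mapped type over every node type and every handler must return an `R`, leaving out a handler or forgetting to combine a child's result tends to surface as a compile error rather than as a silently skipped subtree.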

MichaelDeBoey commented 1 year ago

Hi @conartist6!

Since this repo is unmaintained, you might want to re-open this issue in the @eslint-community fork https://github.com/eslint-community/regexpp

For more info about why we created this organization, you can read https://eslint.org/blog/2023/03/announcing-eslint-community-org

conartist6 commented 1 year ago

@MichaelDeBoey Cool, I should probably upgrade my projects to use the one that you're maintaining then!

MichaelDeBoey commented 1 year ago

@conartist6 It's indeed advised to update your projects to use the community fork instead of using the unmaintained original project

conartist6 commented 1 year ago

@MichaelDeBoey Ah thanks! I actually went the other way and wrote my own regex parser, uh, and also my own AST data structure and my own parser framework

MichaelDeBoey commented 1 year ago

@conartist6 That's another possibility of course 🙈