GerHobbelt / jison

bison / YACC / LEX in JavaScript (LALR(1), SLR(1), etc. lexer/parser generator)
https://gerhobbelt.github.io/jison/
MIT License

Now that we use recast->esprima, we can simplify the lexer grammar for regexes and action blocks #18

Open GerHobbelt opened 6 years ago

GerHobbelt commented 6 years ago

Re-use the (customized) esprima parser that ships with recast to parse lexer regexes as well as lexer and parser action blocks: the esprima scanner can handle those parts for us!

Suggestion: use a simple lexer regex to match the start of the regex or action code block input, then let the esprima scanner consume as much input as necessary to produce the appropriate jison-lex / jison grammar token.
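
A minimal sketch of that two-step idea, with assumptions stated up front: `consumeActionBlock` is a hypothetical name, not jison API, and a tiny hand-rolled scanner stands in for the esprima scanner so the snippet runs without dependencies. The caller is assumed to have already matched the opening `{` with a simple lexer rule.

```javascript
// Hypothetical helper, not part of jison: `start` must point at the '{'
// that a simple lexer rule has already matched.
// A hand-written scanner stands in for the esprima scanner here.
function consumeActionBlock(input, start) {
  if (input[start] !== '{') {
    throw new Error('action block must start with "{"');
  }
  let depth = 0;
  let i = start;
  while (i < input.length) {
    const ch = input[i];
    const next = input[i + 1];
    if (ch === '"' || ch === "'" || ch === '`') {
      // Skip a string literal, honoring backslash escapes.
      // (Template literal ${...} nesting is NOT handled in this sketch.)
      i++;
      while (i < input.length && input[i] !== ch) {
        if (input[i] === '\\') i++;
        i++;
      }
      i++;
    } else if (ch === '/' && next === '/') {
      // Skip a line comment up to the end of the line.
      while (i < input.length && input[i] !== '\n') i++;
    } else if (ch === '/' && next === '*') {
      // Skip a block comment.
      const end = input.indexOf('*/', i + 2);
      if (end < 0) throw new Error('unterminated block comment');
      i = end + 2;
    } else {
      if (ch === '{') depth++;
      if (ch === '}') {
        depth--;
        if (depth === 0) {
          // Everything from the opening brace to its match is one token.
          return { type: 'ACTION', text: input.slice(start, i + 1), end: i + 1 };
        }
      }
      i++;
    }
  }
  throw new Error('unterminated action block');
}

const src = 'expr : NUM { $$ = "}" + $1; /* } */ } ;';
const tok = consumeActionBlock(src, src.indexOf('{'));
console.log(tok.text);
```

The stand-in scanner gets braces inside strings and comments right, but it would trip over a regex literal such as `/}/` or over template literal substitutions. Cases like those are the reason to delegate this step to a real JavaScript scanner.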

GerHobbelt commented 6 years ago

Tangentially related: #22 -- both these issues are about using the full power of recast+esprima in our code generator(s).

GerHobbelt commented 6 years ago

Also related: #17 -- using esprima+recast means we restrict ourselves to a more tightly controlled set of accepted action code: old jison didn't care all that much as long as you merely pooped out generated source code (it was, and is, quite another matter when you produce parsers live, though!)

GerHobbelt commented 6 years ago

Progress: ditch esprima. That was a bad choice, in that esprima doesn't track modern JS development as well as babel & friends. Consequently the choice for recast is dubious too: it turns out babel/parse produces an ever so slightly different AST (some node types differ from esprima's and b0rk recast), while the recast intention could be mimicked using a babel transform plugin, iff you forego recast's print feature that copies unaltered content as exact source.

Not 100% sure if I should ditch recast as well... pretty sure it's the Sunk Cost Fallacy at work in my brain, though. :grin:
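
To illustrate the kind of node type mismatch meant above: Babel's AST uses `StringLiteral`, `NumericLiteral`, `BooleanLiteral` and `NullLiteral` where ESTree-style parsers such as esprima emit a single `Literal` node. The sketch below is only an illustration under that assumption. `toEstreeLiteral` is a hypothetical helper, and the thread does not say which node types actually caused the breakage.

```javascript
// Babel-style literal node types that ESTree represents as plain `Literal`.
const BABEL_LITERALS = new Set([
  'StringLiteral',
  'NumericLiteral',
  'BooleanLiteral',
  'NullLiteral',
]);

// Hypothetical helper (not recast or babel API): returns an ESTree-style
// `Literal` for Babel-style literal nodes and leaves other nodes untouched.
function toEstreeLiteral(node) {
  if (!BABEL_LITERALS.has(node.type)) return node;
  // Babel's NullLiteral carries no `value` field; ESTree uses `value: null`.
  const value = node.type === 'NullLiteral' ? null : node.value;
  return { ...node, type: 'Literal', value };
}

console.log(toEstreeLiteral({ type: 'StringLiteral', value: 'abc' }));
```

A shim like this only papers over one family of nodes. Other differences (object properties, class methods, regex literals) would each need their own handling.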


Key goal is to enable jison to check action code as early and as thoroughly as possible, so that action code mistakes (typos, variable references, etc.) are reported soon and pinpointed precisely in the jison user's grammar source files. On top of that comes the ability to accept and produce modern JavaScript in any part of your grammar spec's code.
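
Relating a mistake back to the grammar file is mostly position arithmetic. A rough sketch, assuming the parser reports error positions relative to the action block's own text (1-based lines, 0-based columns) and that jison knows where each block starts in the grammar file. `toGrammarLocation` is a made-up name, not an existing jison function.

```javascript
// Hypothetical helper (not jison API). Lines are 1-based, columns 0-based.
// blockStart: position of the action block's first character in the grammar file.
// errLoc: error position relative to the action block's own text.
function toGrammarLocation(blockStart, errLoc) {
  return {
    line: blockStart.line + errLoc.line - 1,
    // Only the block's first line shares a source line with preceding
    // grammar text, so only there does the column need shifting.
    column: errLoc.line === 1
      ? blockStart.column + errLoc.column
      : errLoc.column,
  };
}

// An action block starting at line 10, column 12 of the grammar file,
// with an error on the block's third line at column 4:
console.log(toGrammarLocation({ line: 10, column: 12 }, { line: 3, column: 4 }));
```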

Since the generally accepted process for this is compiling your code through babel, jison should do the same, as babel closely tracks modern JS development and provides full backward compatibility in the generated output (babel-transform).

Recast may be nicer in that it has a well documented and stable API (babel explicitly states they keep the plugin API spec adaptable to enable swift and clean compiler development 👍 + 💘?), but swapping esprima for babel/parse triggered several nasty b0rks in my codebase, so I am on the fence about going with babel+recast or babel+babel-custom-plugin to make it happen, where 'it' is transforming all action blocks + the jison grammar spec into one generated JS file that's ready to run.