kach / nearley

📜🔜🌲 Simple, fast, powerful parser toolkit for JavaScript.
https://nearley.js.org
MIT License

Add option to auto-unwrap rule results #505

Open raphinesse opened 4 years ago

raphinesse commented 4 years ago

Summary

This PR adds an option @autoUnwrap. If enabled, rules that consist of a single symbol will have their results unwrapped automatically (like adding {% id %} to them).

Update: In the meantime, I also implemented this feature as a stand-alone package: nearley-auto-unwrap.

Motivation

@autoUnwrap true

example -> FOO ( BAR | BAZ )
FOO -> "foo"i
BAR -> "bar"i
BAZ -> "baz"i

Before this PR, parsing the string foobar with the above grammar would yield the following result:

[ [ 'foo' ], [ [ 'bar' ] ] ]

With this PR, the result will be

[ 'foo', 'bar' ]

IMHO, this makes postprocessing much more intuitive and, in some cases, removes the need for it entirely.
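To make the difference in result shape concrete, here is a minimal stand-alone sketch. It does not use nearley itself: `evaluate` and the hand-built `tree` are hypothetical stand-ins for the parser, and each case-insensitive literal is treated as a single matched string for simplicity.

```javascript
// Each node is either a matched string or { children: [...] } for a rule.
// Without a postprocessor, a rule's result is the array of its children's results.
function evaluate(node, autoUnwrap) {
  if (typeof node === 'string') return node;
  const data = node.children.map((child) => evaluate(child, autoUnwrap));
  // With auto-unwrapping, a rule made of a single symbol yields that symbol's result.
  return autoUnwrap && data.length === 1 ? data[0] : data;
}

// Hand-built parse tree for "foobar": example -> FOO ( BAR | BAZ )
const tree = {
  children: [
    { children: ['foo'] },                 // FOO -> "foo"i
    { children: [{ children: ['bar'] }] }, // ( BAR | BAZ ) -> BAR -> "bar"i
  ],
};

console.log(JSON.stringify(evaluate(tree, false))); // [["foo"],[["bar"]]]
console.log(JSON.stringify(evaluate(tree, true)));  // ["foo","bar"]
```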

@autoUnwrap true also fixes #498. It generally improves macro usability and makes nested macros feasible:

@autoUnwrap true

id[x]   -> $x
main    -> id[id["foobar"]]

The result of parsing foobar with the above grammar is "foobar", while with @autoUnwrap false the result is [[[[[ "foobar" ]]]]] (that is, a nesting depth of 5).
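The depth can be checked by hand, since each level of nesting corresponds to one missing id call. A small sketch, assuming `id` behaves like nearley's built-in `d => d[0]`; the nested array is simply the result quoted above:

```javascript
// Stand-in for nearley's built-in id postprocessor.
const id = (d) => d[0];

// Result quoted above for @autoUnwrap false.
const withoutUnwrap = [[[[['foobar']]]]];

// Apply id until no array is left, counting the applications.
let value = withoutUnwrap;
let depth = 0;
while (Array.isArray(value)) {
  value = id(value);
  depth += 1;
}

console.log(depth, value); // 5 foobar
```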

Implementation

The implementation is pretty straightforward: during compilation, if @autoUnwrap is true, the postprocess property of every rule that consists of a single symbol is modified so that the original postprocessing function is called with data[0] instead of data. Thus, there is no need for additional rule properties, and there is no runtime overhead compared to manually calling id.

In more detail, if an affected rule had the postprocess property fn, we will replace it with id._auto(fn) where

id._auto = fn => {
  return fn
    ? (d, l, r) => fn(id(d), l, r)
    : id;
};

I have decided to attach this helper function to id to avoid any naming collisions with custom postprocessors of existing grammars. The name also makes some sense, since it is an automatic application of the id function.
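The compile-time step described above could look roughly like the following sketch. `applyAutoUnwrap` is a hypothetical name, and the rule objects only mimic the `{ name, symbols, postprocess }` shape of compiled nearley rules, with plain strings standing in for symbols. The actual compiler emits generated code instead of transforming objects like this.

```javascript
// Stand-in for nearley's built-in id postprocessor.
const id = (d) => d[0];

// Helper from the PR description: wrap an existing postprocessor, or fall back to id.
id._auto = (fn) => (fn ? (d, l, r) => fn(id(d), l, r) : id);

// Hypothetical pass: only rules with exactly one symbol are touched.
function applyAutoUnwrap(rules) {
  return rules.map((rule) =>
    rule.symbols.length === 1
      ? { ...rule, postprocess: id._auto(rule.postprocess) }
      : rule
  );
}

const rules = applyAutoUnwrap([
  { name: 'FOO', symbols: ['foo'] },
  { name: 'BAR', symbols: ['bar'], postprocess: (value) => value.toUpperCase() },
  { name: 'pair', symbols: ['FOO', 'BAR'] },
]);

console.log(rules[0].postprocess(['foo']));               // foo
console.log(rules[1].postprocess(['bar'], 0, undefined)); // BAR
console.log(rules[2].postprocess);                        // undefined
```

Note that the custom postprocessor of BAR receives the unwrapped string rather than a one-element array, and the two-symbol rule is left untouched.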

TODOs

There are still a few things left to do. I will address those if this feature is wanted by the maintainers:


I hope you think of this as a useful addition. I really liked writing a parser using nearley and this feature would resolve the only thing that bothered me when using it.

raphinesse commented 4 years ago

It seems the CI is failing because the docs have not been built after the release of v2.19.1.