Engelberg / instaparse

Eclipse Public License 1.0

[FR] tree → string emitter #140

Open arrdem opened 8 years ago

arrdem commented 8 years ago

I know there's a ticket open for generating strings that conform to a grammar, but it'd be awesome if there were a way to take a grammar and a tree (one parsed with that grammar and then transformed so that it is still legal within it) and generate a (the?) string that would parse to the same tree.

This would make it super easy to write source-to-source transformation tools: you parse a tree in, manipulate it, emit another tree in the same language, and get the re-encoding of the new expression tree "for free". Obviously, for a whitespace-sensitive grammar, or for a rewriting tool that cares about newlines/whitespace, the parse tree and the manipulation code would have to represent those features explicitly, but for simple expression-oriented languages I'd think this is straightforward.

Engelberg commented 8 years ago

The leaves of the parse tree are strings, so can't you just do a depth-first traversal of the tree and concatenate the strings at the leaves together in order to recover the original string?
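A minimal sketch of that depth-first idea, written in Python rather than Clojure purely for illustration. It assumes the tree has a hiccup-like shape, modeled here as nested lists whose first element is the node tag and whose remaining elements are child nodes or leaf strings. The `emit` name and the sample tree are hypothetical and not part of instaparse.

```python
def emit(tree):
    """Rebuild a string by concatenating leaf strings in depth-first order.

    Assumes a hiccup-like tree: [tag, child, child, ...], where each child
    is either a nested node of the same shape or a leaf string.
    """
    if isinstance(tree, str):
        # A leaf: contributes its text directly.
        return tree
    # tree[0] is the node tag, which is not part of the output text;
    # everything after it is a child, visited left to right.
    return "".join(emit(child) for child in tree[1:])


# Hypothetical tree, shaped like the result of parsing "aabb".
tree = ["S", ["AB", ["A", "a", "a"], ["B", "b", "b"]]]
print(emit(tree))
```

This only recovers the full input when every character of it survives as a leaf. Anything the grammar hides from the tree (instaparse lets you hide content with angle brackets) will be missing from the emitted string.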

I'm pretty sure I've seen someone use this strategy before with instaparse, but I can't quite remember where I've seen it.

mnemion did some related work on this shortly after instaparse's release to do further parse transformations on the tree, and he called his pull request "re-parse". I like the idea but haven't (yet?) integrated it into instaparse.

Engelberg commented 8 years ago

This is the pull request I referred to: https://github.com/Engelberg/instaparse/pull/45