inhabitedtype / angstrom

Parser combinators built for speed and memory efficiency
Other
625 stars 73 forks source link

Support for mutually recursive parsers #213

Open aryx opened 3 years ago

aryx commented 3 years ago

I see that for recursive parser, like when parsing JSON, you provide the fix combinator. What about mutually recursive parsers, for example if I want to parse Java, there will be a parser for stmt, one for expr, but inside stmt you need to call expr and inside expr you also need to call stmt (for example Java support lambdas expressions, which themselves contain stmt). Parsing a real programming language requires a ton of mutually recursive parsers.

The fastparse parser combinator library https://com-lihaoyi.github.io/fastparse/#GettingStarted allows for example to define those mutually recursive parsers.

BTW fastparse has also special support to skip whitespace by using Implicit. Is there something like that planned for angstrom too? See https://com-lihaoyi.github.io/fastparse/#ExampleParsers and the NoWhitespace special import.

Hakimba commented 2 years ago

Hi, i have the same problem, its possible to have a full example of a toy programming language parsing with angstrom ?

thedufer commented 2 years ago

The fix combinator is convenient, but not actually necessary. You can see in my comment here how you could implement a recursive parser without fix. You can do the same thing to handle mutual recursion. You can also do it a bit more cleanly by using a fix for each parser that needs to refer to itself, taking arguments as stubs for parsers that haven't been defined yet, and then stitching it together at the end (see here).

I think the idea of skipping whitespace doesn't actually make sense in angstrom in the way that you mean. angstrom is, as far as I can tell, a lot lower-level than fastparse, which makes conveniences like that not really possible, but also means it can be a lot faster, and angstrom is targeted primarily at very perf-sensitive applications.