pest-parser / pest

The Elegant Parser
https://pest.rs
Apache License 2.0

RFC: Implement grammar with macros 1.1. #83

Closed by dragostis 7 years ago

dragostis commented 7 years ago

There are currently two important issues that hinder the development experience when using pest. The first is that macro expansions (especially the recursive definitions within the grammar! macro) do not give descriptive errors. This is a limitation of the macro system itself and cannot be fixed while the grammar is implemented with it.

The second issue is the non-linear growth in compile time with respect to the size of the grammar. This is partly because of the slow expansion of the recursive macros, but also because of the bloated code they generate, which takes quite some time to compile. A ~700-line grammar compiles in ~30s on a newish system while generating over 20K lines of Rust code (your mileage may vary a lot).
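To make both issues concrete, here is a minimal sketch of a recursive macro_rules! definition. It is not pest's grammar! macro; count_rules! and rule_count are hypothetical names invented for illustration. Each expansion step peels off one token and re-invokes the macro on the rest, so the compiler re-matches the remaining input at every level, and a mismatch is reported from inside the innermost expansion rather than in terms of the user's grammar.

```rust
// Hypothetical recursive macro (NOT pest's grammar!): counts identifiers
// by peeling one off the front and recursing on the remainder.
macro_rules! count_rules {
    // Base case: nothing left to count.
    () => { 0usize };
    // Recursive case: one identifier plus whatever the tail expands to.
    ($head:ident $($tail:ident)*) => {
        1usize + count_rules!($($tail)*)
    };
}

// Hypothetical helper so the expansion is used in ordinary code.
fn rule_count() -> usize {
    // Expands to 1 + (1 + (1 + 0)): one nested invocation per identifier.
    count_rules!(expression term factor)
}
```

Passing something other than an identifier, such as count_rules!(expression 1), fails with a generic "no rules expected the token" style error, which says little about what the user got wrong.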

The proposed solution would be to take advantage of the current implementation of macros 1.1, which permits an extra attribute line to be added before the struct. This solution was documented by @tomaka and looks something like this:

#[derive(Parser)]
#[grammar("ruby.pest")]
struct MyParser;

The procedural macro would then open the grammar file, parse it and generate the code for the parser. The parsing would be done with a very simple manually-written recursive descent parser and a lexer.
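As a rough illustration of what "a very simple manually-written recursive descent parser and a lexer" could look like, here is a minimal sketch. The mini-notation it accepts (name = { "literal" ~ other | "literal" }) and every type and function name below are assumptions made for this example, not the grammar format or code the procedural macro actually uses.

```rust
// Tokens of a hypothetical mini-notation: name = { "a" ~ other | "b" }
#[derive(Debug, Clone, PartialEq)]
enum Token { Ident(String), Str(String), Eq, LBrace, RBrace, Tilde, Pipe }

#[derive(Debug, PartialEq)]
enum Expr {
    Str(String),
    Ident(String),
    Seq(Box<Expr>, Box<Expr>),
    Choice(Box<Expr>, Box<Expr>),
}

#[derive(Debug, PartialEq)]
struct Rule { name: String, body: Expr }

// Lexer: turns the input text into a flat list of tokens.
fn lex(input: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = input.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() { chars.next(); continue; }
        if c.is_alphabetic() || c == '_' {
            let mut name = String::new();
            while let Some(&ch) = chars.peek() {
                if ch.is_alphanumeric() || ch == '_' { name.push(ch); chars.next(); } else { break; }
            }
            tokens.push(Token::Ident(name));
            continue;
        }
        chars.next(); // consume the single character inspected below
        tokens.push(match c {
            '=' => Token::Eq,
            '{' => Token::LBrace,
            '}' => Token::RBrace,
            '~' => Token::Tilde,
            '|' => Token::Pipe,
            '"' => {
                let mut s = String::new();
                loop {
                    match chars.next() {
                        Some('"') => break,
                        Some(ch) => s.push(ch),
                        None => return Err("unterminated string literal".to_string()),
                    }
                }
                Token::Str(s)
            }
            other => return Err(format!("unexpected character {:?}", other)),
        });
    }
    Ok(tokens)
}

// Recursive descent parser: one method per grammar production.
struct Parser { tokens: Vec<Token>, pos: usize }

impl Parser {
    fn peek(&self) -> Option<&Token> { self.tokens.get(self.pos) }
    fn bump(&mut self) -> Option<Token> {
        let token = self.tokens.get(self.pos).cloned();
        self.pos += 1;
        token
    }
    fn expect(&mut self, want: Token) -> Result<(), String> {
        let got = self.bump();
        if got.as_ref() == Some(&want) { Ok(()) } else { Err(format!("expected {:?}, found {:?}", want, got)) }
    }
    // rule := ident '=' '{' choice '}'
    fn rule(&mut self) -> Result<Rule, String> {
        let name = match self.bump() {
            Some(Token::Ident(name)) => name,
            other => return Err(format!("expected rule name, found {:?}", other)),
        };
        self.expect(Token::Eq)?;
        self.expect(Token::LBrace)?;
        let body = self.choice()?;
        self.expect(Token::RBrace)?;
        Ok(Rule { name, body })
    }
    // choice := seq ('|' seq)*
    fn choice(&mut self) -> Result<Expr, String> {
        let mut lhs = self.seq()?;
        while self.peek() == Some(&Token::Pipe) {
            self.bump();
            lhs = Expr::Choice(Box::new(lhs), Box::new(self.seq()?));
        }
        Ok(lhs)
    }
    // seq := atom ('~' atom)*
    fn seq(&mut self) -> Result<Expr, String> {
        let mut lhs = self.atom()?;
        while self.peek() == Some(&Token::Tilde) {
            self.bump();
            lhs = Expr::Seq(Box::new(lhs), Box::new(self.atom()?));
        }
        Ok(lhs)
    }
    // atom := string | ident
    fn atom(&mut self) -> Result<Expr, String> {
        match self.bump() {
            Some(Token::Str(s)) => Ok(Expr::Str(s)),
            Some(Token::Ident(name)) => Ok(Expr::Ident(name)),
            other => Err(format!("expected string or identifier, found {:?}", other)),
        }
    }
}

fn parse_rule(input: &str) -> Result<Rule, String> {
    let mut parser = Parser { tokens: lex(input)?, pos: 0 };
    parser.rule()
}
```

Because the parser is ordinary Rust, its error messages can name the expected and found tokens directly, which is the kind of descriptive error the recursive macro approach cannot produce.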

This approach has several important advantages:

The major disadvantage of this approach is that the process! macro would no longer have an obvious place to live. One idea would be to put it in the .pest grammar file and have pest compile it along with the grammar. Another approach would be to keep it outside, in Rust source:

#[derive(Parser)]
#[grammar("ruby.pest")]
struct MyParser;

impl MyParser {
    process! {
        ...
    }
}

As far as I know, this also means that the grammar! macro would have to live in a separate crate, which is only a minor disadvantage.

One alternative would be to wait for macros 2.0, but that may be a long wait. Another would be to have a separate crate that analyzes the grammar, reports errors, and can also test your grammar dynamically for fast prototyping.

Keats commented 7 years ago

Feel free to ping me if you need a guinea pig!

dragostis commented 7 years ago

@Keats Awesome. I'm working on the procedural branch. I'm currently trying to figure out a data structure that permits efficient parallelization.

dragostis commented 7 years ago

Further discussion led to the following design choices:

sunjay commented 7 years ago

I'm also happy to be a tester once this is ready!

sunjay commented 7 years ago

I think we should keep the process macro and adopt the approach you suggested above:

#[derive(Parser)]
#[grammar("ruby.pest")]
struct MyParser;

impl MyParser {
    process! {
        ...
    }
}

It's good enough for now and it allows the user to implement further non-matcher methods on the parser as needed. Since process! is mostly Rust code (barring the patterns in the matchers), it makes sense to keep that stuff in Rust source files so the compiler can check them.
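The reason this layout works is that Rust allows any number of impl blocks for one type, so derive-generated methods and hand-written ones can sit side by side. A minimal sketch, in which the first impl block is a hand-written stand-in for whatever the derive would generate, and parse_digits and parse_number are hypothetical names:

```rust
struct MyParser;

// Stand-in for derive-generated code (written by hand here, purely for
// illustration): returns the leading run of ASCII digits, if any.
impl MyParser {
    fn parse_digits(input: &str) -> Option<&str> {
        let end = input
            .find(|c: char| !c.is_ascii_digit())
            .unwrap_or(input.len());
        if end == 0 { None } else { Some(&input[..end]) }
    }
}

// User-written impl block in an ordinary Rust source file. It builds on the
// "generated" method and is type-checked by the compiler like any other code.
impl MyParser {
    fn parse_number(input: &str) -> Option<u32> {
        Self::parse_digits(input)?.parse().ok()
    }
}
```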

dragostis commented 7 years ago

Right now, I'm pretty confident about a Stream implementation that would replace process!.