pest-parser / pest

The Elegant Parser
https://pest.rs
Apache License 2.0
4.67k stars 261 forks source link

Trivial left recursion not detected in presence of tag #875

Closed jaybosamiya closed 1 year ago

jaybosamiya commented 1 year ago

Describe the bug

Pest is successfully able to detect left recursion when there isn't a tag, but isn't able to detect it once there is a tag.

To Reproduce

Consider the following made up example (i.e., I know there are easier ways to represent the exact same grammar, but this is just to demonstrate the issue with a smaller example):

extern crate alloc;
extern crate pest;
#[macro_use]
extern crate pest_derive;

use pest::Parser;

#[derive(Parser)]
#[grammar_inline = r#"

// x = { x ~ "y" | "z" }       // LINE A
x = { #t=x ~ "y" | "z" }       // LINE B

"#]
struct TestParser;

fn main() {
    let parse_result = TestParser::parse(Rule::x, "zyyy").unwrap();
    dbg!(parse_result);
}

Compile the above code and run it, you will get a thread 'main' has overflowed its stack.

Notice that the only difference between LINE A and LINE B is whether it is tagged.

Now switch out LINE B for LINE A, so that the tagging is not done.

Compile again, and notice that Pest (correctly) reports rule x is left-recursive (x -> x).

Expected behavior

It should recognize the left-recursive behavior even in the case of it being tagged.

Additional context

The original situation that triggered the runtime failure was a ~1000 LoC grammar file, but the details for that seem irrelevant when there is a much shorter and easier-to-understand case, such as what I've minimized it to above.

Sidenote: having this sort of bad recursion seems to have an extra code-smell: it requires the extern crate alloc; for some reason. Not entirely sure why.

tomtau commented 1 year ago

@Tartasprint previously tried to find different edge cases: https://github.com/pest-parser/pest/pull/848 FYI

tomtau commented 1 year ago

@jaybosamiya I'm unable to reproduce it, I get this panic as expected:

 --> 1:10
  |
1 | x = { #t=x ~ "y" | "z" }
  |          ^
  |
  = rule x is left-recursive (x -> x); pest::pratt_parser might be useful in this case'

I'm guessing it may be because you didn't enable the "grammar-extras" feature in https://github.com/jaybosamiya/verus-pest/blob/main/Cargo.toml#L11 ? See the warning in the latest release: https://github.com/pest-parser/pest/releases/tag/v2.7.0

jaybosamiya commented 1 year ago

Ah indeed, I was using it without that feature because I was on a version prior to latest release (2.6.0, which appears to have since been yanked), and didn't need grammar-extras. Also funnily, my bug was filed almost exactly 2 hours before the 2.7.0 release 🙃 Will switch over to using that feature now.

I had ended up working around it by removing all uses of the#t=x tag, but looks like I can start using it again, which is good news.

Thanks for taking a look at it :) And thanks for letting me know it is working well now!