shnewto / bnf

Parse BNF grammar definitions
MIT License
256 stars 22 forks source link

Order of Rules Affects Parsing #137

Open lambdaknight opened 11 months ago

lambdaknight commented 11 months ago

Describe the bug BNF grammar should be independent of the order of the rules as long as it is otherwise well-formed.

To Reproduce Take the dna_left_recursive test and reverse the order of the rules. The BNF parser can no longer successfully parse the input string.

#[test]
fn dna_left_recursive() {
    let grammar: Grammar = "<base> ::= 'A' | 'C' | 'G' | 'T'
        <dna> ::= <base> | <dna> <base>"
        .parse()
        .unwrap();

    let input = "GATTACA";

    let parses: Vec<_> = grammar.parse_input(input).map(|a| a.to_string()).collect();
    assert_snapshot!(parses.join("\n"));
}
CrockAgile commented 11 months ago

Thanks for bringing this up! I am not sure yet, but I have a guess.

The current bnf::Grammar::parse doesn't have a way to designate the "starting" term. So as is, it assumes the first rule begins with the "starting" term.

In this example, this means the "starting" term is <base>, which cannot parse "GATTACA".

Sorry if this turns out to be the reason! I meant to mark this strange implicit assumption in the API documentation, but I must have forgotten.

lambdaknight commented 11 months ago

That's what I figured was happening, but I figured I'd file a bug so you have a nice reminder to add it to the documentation or whatever you end up doing. :) Thanks!