shnewto / bnf

Parse BNF grammar definitions
MIT License

Complete prior Earley predictions #119

Closed CrockAgile closed 1 year ago

CrockAgile commented 1 year ago

Closes #117 and #118 (hopefully!)

#117 and #118 raised examples of inputs that failed to parse against their grammars. After some investigation, the root cause appears to be shared by both issues.

Root Cause

Consider the grammar:

<start> ::= <shortfail> | <longsuccess>
<shortfail> ::= <char> 'never'
<char> ::= 'a'
<longsuccess> ::= <long2>
<long2> ::= <long3>
<long3> ::= <long4>
<long4> ::= <char>

When parsing input "a", there are two routes: "shortfail" and "longsuccess". As the names suggest, "shortfail" requires fewer traversals, but always fails. "longsuccess" requires more traversal steps, and should eventually succeed.

The issue arises because both routes predict the non-terminal "char". Practical Earley parsing requires de-duplicating predictions, or else recursive grammars fall into infinite loops. The existing Earley implementation in BNF does roughly the following:

(work roughly alternates between the short and long routes because new traversals are appended to the end of the work queue)

* <start> ::= • <shortfail>
* <start> ::= • <longsuccess>
* <shortfail> ::= • <char> 'never'
* <longsuccess> ::= • <long2>
* <char> ::= • 'a'
* <long2> ::= • <long3>
* <char> ::= 'a' •
* <long3> ::= • <long4>
* <shortfail> ::= <char> • 'never' // <--- notice this does NOT succeed
* <long4> ::= • <char> // !!! this <char> prediction is IGNORED because an identical prediction was already made

All the <longN> productions are necessary to expose the bug: without them, the <longsuccess> route would make its <char> prediction before the shared <char> had already completed via the other route, and the duplicate prediction would never be dropped.

I am sorry if I have not explained this super clearly. It is a tricky problem! Funny aside: I actually thought of this bug while developing v0.4.0, but I (wrongly) assumed there was something about the Earley algorithm which resolved it. I attempted to write a test, but I did not realize it would require so many intermediate non-terminals to expose. Whoops!

Solution

The underlying problem is that non-terminals can be shared across multiple traversal routes, so "completion" must also be handled during "prediction": when a duplicate prediction is discarded, any completions already recorded for it must still be applied to the new traversal.
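A minimal, self-contained sketch of that idea (the struct and method names here are invented for illustration, not the crate's actual internals): the state set still de-duplicates predictions, but it also remembers completions per prediction, so a late duplicate prediction can replay them instead of stalling.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical names for illustration only.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Prediction {
    nonterminal: &'static str,
    origin: usize,
}

struct StateSet {
    predicted: HashSet<Prediction>,
    // completions already found for each prediction: the input positions
    // where the predicted non-terminal finished matching
    completions: HashMap<Prediction, Vec<usize>>,
}

impl StateSet {
    fn new() -> Self {
        StateSet { predicted: HashSet::new(), completions: HashMap::new() }
    }

    // Returns completions the caller must still process, even when the
    // prediction itself is a duplicate. Before the fix, a duplicate
    // effectively returned nothing and the caller's traversal stalled.
    fn predict(&mut self, p: Prediction) -> Vec<usize> {
        let replay = self.completions.get(&p).cloned().unwrap_or_default();
        if self.predicted.insert(p) {
            // fresh prediction: the caller would expand the
            // non-terminal's productions here
        }
        replay
    }

    fn complete(&mut self, p: Prediction, end: usize) {
        self.completions.entry(p).or_default().push(end);
    }
}

fn main() {
    let mut set = StateSet::new();
    let char_at_0 = Prediction { nonterminal: "char", origin: 0 };

    // <shortfail> predicts <char> at origin 0; nothing to replay yet
    assert!(set.predict(char_at_0.clone()).is_empty());

    // <char> ::= 'a' completes, spanning input positions 0..1
    set.complete(char_at_0.clone(), 1);

    // <long4> later makes the identical prediction; before the fix this
    // duplicate was dropped outright and the long route stalled. With the
    // fix, the recorded completion is replayed so <long4> can advance.
    assert_eq!(set.predict(char_at_0), vec![1]);
}
```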

Performance

On my machine, there was a roughly 10% performance cost to the new parsing logic. I will experiment in the future for improvements, but in the meantime I believe passing tests and correct behavior outweigh the cost.

Extra

Basically every time I have worked on an Earley bug, I have ended up adding the same manual logging to help with debugging. I decided to commit that logging this time!

There is a new tracing::event! call which is a no-op by default, but emits logging events when the "tracing" feature is enabled.

For Earley, traversal state events are logged when created and during predict/scan/complete. It helps quite a bit with debugging!
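The feature-gating pattern can be sketched like this (a standalone illustration with an invented macro name and an event counter standing in for the tracing crate, not the PR's actual macro): with the feature disabled, the macro expands to nothing, so the calls cost nothing at runtime.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Stand-in for a real logging sink, so the example is self-contained.
static EVENTS: AtomicUsize = AtomicUsize::new(0);

// When the "tracing" feature is enabled, record an event.
#[cfg(feature = "tracing")]
macro_rules! earley_event {
    ($($arg:tt)*) => {
        EVENTS.fetch_add(1, Ordering::Relaxed);
    };
}

// Otherwise the macro expands to nothing: a compile-time no-op.
#[cfg(not(feature = "tracing"))]
macro_rules! earley_event {
    ($($arg:tt)*) => {};
}

fn predict(nt: &str) {
    // `nt` is only used when the feature is enabled
    earley_event!("predict {}", nt);
}

fn main() {
    predict("char");
    // compiled without the "tracing" feature, no events are recorded
    assert_eq!(EVENTS.load(Ordering::Relaxed), 0);
}
```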

CrockAgile commented 1 year ago

I plan to keep this PR open so the author of #117 and #118 (@amascolo) can experiment with the fix 🙌

coveralls commented 1 year ago

Coverage Status

Coverage: 91.408% (-0.7%) from 92.114% when pulling 02d705bb8cfab57f1c4f411c55fed43bb619c696 on right-recursive-failure into f3ca02101c71636be2d0e7e1605578e63767082c on main.

amascolo commented 1 year ago

Oh something else has been bothering me – @CrockAgile's comment makes me wonder if it's related:

Practical Earley parsing requires de-duplicating predictions or else recursive grammars fall into infinite loops.

I tried writing a simple fuzzing test, which uses bnf to generate random inputs and parse them:

use bnf::Grammar;

#[test]
fn fuzz() {
    let grammar: Grammar = … // the larger grammar I'm writing
    for _ in 0..1_000 {
        // TODO it's not clear why generation sometimes fails
        if let Ok(input) = grammar.generate() {
            assert!(grammar.parse_input(&input).next().is_some());
        }
    }
}

However, generating inputs sometimes fails:

let input = grammar.generate().unwrap();

If I switch to the line above, it hits a recursion error:

RecursionLimit("Limit for recursion reached processing <rule>!")

What I'm wondering is:

  1. Are there any reasons why a call to Grammar::generate() should ever fail?

  2. Is there a connection between generation hitting a recursion limit and parsing failing?

  3. Would fuzzing tests (that ensure generation always succeeds) help catch these issues?

This might well be unrelated – just thought it was worth raising here!

shnewto commented 1 year ago

@amascolo wrt the Grammar::generate() comment, good spot! That's definitely something that needs some effort, though I feel it's unrelated to the work tracked in this PR. For context, it's a feature that was added quite some time ago. The main goal was that I wanted to generate poems and have fuzz/property tests that generated BNFs for this crate to parse 😄 We're tracking it in #70, and in our own tests we're basically swallowing the recursion limit errors.

https://github.com/shnewto/bnf/blob/f3ca02101c71636be2d0e7e1605578e63767082c/tests/grammar.rs#L29-L35

The "why" of the error is that the current implementation naively traverses the grammar recursively while monitoring the stack. If we get near the stack limit, we exit with an error instead of a panic.
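A toy sketch of that failure mode (everything here is invented for illustration, including the error type and the depth budget; only the error message mirrors the one quoted above): a recursive rule can exceed the depth budget, at which point the generator bails out with an error rather than overflowing the stack.

```rust
// Hypothetical error type, mirroring the RecursionLimit message above.
#[derive(Debug)]
enum GenError {
    RecursionLimit(String),
}

// Toy generator for the grammar: <expr> ::= <expr> '+' <expr> | '1'
// It always takes the recursive alternative, to force the failure mode
// a random generator only hits sometimes.
fn generate(rule: &str, depth: usize, limit: usize) -> Result<String, GenError> {
    if depth >= limit {
        return Err(GenError::RecursionLimit(format!(
            "Limit for recursion reached processing <{rule}>!"
        )));
    }
    match rule {
        "expr" => Ok(format!(
            "{}+{}",
            generate("expr", depth + 1, limit)?,
            generate("expr", depth + 1, limit)?
        )),
        // non-recursive rules terminate immediately
        _ => Ok("1".to_string()),
    }
}

fn main() {
    // the recursive rule blows past the budget and errors out
    assert!(matches!(generate("expr", 0, 8), Err(GenError::RecursionLimit(_))));
    // a non-recursive rule generates fine
    assert_eq!(generate("digit", 0, 8).unwrap(), "1");
}
```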