shnewto / bnf

Parse BNF grammar definitions

faster production matching while earley parsing #115

Closed. CrockAgile closed this issue 1 year ago.

CrockAgile commented 1 year ago

What

While working on nullable rules (#113), profiling showed potential for a performance improvement.

`ProductionMatching` owns matching productions with terms. Currently, it is implemented as:

pub(crate) struct ProductionMatching<'gram> {
    /// [`ProductionId`] of the production being matched
    pub prod_id: ProductionId,
    /// "left hand side" [`Term`] of the production
    pub lhs: &'gram crate::Term,
    /// "right hand side" [`TermMatching`]s, partitioned into matched and unmatched.
    /// For example: [Matched, Matched, Matched, Unmatched, Unmatched]
    rhs: Vec<TermMatching<'gram>>,
    /// The progress cursor separating matched from unmatched [`TermMatching`]s in the "right hand side"
    matched_count: usize,
}

The `rhs` vector requires a fair amount of dynamic memory allocation and freeing while parsing (the current location where the vector is cloned).
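For illustration only: advancing such a matching roughly amounts to cloning the whole `rhs` vector, since the parent state must remain usable by other Earley items. The sketch below is hypothetical (the method name `match_term` is made up, and it assumes `ProductionId: Clone`), not the crate's actual code:

impl<'gram> ProductionMatching<'gram> {
    /// Hypothetical sketch: produce a new matching with one more matched term.
    /// Cloning `rhs` costs a heap allocation (and a later free) per advanced term.
    fn match_term(&self, matching: TermMatching<'gram>) -> Self {
        // caller ensures there is still an unmatched term at `matched_count`
        let mut rhs = self.rhs.clone();
        rhs[self.matched_count] = matching;
        Self {
            prod_id: self.prod_id.clone(),
            lhs: self.lhs,
            rhs,
            matched_count: self.matched_count + 1,
        }
    }
}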

There may be a better way to model matching terms.

Brainstorm

If you pick this up, do not take this brainstorm as "the only way". This is just one of many possibilities.

Trie

Instead of cloning the matching vector, matching could be modeled as a trie / stemming tree, with each node a `Term`.

e.g. <word> ::= 'B' 'A' 'D' | 'B' 'A' 'T'

graph TD
    S[start matching prod] --> B
    B --> A
    A --> T
    A --> D

Each `ProductionMatching` would then reference a single node in this matching tree.
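A hedged sketch of that idea follows; the names `TermTrie`, `TrieNode`, and `TrieNodeId` are hypothetical, not existing crate types, and the sketch assumes it lives inside the crate next to `ProductionMatching`. Nodes are kept in a flat arena so productions sharing a prefix (e.g. 'B' 'A' above) share the same nodes:

// Hypothetical sketch of a shared matching trie (not existing crate code).
#[derive(Clone, Copy, PartialEq, Eq)]
pub(crate) struct TrieNodeId(usize);

struct TrieNode<'gram> {
    /// `None` only for the root "start matching" node
    term: Option<&'gram crate::Term>,
    children: Vec<TrieNodeId>,
}

pub(crate) struct TermTrie<'gram> {
    nodes: Vec<TrieNode<'gram>>,
}

impl<'gram> TermTrie<'gram> {
    pub(crate) const ROOT: TrieNodeId = TrieNodeId(0);

    pub(crate) fn new() -> Self {
        let root = TrieNode { term: None, children: Vec::new() };
        Self { nodes: vec![root] }
    }

    /// Advance from `from` by matching `term`, reusing an existing child node if one exists.
    pub(crate) fn advance(&mut self, from: TrieNodeId, term: &'gram crate::Term) -> TrieNodeId {
        let existing = self.nodes[from.0]
            .children
            .iter()
            .copied()
            .find(|child| self.nodes[child.0].term == Some(term));
        if let Some(child) = existing {
            return child;
        }
        let child = TrieNodeId(self.nodes.len());
        self.nodes.push(TrieNode { term: Some(term), children: Vec::new() });
        self.nodes[from.0].children.push(child);
        child
    }
}

Each `ProductionMatching` could then hold a `TrieNodeId` (plus `matched_count`) instead of owning its own `rhs` vector, so advancing a match becomes an arena lookup or push rather than a `Vec` clone.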

Results

To check results, the criterion benchmark suite can be used:

cargo criterion
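A minimal sketch of what such a benchmark might look like is below; the grammar, input, and benchmark name are illustrative placeholders, and the repository's actual bench target (`bnf`) differs. It assumes the crate's `Grammar::from_str` and `Grammar::parse_input` APIs:

use criterion::{criterion_group, criterion_main, Criterion};
use std::str::FromStr;

fn earley_parse(c: &mut Criterion) {
    // illustrative grammar and input; a real benchmark would use something larger
    let grammar = bnf::Grammar::from_str("<word> ::= 'B' 'A' 'D' | 'B' 'A' 'T'").unwrap();
    c.bench_function("earley parse word", |b| {
        b.iter(|| grammar.parse_input("BAD").count())
    });
}

criterion_group!(benches, earley_parse);
criterion_main!(benches);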

Flamegraph profiling can be used to investigate hot spots:

CARGO_PROFILE_BENCH_DEBUG=true cargo flamegraph --bench bnf -- --bench