Eager token id cache for regular expressions

Dan-wanna-M commented 4 months ago

Currently, only lazy cache is supported. This means the first run of an engine is never accelerated by cache. Combined with grammar rules that accept large numbers of tokens, this could make the engine slow. While I cannot figure out a good eager cache strategy for context free grammars, caching regular expressions(which are already fully compiled into DFAs) should be very easy. Some extra efforts probably are necessary for caching tokens that cross the boundary of regular expressions and ordinary context free grammar rules.

Once this is done, stronger claims can be made for time complexity. What's more, the speed of directly generating context free grammar will not be practically inferior to generating regular expressions.

Dan-wanna-M commented 3 months ago

This probably is not very important right now since all other libraries run at least one magnitude slower than ours.

Dan-wanna-M commented 2 months ago

Implemented in v0.4.0-rust

Dan-wanna-M / kbnf

Eager token id cache for regular expressions #1