Speed comparison with other frameworks

There haven't been any official benchmarks yet, as far as I know. It's also hard to do an apples-to-apples comparison, for several reasons:
1. lm-format-enforcer is the only one that supports batching, so the comparison could only be made in a batch-less scenario.
2. The libraries use different strategies, and thus different caching mechanisms. Do you use the same regex for all calls, or a different regex per call? The performance ratio between libraries could change based on this.
3. Have any of the libraries accepted heuristic inaccuracies in their format support in order to improve performance?

And so on.

So, I will do my best to list a few advantages of the different libraries, performance-wise.

guidance will probably be the slowest of the bunch, as it calls the LLM's generate() function for each token, which adds a lot of overhead.
In a scenario with no batching, no beam search, and a different regex for every request, outlines will probably be a bit faster, thanks to the work they put into JIT-compiling their code. (There are a few similarities between lm-format-enforcer and outlines: both use interegular to translate a regular expression into a state machine.)
In other scenarios (batching, beam search, or one regex for many queries), lm-format-enforcer will be at least as fast as outlines, and probably faster, as its caching mechanism and smart integration into generation pipelines effectively reduce its performance cost to one dictionary lookup per token for most timesteps.
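
To make the "one dictionary lookup per token" point concrete, here is a minimal sketch of the general idea: memoize the set of allowed tokens per state-machine state, so that only the first visit to a state pays for scanning the vocabulary. This is not lm-format-enforcer's actual code. `CachedTokenFilter`, `digits_step`, and the toy vocabulary are hypothetical names made up for illustration, and the hand-written transition function stands in for a state machine that a real library would build from the regex.

```python
from typing import Callable, Dict, FrozenSet, List

DEAD = -1  # hypothetical marker for "no valid continuation"


def digits_step(state: int, ch: str) -> int:
    """Toy transition function for the regex [0-9]+ (hand-written, not generated)."""
    return 1 if ch in "0123456789" else DEAD


class CachedTokenFilter:
    """Illustrative only: caches the allowed-token set per state-machine state."""

    def __init__(self, vocab: List[str], step: Callable[[int, str], int]):
        self.vocab = vocab
        self.step = step
        self.cache: Dict[int, FrozenSet[int]] = {}
        self.misses = 0  # how many times we had to scan the whole vocabulary

    def _walk(self, state: int, token: str) -> int:
        # Feed the token's characters through the state machine one by one.
        for ch in token:
            state = self.step(state, ch)
            if state == DEAD:
                return DEAD
        return state

    def allowed_tokens(self, state: int) -> FrozenSet[int]:
        cached = self.cache.get(state)
        if cached is not None:
            return cached  # the common case: a single dictionary lookup
        # Cache miss: scan the vocabulary once for this state, then remember it.
        self.misses += 1
        allowed = frozenset(
            i for i, tok in enumerate(self.vocab) if self._walk(state, tok) != DEAD
        )
        self.cache[state] = allowed
        return allowed


vocab = ["1", "23", "a", "4b", "007"]
token_filter = CachedTokenFilter(vocab, digits_step)
first = token_filter.allowed_tokens(0)   # miss: scans the vocabulary
second = token_filter.allowed_tokens(0)  # hit: dictionary lookup only
```

With one regex shared across many requests (or across the sequences in a batch or beam), the same states are visited again and again, so nearly every timestep is a cache hit. With a different regex per request, the cache starts cold each time, which is why the comparison between libraries can shift depending on the scenario.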
