dottxt-ai / outlines

Structured Text Generation
https://dottxt-ai.github.io/outlines/
Apache License 2.0
8.28k stars 424 forks

slow regex generation #1167

Closed ahmed-moubtahij closed 5 hours ago

ahmed-moubtahij commented 5 hours ago
Coming from a plain `pip install outlines` (it didn't prompt me to install anything else), it took over an hour to generate 512 tokens constrained to the regex `r"```latex(.*?|\n)```"`. The FSM compilation reached 100% fairly quickly, though. This was a 2B 4-bit model on a 3090, and all 24GB of VRAM were filled during generation (my prompt is about 20 tokens).
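For reference, here is a minimal sketch of what that pattern accepts, checked with Python's standard `re` module rather than with Outlines itself. Two assumptions to flag: the regex engine Outlines uses to build the FSM may treat `.` and lazy quantifiers differently from `re`, and the names `FENCE`, `PATTERN`, `MULTILINE_VARIANT`, and `accepts` are made up for this illustration. The fence is built with `"`" * 3` only to avoid writing three literal backticks inside a code block.

```python
import re

# Three backticks, built programmatically so this snippet can live inside a fenced block.
FENCE = "`" * 3

# The pattern from the comment above: the group is EITHER a lazy run of
# non-newline characters OR exactly one newline (the group is not repeated).
PATTERN = FENCE + r"latex(.*?|\n)" + FENCE

# Hypothetical variant for comparison: repeating (.|\n) lets the body span lines.
MULTILINE_VARIANT = FENCE + r"latex(.|\n)*?" + FENCE


def accepts(pattern: str, text: str) -> bool:
    """Return True if the whole text matches the pattern under Python's `re`."""
    return re.fullmatch(pattern, text) is not None


single_line = FENCE + "latex x^2 " + FENCE
multi_line = FENCE + "latex\nx^2\n" + FENCE

print(accepts(PATTERN, single_line))            # body on one line
print(accepts(PATTERN, multi_line))             # body spanning several lines
print(accepts(MULTILINE_VARIANT, multi_line))   # same text, variant pattern
```

Under `re` semantics, the original pattern accepts a single-line body (or a lone newline) between the fences but not a body that spans several lines, while the variant accepts both. Whether this has any bearing on the generation speed is not established here.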

I had a similar experience in the past, which I "solved" by using Hugging Face's TGI for structured generation. It was a lot faster, which is odd because I thought TGI used Outlines under the hood.

Should I have gone with an inference engine instead, e.g. `pip install outlines[vllm]`?

Originally posted by @ahmed-moubtahij in https://github.com/dottxt-ai/outlines/issues/1149#issuecomment-2354062820

ahmed-moubtahij commented 5 hours ago

https://github.com/dottxt-ai/outlines/issues/1149#issuecomment-2354178194