sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.
https://sgl-project.github.io/
Apache License 2.0

[Feature] How to accelerate constrained decoding when regex needs to change with input? #2168

Open GrittyChen opened 21 hours ago

GrittyChen commented 21 hours ago

Checklist

Motivation

In some practical applications, the regex has to change with the input. In that case, constrained decoding with the Compressed FSM is significantly slower than unconstrained decoding, because the regex has to be compiled again each time it changes. How can constrained decoding be accelerated when the regex is not fixed? Thanks!
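To make the request concrete, here is a minimal sketch of one possible mitigation: memoizing the compile step by regex string, so that a pattern seen before skips compilation. `compile_constraint` is a hypothetical name, and `re.compile` only stands in for the real regex-to-FSM compilation; none of this is SGLang's actual API.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=1024)
def compile_constraint(pattern: str) -> re.Pattern:
    # Hypothetical stand-in for the expensive regex -> FSM compilation.
    # A real implementation would build the compressed FSM here instead.
    return re.compile(pattern)


# Three requests: the first and third share a regex, the second is new.
request_patterns = [r"(yes|no)", r"[0-9]{4}", r"(yes|no)"]
for pattern in request_patterns:
    compile_constraint(pattern)

# cache_info() reports how many lookups were served from the cache.
info = compile_constraint.cache_info()
print(f"hits={info.hits} misses={info.misses}")
```

This only helps when the same regex recurs across requests. A regex seen for the first time still pays the full compile cost, which is the case this issue is mainly about.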

Related resources

No response