Closed Gintasz closed 4 months ago
If I remove max_tokens=500, then performance with regex seems to be ~3x faster:
SGLang 0.1.14 | 300 batch items | 50 threads | 371.07 secs | NVIDIA H100 80GB HBM3
It looks like this may be related to outlines as well, because other people have reported that GPU utilization stays at 0% during formatting:
https://github.com/outlines-dev/outlines/issues/751
I noticed that the guidance library mentions a regex constraint capability but does not list interegular as a dependency (interegular is the library outlines depends on for regex constraining), so maybe guidance has a faster solution?
Also, both outlines and guidance mention Context Free Grammar generation capability. It could be useful to add support for that in this library as well. Maybe I could then replace my regex with a CFG and avoid this performance hit altogether.
syncode also works on CFGs for LLMs.
This issue has been automatically closed due to inactivity. Please feel free to reopen it if needed.
Reopen
+1
+1
I've been trying to investigate why my information extraction program with SGLang is so slow. I rented an RTX 3090 (1 x RTX 3090, 6 vCPU, 26 GB RAM) and an H100 (1 x H100 SXM, 16 vCPU, 125 GB RAM) on RunPod. I've observed that when regex is used there is a huge performance drain, as if sewage were dumped on the machine.
Benchmark from test program WITH REGEX ENABLED
During most of the batch generation on the H100, RunPod showed GPU Utilization at 0-3%. Suspiciously, on the H100 the first 50 items were processed very fast, then there appeared to be a hang for a couple of minutes, then the next 50 items were processed very fast, and so on.
Benchmark from test program WITHOUT REGEX ENABLED
If you think the particular regex "<array>\n(<string>.*?<\/string>\n)*<\/array>```" is at fault, then it would be useful to have some guidelines on how to write a more suitable one. My requirement here is string array generation.
Steps to reproduce:
I've used SGLang 0.1.14 because I observed some newer versions hanging mid-processing or erroring out with "KV Cache pool leak detected", so I've not tried newer ones yet.
sglang_str_big.json
To disable regex, I just removed this part:
regex=r'<array>\n(<string>.*?<\/string>\n)*<\/array>```'
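For anyone who wants to look at the regex itself outside of SGLang, here is a minimal sketch using only Python's standard re module. It checks which outputs the pattern above accepts as a whole string, and compares it with a stricter variant. The stricter variant (STRICT) and the helper accepts() are my own illustration, not something from SGLang or outlines, and I have not tested whether it changes constrained-decoding speed at all. It only shows that the lazy .*? does not stop a string body from containing extra tags once the whole output has to match.

```python
import re

# Three backticks, built this way so the literal does not end this code fence.
FENCE = "`" * 3

# The regex from the issue, unchanged apart from how the trailing fence is written.
ORIGINAL = re.compile(r'<array>\n(<string>.*?<\/string>\n)*<\/array>' + FENCE)

# Hypothetical stricter variant (an assumption, not from the issue):
# the string body may not contain '<' or a newline.
STRICT = re.compile(r'<array>\n(<string>[^<\n]*<\/string>\n)*<\/array>' + FENCE)


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


good = "<array>\n<string>foo</string>\n<string>bar</string>\n</array>" + FENCE
empty = "<array>\n</array>" + FENCE
# A string body that itself contains a closing tag.
odd = "<array>\n<string>a</string>b</string>\n</array>" + FENCE

for name, text in [("good", good), ("empty", empty), ("odd", odd)]:
    print(name, accepts(ORIGINAL, text), accepts(STRICT, text))
```

Both patterns accept the normal array and the empty array. The original also accepts the "odd" case, because under a full match the lazy quantifier is forced to expand, while the stricter variant rejects it.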
@merrymercy @hnyls2002