Closed niklub closed 9 months ago
Thanks for reporting this bug. We will look into it soon.
In the meantime, you can probably try
sgl.gen("answer", regex=r"(Y|N)", temperature=0)
which is more robust and can work well for a few short choices.
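To make the constraint concrete, here is a minimal sketch of which strings the pattern `(Y|N)` admits. It uses Python's standard `re` module rather than sglang itself, and assumes the regex acts as a full-match constraint on the generated text; `is_valid_answer` is a hypothetical helper name used only for illustration.

```python
import re

# Same pattern as passed to sgl.gen above.
ANSWER_PATTERN = re.compile(r"(Y|N)")


def is_valid_answer(text: str) -> bool:
    """Return True only if the whole string is exactly "Y" or "N".

    Hypothetical helper: assumes the constraint is a full match,
    so longer answers such as "Yes" are rejected.
    """
    return ANSWER_PATTERN.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["Y", "N", "Yes", "y", ""]:
        print(repr(candidate), is_valid_answer(candidate))
```

A check like this can also be used to validate answers collected from a batch run before further processing.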
This bug is fixed by #67
You can try the latest main branch or sglang[all]>=0.1.6
Hello, team!
Thanks for the excellent work. When running batch inference, I sometimes encounter a server-side error that completely interrupts the process:
It has never happened to me with small batch sizes (1-10), but I constantly face it with larger ones.
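Since small batches have not triggered the error, one possible stopgap is to split a large input into smaller chunks and run them one after another. The following is only a sketch under that assumption: `run_chunk` is a hypothetical callable standing in for whatever executes a single batch and returns one result per input, and neither `chunked` nor `run_in_chunks` is part of sglang.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_chunks(
    items: Sequence[T],
    run_chunk: Callable[[Sequence[T]], List[R]],
    size: int = 10,
) -> List[R]:
    """Run `run_chunk` on each chunk and concatenate the results in input order.

    `run_chunk` is a placeholder for the real batch call (hypothetical).
    """
    results: List[R] = []
    for chunk in chunked(items, size):
        results.extend(run_chunk(chunk))
    return results


if __name__ == "__main__":
    # Stand-in for a real batch call: doubles every input.
    def fake_batch(chunk: Sequence[int]) -> List[int]:
        return [x * 2 for x in chunk]

    outputs = run_in_chunks(list(range(25)), fake_batch, size=10)
    print(len(outputs))
```

This trades throughput for stability and does not address the underlying server-side error.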
The code to run batch inference fwiw: