sgl-project / sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
Apache License 2.0

CUDA error: device-side assert triggered in self.forward_extend_multi_modal(batch) #528

Open LetheRiver0 opened 2 weeks ago

LetheRiver0 commented 2 weeks ago

I successfully started the backend for llava-v1.6-34b using code similar to the example at https://github.com/sgl-project/sglang/blob/f6dbd24043b8c18d87a14b3c6fe5c4f567f6c1ba/examples/quick_start/srt_example_llava.py. Everything works when I run single or batch requests in one process. However, when I run requests in parallel, for example two batch functions at the same time, the backend sometimes fails with an error. It does not happen on every run. The full error message is in the attached screenshot.

I found similar errors in other issues, but they seem to be different from mine. Has anyone run into the same issue and know how to fix it? Thanks~
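
For reference, the parallel pattern described above can be sketched roughly as follows. This is only a minimal illustration, not the actual script: `run_batch` is a hypothetical stand-in for the batched call to the backend, and threads are an assumption (the same shape applies if the two batch functions are launched from separate processes).

```python
from concurrent.futures import ThreadPoolExecutor


def run_batch(questions):
    # Hypothetical stand-in for a batched request to the backend.
    # Replace the body with the real batched call used in your script.
    return [f"answer to: {q}" for q in questions]


def run_two_batches_in_parallel(batch_a, batch_b):
    # Submit both batches at once so their requests overlap on the backend.
    # Overlapping requests are the condition under which the error is
    # reported to appear intermittently.
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(run_batch, batch_a)
        future_b = pool.submit(run_batch, batch_b)
        return future_a.result(), future_b.result()


if __name__ == "__main__":
    a, b = run_two_batches_in_parallel(["q1", "q2"], ["q3"])
    print(a)
    print(b)
```

Running either batch alone corresponds to the working case; submitting both concurrently corresponds to the case where the error sometimes occurs.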

LetheRiver0 commented 2 weeks ago

Does anyone know how to handle this?