Closed: nivibilla closed this 8 months ago
This is caused by flashinfer, but I'm not sure why.
As a workaround, you can remove --model-mode flashinfer for now.
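Applied to the launch command reported in this thread, the workaround would look roughly like the sketch below. It assumes every other flag stays unchanged. The model path is the reporter's local path, so substitute your own. The LAUNCH_CMD variable is only an illustrative name, not part of sglang.

```shell
# Same arguments as the failing command in this thread,
# with "--model-mode flashinfer" removed (the suggested workaround).
# The model path is the reporter's local path; replace it with yours.
LAUNCH_CMD="python -m sglang.launch_server --model-path /local_disk0/dillonlaird/hf-llava-v1.6-34b --host 0.0.0.0 --port 1234 --tp 8"

# Print the command for review before running it.
echo "$LAUNCH_CMD"

# Uncomment to actually start the server:
# eval "$LAUNCH_CMD"
```

Printing the command first makes it easy to confirm the flag is gone before starting a multi-GPU load.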
Interestingly, this works as-is with flashinfer on the Mistral 7b version.
But yeah thanks for the tip!
I get the same error with flashinfer too, while llava1.5 works fine with flashinfer. Sounds like there is a bug specific to llava1.6 model loading with flashinfer.
It might be specific to the 34b yi model, because 1.6 Mistral works fine for me even with flashinfer.
Using 8xA10s
!python -m sglang.launch_server --model-path /local_disk0/dillonlaird/hf-llava-v1.6-34b --host 0.0.0.0 --port 1234 --tp 8 --model-mode flashinfer
Trace