Closed: ssmi153 closed this issue 1 month ago
@ssmi153 Phi3-small requires an optional dependency due to the blocksparse attention. You need to add scipy to your image for it to work. This is intended behavior.
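To see whether scipy is already present in a given image before launching the server, one can run a quick import check inside the container. This is a sketch; the image tag below is an assumption, so substitute the release you actually run:

```shell
# Hypothetical check: prints the scipy version if present, and exits
# non-zero with a ModuleNotFoundError if scipy is absent from the image.
docker run --rm --entrypoint python3 vllm/vllm-openai:v0.5.1 \
    -c "import scipy; print(scipy.__version__)"
```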
Thanks @mgoin. I'm using the official vLLM OpenAI Docker container for this. The other option, rather than removing the scipy dependency, is just to add scipy to this Docker container.
@ssmi153 here is what I've done to install the scipy dependency in the Docker image:
git clone git@github.com:vllm-project/vllm.git
cd vllm
patch Dockerfile << EOF
202c202
< pip install accelerate hf_transfer 'modelscope!=1.15.0'
---
> pip install accelerate hf_transfer 'modelscope!=1.15.0' scipy
EOF
sudo docker build -t vllm_scipy .
This patches the Dockerfile, adding the scipy dependency to the last build stage, and then builds a new image that includes the dependency (this takes a while).
You can then run the new Docker image vllm_scipy and you'll be able to load the model successfully.
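The line-number patch above is brittle across vLLM versions, since the Dockerfile shifts between releases. A sed-based alternative that matches the pip install line by content is sketched below; it assumes the line reads exactly as in the patch above, so verify against your checkout first:

```shell
# Sketch: append scipy to the existing pip install line in the Dockerfile.
# Assumes the line text matches the current vLLM Dockerfile; check first.
sed -i "s/pip install accelerate hf_transfer 'modelscope!=1.15.0'/& scipy/" Dockerfile
grep scipy Dockerfile   # confirm the edit took effect before building
sudo docker build -t vllm_scipy .
```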
Thanks for the workaround @atineoSE, and thanks to @mgoin for implementing a fix.
Your current environment
Running the vLLM OpenAI Docker container on a single A5000 GPU on RunPod.
Initialisation settings:
--host 0.0.0.0 --model microsoft/Phi-3-small-8k-instruct --tensor-parallel-size 1 --max-model-len 8192 --trust-remote-code
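For reference, a full launch command with those settings might look like the following sketch; the image tag, port mapping, and GPU flags are assumptions, not taken from the report:

```shell
# Hypothetical invocation of the official vLLM OpenAI image with the
# initialisation settings above; adjust tag and ports to your setup.
docker run --rm --gpus all --ipc=host -p 8000:8000 \
    vllm/vllm-openai:v0.5.1 \
    --host 0.0.0.0 --model microsoft/Phi-3-small-8k-instruct \
    --tensor-parallel-size 1 --max-model-len 8192 --trust-remote-code
```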
🐛 Describe the bug
Error on launch when running release 0.5.1 and trying to run Phi-3-small-8k-instruct. This error is new in this release and did not occur in v0.5.0.post1. Other models seem to work fine (tested with Mistral and Llama).