yananchen1989 closed this issue 3 months ago.
Did you pull the latest main, or is that the 0.4.2 tag checkpoint? The vLLM 0.4.2 build doesn't work with Mistral-7B-Instruct-v0.3; see PR https://github.com/vllm-project/vllm/pull/5005 (hash 91977095).
I pulled the latest (hash 8e192ff9) and built vLLM. It works fine with Mistral-7B-Instruct-v0.3. Here are the steps I used to build and run a Docker image:
# Download source and pin to hash
git clone https://github.com/vllm-project/vllm.git
cd vllm
git checkout 8e192ff9
# Build Docker container
DOCKER_BUILDKIT=1 docker build . -f Dockerfile --target vllm-openai --tag vllm-src
# Run vLLM
docker run -d --gpus all \
-v $PWD/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=xyz" \
-p 8008:8000 \
--restart unless-stopped \
--name vllm \
vllm-src \
--host 0.0.0.0 \
--model=mistralai/Mistral-7B-Instruct-v0.3 \
--gpu-memory-utilization 0.95
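Once the container is up, the server exposes vLLM's OpenAI-compatible API on the mapped host port (8008 in the `docker run` above). A minimal smoke-test sketch, assuming those defaults; the helper only builds the request payload, and the actual HTTP call runs separately once the model has finished loading:

```python
import json
import urllib.request

# Matches the -p 8008:8000 mapping in the docker run command above.
VLLM_URL = "http://localhost:8008/v1/completions"

def build_completion_request(prompt: str, max_tokens: int = 32) -> dict:
    """Build an OpenAI-style /v1/completions payload for the served model."""
    return {
        "model": "mistralai/Mistral-7B-Instruct-v0.3",
        "prompt": prompt,
        "max_tokens": max_tokens,
    }

if __name__ == "__main__":
    # Only reachable when the server is actually running.
    payload = build_completion_request("[INST] Say hello in one sentence. [/INST]")
    req = urllib.request.Request(
        VLLM_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["choices"][0]["text"])
```

Adjust the port and prompt format to taste; the `[INST]` wrapping is Mistral-Instruct's chat convention.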
Yes, this is fixed on current main and will be part of the upcoming release this week.
Tested, success. Thanks.
In an era where speed is the ultimate currency, I love how the free open-source community is outpacing multi-trillion-dollar organizations by moving so fast. Cheers to all the contributors!
Your current environment
vllm version: 0.4.2
error message:
How would you like to use vllm
I want to run inference of a [specific model](put link here). I don't know how to integrate it with vllm.