Closed jasonkim5672 closed 6 days ago
Why you set device
to cpu
since you have cuda device on your machine.
Why you set
device
tocpu
since you have cuda device on your machine.
@jikunshang I tried with --device cuda , but it throws error about CUDA(mem 4GB) OOM and process start failed. that's why i'm trying cpu instead. could it be the reason ?
if you choose to use cpu, you need install vllm-cpu binary instead. default binary doesn't support run on CPU.
Your current environment
on WSL CPU : 11th Gen Intel(R) Core(TM) i7-11600H @ 2.90GHz 2.92 GHz RAM : 32GB
How would you like to use vllm
Hi guys, i just stuck in problem when api call . (v0.6.3.post1)
I ran process as below :
The process starts successfully, and api (/v1/models) works well too. But the problem is when i call api (/v1/chat/completion) , it throw error and process died. as below.
I think chat-template or something about parser are one of reason that The value of block_table is None .
Is there anyone in same problem? or am i doing something wrong ?
Before submitting a new issue...