[Open] jerin-scalers-ai opened this issue 1 month ago
Please use GPU; CPU is not supported yet.
@youkaichao does Gemma2 work with an 8k context length?
@youkaichao is this model-specific, or do embedding models not work on CPU at all?
https://github.com/vllm-project/vllm/issues/9379
@jerin-scalers-ai are you able to load the Mistral embedding model (intfloat/e5-mistral-7b-instruct) on CPU?
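For reference, a minimal sketch of such a CPU loading test with vLLM's offline API might look like the following. This assumes a vLLM build targeting the CPU backend (VLLM_TARGET_DEVICE=cpu); the KV-cache setting and dtype below are illustrative assumptions, not taken from this thread:

```python
# Minimal sketch (assumption): try loading an embedding model on the
# vLLM CPU backend. Assumes vLLM was built with VLLM_TARGET_DEVICE=cpu,
# so the device is auto-detected as CPU.
import os

from vllm import LLM

# Reserve space for the KV cache on CPU, in GiB (illustrative value).
os.environ.setdefault("VLLM_CPU_KVCACHE_SPACE", "8")

llm = LLM(
    model="intfloat/e5-mistral-7b-instruct",
    enforce_eager=True,   # CUDA graphs don't apply on the CPU backend
    dtype="bfloat16",
)

# Embedding models use encode() rather than generate().
outputs = llm.encode(["query: what is vLLM?"])
print(len(outputs[0].outputs.embedding))  # embedding dimensionality
```

If this script fails the same way for e5-mistral, the problem would not be Gemma2-specific.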
Your current environment
🐛 Describe the bug
vLLM v0.6.0 (CPU) throws the error below when loading a Gemma2 model.
Run vLLM:
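The exact command is not preserved in this excerpt; a minimal reproduction sketch with the offline API might look like this (the checkpoint name and arguments are illustrative assumptions, not the reporter's actual invocation):

```python
# Hypothetical reproduction sketch: load a Gemma2 checkpoint on the vLLM
# CPU backend. The model name is a placeholder; the reporter's exact
# command and checkpoint are not preserved in this excerpt.
from vllm import LLM

llm = LLM(
    model="google/gemma-2-2b",  # placeholder Gemma2 checkpoint (assumption)
    enforce_eager=True,         # CUDA graphs don't apply on CPU
    max_model_len=8192,         # Gemma2's native 8k context window
)
```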