-
Some dependencies (ROCm vLLM, for instance) require a newer version of torch than we currently use. We should upgrade those dependencies and resolve any conflicts we currently have pinned.
-
Running inference on Qwen2-VL-2B-Instruct-GPTQ-Int4 with vllm fails, repeatedly reporting:
File "/usr/local/venv/model_llm/lib/python3.10/site-packages/vllm/entrypoints/openai/api_server.py", line 130, in build_async_engine_client_from_engine…
-
**Is your feature request related to a problem? Please describe.**
Llama3.2 was released, and since it has multimodal support it would be great to have it in LocalAI.
**Describe the solution you'd li…
-
Following the tutorial at https://modelscope-agent.readthedocs.io/en/stable/llms/qwen2_tool_calling.html, I started qwen2-7b with vllm. It said I need to add --enable-auto-tool-choice to the vllm launch command, but --enable-auto-tool-choice requires a matching --tool-call-parser; currently vl…
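For reference, vLLM's tool-calling support pairs `--enable-auto-tool-choice` with a named parser, and the Hermes-style parser is the one documented for Qwen2-family models. A sketch of a launch command (model name and port are illustrative):

```shell
# Serve qwen2-7b with automatic tool choice enabled.
# --tool-call-parser hermes is the parser vLLM documents for Qwen2 models;
# adjust the model path and port to your deployment.
vllm serve Qwen/Qwen2-7B-Instruct \
    --enable-auto-tool-choice \
    --tool-call-parser hermes \
    --port 8000
```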
-
Is there any way to count the input and output tokens during a function call? I tried modifying the oai.py code, but the printed chunks contain no such information.
root@vllm:~/benchmark# python3 fcall.py
Please enter a query (type 'quit' to exit): Hello
ChatCompletionChunk(id='chat-a55bee23999c4e37ae2bc50b1bf4…
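One possible approach, assuming an OpenAI-compatible server that honors `stream_options={"include_usage": True}` (vLLM supports this option): the final streamed chunk then carries a `usage` field with the token counts. The helper below is a hypothetical sketch shown against dict-shaped chunks; real `ChatCompletionChunk` objects expose the same data as `chunk.usage`.

```python
def extract_usage(chunks):
    """Return (prompt_tokens, completion_tokens) from a chunk stream.

    With stream_options={"include_usage": True}, the usage-bearing chunk
    is normally the last one in the stream.
    """
    prompt = completion = 0
    for chunk in chunks:
        usage = chunk.get("usage")  # on real ChatCompletionChunk objects: chunk.usage
        if usage:
            prompt = usage["prompt_tokens"]
            completion = usage["completion_tokens"]
    return prompt, completion

# Simulated stream (dict stand-ins for ChatCompletionChunk objects):
chunks = [
    {"choices": [{"delta": {"content": "你好"}}]},
    {"choices": [], "usage": {"prompt_tokens": 12, "completion_tokens": 5}},
]
print(extract_usage(chunks))  # (12, 5)
```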
-
Hi, I tried to evaluate the Llama-3-Instruct-8B-SimPO-v0.2 checkpoint with arena-hard-auto, and I only got
Llama-3-Instruct-8B-SimPO-v0.2 | score: 35.4 | 95% CI: (-3.2, 2.0) | average #tokens: 530
…
-
This is a question I got from @bnellnm on the vllm-torch Slack channel:
We have a CustomClassHolder class ScalarType defined [here](https://github.com/vllm-project/vllm/blob/main/csrc/core/scalar_…
-
I deployed the vllm model locally and tried to connect using the void extension, but I can't initiate a conversation. What should I do?
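A quick way to separate server problems from extension problems is to hit the server's OpenAI-compatible endpoints directly. The URL and port below are assumptions; adjust them to the actual deployment:

```shell
# List the models the server exposes (confirms it is reachable at all):
curl http://localhost:8000/v1/models

# Try a minimal chat completion (replace your-model-name with a name
# returned by the /v1/models call above):
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "your-model-name", "messages": [{"role": "user", "content": "hi"}]}'
```

If these work but the extension still fails, the issue is in the extension's endpoint or model-name configuration rather than in vLLM.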
-
### Is this your first time submitting a feature request?
- [X] I have searched the existing issues, and I could not find an existing issue for this feature
- [X] I am requesting a straightforward…
-
As the title says, we need to embed the image with the visual part separately.
For now, I use the code below.
The trouble is that I still load all the parameters.
Can you guys give me some tips to simpli…
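One generic way to avoid materializing the language-model weights is to filter the checkpoint's state dict down to the vision encoder before loading. This is a sketch under the assumption that the vision weights share a common key prefix (the prefix `visual.` and the toy module below are hypothetical; substitute the real prefix and encoder class):

```python
import torch
import torch.nn as nn

class VisionOnly(nn.Module):
    """Toy stand-in for the model's vision encoder."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(4, 4)

# Stand-in for torch.load("full_checkpoint.pt"): a full state dict
# containing both vision and language weights.
full_state = {
    "visual.proj.weight": torch.zeros(4, 4),
    "visual.proj.bias": torch.zeros(4),
    "lm_head.weight": torch.zeros(10, 4),  # language weight we want to skip
}

# Keep only keys under the vision prefix and strip the prefix so the
# names line up with the standalone encoder module.
prefix = "visual."
vision_state = {k[len(prefix):]: v for k, v in full_state.items()
                if k.startswith(prefix)}

model = VisionOnly()
model.load_state_dict(vision_state)  # only vision params are loaded
print(sorted(vision_state))  # ['proj.bias', 'proj.weight']
```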