-
### Your current environment
```text
The output of `python collect_env.py`
```
### How would you like to use vllm
I want to run inference of a [specific model](put link here). I don't know how …
-
I am seeing strange behavior when using Solara to embed Jdaviz in a front-end app. When the button to pop Jdaviz out into a browser window is clicked, the application fails to display. Instead the …
-
### Your current environment
vllm 0.4.3
### 🐛 Describe the bug
After load-testing the vLLM-deployed model service with multiple threads, the following error is raised. The model is qwen2-72b-int4-gptq.
ERROR: Exception in ASGI application
0|startvllm72b | Traceback (most recent …
-
The server returned a 500 error when I tried to test the API call against my local server.
The `uvicorn` package returned: `NameError: name 'request_queue' is not defined. Did you mean: 'request_store'?`
…
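This class of `NameError` typically means a handler references a global that was renamed elsewhere in the module; on Python 3.10+ the interpreter appends the "Did you mean" suggestion when a close match exists in scope. A minimal reproduction of the pattern (the names `request_store` and `request_queue` are taken from the error message above, not from the actual server code):

```python
# Hypothetical reproduction: the module defines one name...
request_store = {}

def handle():
    # ...but the handler references the old name, which no longer exists.
    # The NameError is raised at call time, not at import time, which is
    # why the server starts fine and only fails on the first request.
    return request_queue.get("id")  # NameError here

try:
    handle()
except NameError as e:
    print(type(e).__name__)  # → NameError
```

The fix is usually a one-line rename in the handler so it references the name that actually exists (`request_store` here).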
-
### Your current environment
The output of `python collect_env.py`
```text
root@1ea91861f035:/vllm-workspace# python3 collect_env.py
Collecting environment information...
PyTorch version: 2.3…
-
```
backend_1 | Traceback (most recent call last):
backend_1 | File "/home/appuser/.local/lib/python3.8/site-packages/uvicorn/protocols/http/h11_impl.py", line 407, in run_asgi
backend_1 | …
-
**Problem description**
In xinference I switched the model to the 7B qwen2.5-instruct model. It works fine in the WebUI, but calling it through the API returns {'detail': "Only ['qwen1.5-chat', 'qwen1.5-moe-chat', 'qwen2-instruct', 'qwen2-moe-instruct', 'glm4-chat', 'glm4-chat-1m'] …
-
Following 01-GLM-4-9B-chat FastApi 部署调用.md, the call fails with an error:
INFO: 127.0.0.1:13068 - "POST / HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "…
-
**Describe the bug**
context_chat_backend/models/__init__.py still contains "instructor" in its embedding models. This causes a 500 error when the backend attempts to load the models.
Line 7: _embeddi…
-
### Your current environment
```text
The output of `python collect_env.py`
```
### How would you like to use vllm
I want to know more about `--enable-chunked-prefill`, and the constraint betwee…
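For context on what the flag does: chunked prefill splits a long prompt's prefill into fixed-size token chunks so decode steps of other in-flight requests can be interleaved between them, bounding per-batch latency. A toy sketch of that scheduling idea (this is an illustration only, not vLLM's actual scheduler; the function and parameter names are hypothetical):

```python
def schedule(prompt_len, chunk_size, decode_steps):
    """Return a batch plan: prefill chunks with decode steps interleaved.

    prompt_len   -- total prompt tokens needing prefill
    chunk_size   -- max prefill tokens per batch (the chunking budget)
    decode_steps -- pending single-token decode steps from other requests
    """
    plan = []
    remaining = prompt_len
    while remaining > 0:
        chunk = min(chunk_size, remaining)
        plan.append(("prefill", chunk))
        remaining -= chunk
        if decode_steps > 0:
            # Without chunking, these decodes would wait for the whole
            # prefill; with chunking, one slots in after each chunk.
            plan.append(("decode", 1))
            decode_steps -= 1
    plan.extend(("decode", 1) for _ in range(decode_steps))
    return plan

print(schedule(prompt_len=10, chunk_size=4, decode_steps=2))
# → [('prefill', 4), ('decode', 1), ('prefill', 4), ('decode', 1), ('prefill', 2)]
```

In vLLM the chunk budget is governed by `--max-num-batched-tokens`, which is why the two flags are usually discussed together.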