-
### Your current environment
Python 3.8
L20 * 4 (GPUs)
vLLM 0.5.4
### Model Input Dumps
_No response_
### 🐛 Describe the bug
$python -m vllm.entrypoints.api_server --model='/mntfn/yanyi/Qwen2-…
-
Hi,
After building vLLM from source, the following error occurs when running multi-GPU inference using a local Ray instance:
```
File "vllm/vllm/model_executor/layers/quantization/awq.py", lin…
-
### Your current environment
The output of `python collect_env.py`.
```text
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyT…
-
### Your current environment
The output of `python collect_env.py`
```text
# For security purposes, please feel free to check the contents of collect_env.py before running it.
python collect_e…
-
Get implementation steps for an Issue Ops command like `/ai` that reads the comments of an issue or PR and replies with an LLM-generated answer, similar to the functionality of ActionAgents.
-
### Your current environment
```text
The output of `python collect_env.py`
```
Collecting environment information...
WARNING 07-08 14:14:25 _custom_ops.py:14] Failed to import from vllm._C with M…
-
We need to create unit tests for each completed ticket. For now we will use [Gtest](https://github.com/google/googletest) to write the unit tests.
Unit tests can run locally and be added to the CI pipeline.
When building in debug mode wil…
-
It would be good to support the Dify API to handle all the LLM Ops and RAG.
From: https://docs.dify.ai/
Dify is an open-source large language model (LLM) application development platform. It combines t…
-
### Description of the bug:
I downloaded the `microsoft/Phi-3.5-mini-instruct` from Hugging Face and ran the [convert_phi3_to_tflite.py](https://github.com/google-ai-edge/ai-edge-torch/blob/main/ai_…
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.2.1+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS…