-
### Your current environment
The output of `python collect_env.py`
```Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1…
-
### Your current environment
```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Debian GNU/Linux 11 (bullseye) (x86…
-
### Your current environment
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubunt…
-
### 🚀 The feature, motivation and pitch
Please consider adding support for GPTQ and AWQ quantized Mixtral models.
I guess that after #4012 it's technically possible.
### Alternatives
_No r…
-
my env is:
os:centos7.9
cpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 20
On-line CPU(s) list: 0-19
Thre…
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.1.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
这是我的运行代码:
python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2-VL-7B-Instruct --model /home/wangll/llm/model_download_demo/models/Qwen/Qwen2-VL-7B-Instruct
以下是报错信息:
INFO 09-03 1…
-
## Function Calling
- Frontend
- Add `tools` argument in `sgl.gen`. See also guidance [tools](https://github.com/guidance-ai/guidance/blob/d1bbe1c698cbb201f89556d71193993e78c0686b/README.md?plai…
-
`2024-05-24 23:49:38 WARNING 05-24 15:49:38 utils.py:327] Not found nvcc in /usr/local/cuda. Skip cuda version check!
2024-05-24 23:49:38 INFO 05-24 15:49:38 config.py:379] Using fp8 data type to sto…
-
### Your current environment
The output of `python collect_env.py`
```text
python collect_env.py
Collecting environment information...
PyTorch version: 2.4.0+cu121
Is debug build: False
C…