-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu118
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A
…
-
Great work so far. I'm trying to run vLLM on my 7900XTX cards and was wondering if there were any plans to support RDNA3?
-
(vllm) PS C:\Users\hub1\gen-ai-training-abshek\gen-ai-training\vllm-main> pip install -e .
Looking in indexes: http://art.nwie.net/artifactory/api/pypi/pypi/simple
Obtaining file:///C:/Users/hub1/ge…
-
### Your current environment
I am using vllm version 0.3.0
I am using this class `ChatCompletionRequest` to create the request for my chat completion endpoint
Whenever I set the `max_tokens` to an…
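For reference, a minimal sketch of how such a request could be issued against an OpenAI-compatible vLLM server, with `max_tokens` set. The URL and model name are placeholders (assumptions, not taken from the report above), and the actual network call is left commented out so the snippet runs without a live server:

```python
import json
import urllib.request

# Hypothetical endpoint; adjust to where your vLLM server is listening.
VLLM_URL = "http://localhost:8000/v1/chat/completions"

# OpenAI-compatible chat completion payload; max_tokens caps how many
# tokens the server may generate for the reply.
payload = {
    "model": "my-model",  # placeholder model name
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 128,
}

req = urllib.request.Request(
    VLLM_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send the request; commented out here
# so the sketch is self-contained.
print(json.dumps(payload, indent=2))
```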
-
Fully supported! Scroll down on our latest Mistral notebook: https://colab.research.google.com/drive/1Dyauq4kTZoLewQ1cApceUQVNcnnNTzg_?usp=sharing
For 16bit merging:
```
model.save_pretrained_mer…
-
It is not as useful as Qwen1 in real use cases, and this is not just a generic benchmark result.
Its math skills are poor.
e.g. (translated from Chinese): "Last year Alibaba's revenue was 534,785 wan yuan and Tencent's was 54,787 wan yuan. Which company had the higher revenue, and by how much?"
use:vllm 0.3.3
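For reference, the arithmetic the model is expected to get right (all figures in 万元, i.e. units of 10,000 yuan, as in the prompt):

```python
# Revenues from the test prompt, in units of 10,000 yuan (wan yuan).
alibaba = 534785
tencent = 54787

# Alibaba's revenue is higher; compute by how much.
difference = alibaba - tencent
print(difference)  # 479998 (wan yuan)
```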
-
### Your current environment
Using the latest Docker image.
Command:
docker run --runtime nvidia --gpus all -v /mnt_1/models:/models -p 8000:8000 --ipc=host vllm/vllm-openai:latest --model /models/Q…
-
We should open a PR on the LangChain repo to add Outlines as a model / guided generation provider.
-
Hi, I'm curious about Next-DiT, it is not mentioned in your paper.
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…