-
On my RX 6800 I seem to get `RuntimeError: FlashAttention only supports AMD MI200 GPUs or newer.` for some reason. I Googled that GPU and it seems to be RDNA2 like mine, but for enterprise. Is this not…
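For context, FlashAttention's ROCm build gates on the GPU's reported GFX architecture name, and consumer RDNA2 parts report a different arch family than the CDNA2-based MI200. A minimal sketch of that kind of gate (the allow-list and helper are illustrative, not FlashAttention's actual code):

```python
# Sketch of an architecture allow-list like the one behind this error.
# MI200-series GPUs report "gfx90a" (CDNA2), while the RX 6800 (RDNA2)
# reports "gfx1030" -- so a consumer card fails the check even though
# both architectures are "2nd generation".

SUPPORTED_ARCH_PREFIXES = ("gfx90a", "gfx94")  # illustrative allow-list

def is_supported(arch_name: str) -> bool:
    """Return True if the reported GFX arch is on the allow-list."""
    return arch_name.startswith(SUPPORTED_ARCH_PREFIXES)

print(is_supported("gfx90a"))   # MI200
print(is_supported("gfx1030"))  # RX 6800
```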
-
### Your current environment
environment

```shell
pip install auto_gptq modelscope xformers torchvision torchaudio torch==2.1.2 -U
pip install datasets huggingface-hub transformers==4.39.1 -U
pip install…
```
uRENu updated
3 months ago
-
### Description
I have a locally deployed agent that calls the qwen-max model. Does each new conversation consume a corresponding amount of GPU memory? After a few conversations, once VRAM is full, is waiting the only option?
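Each active conversation does hold its own KV cache on the GPU, so memory grows with the number of concurrent sessions. A rough back-of-the-envelope estimate (all model dimensions below are illustrative placeholders, not qwen-max's real configuration):

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, dtype_bytes=2):
    """Approximate KV-cache size for one sequence: 2 tensors (K and V)
    per layer, each [kv_heads, seq_len, head_dim], at dtype_bytes/elem."""
    return 2 * layers * kv_heads * head_dim * seq_len * dtype_bytes

# Hypothetical 7B-class model: 32 layers, 32 KV heads, head_dim 128,
# one 4096-token conversation in fp16 (2 bytes/element).
per_conv = kv_cache_bytes(32, 32, 128, 4096)
print(per_conv / 2**30, "GiB per conversation")  # 2.0 GiB
```

At these made-up dimensions, ten concurrent 4k-token conversations would need ~20 GiB just for KV caches, so once VRAM is exhausted, new requests must queue or be evicted.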
### Link
_No response_
-
InstructLab 0.13 supports hardware acceleration for Apple Silicon (via `mlx`) and CUDA-like GPUs (NVIDIA CUDA and AMD ROCm via `torch.cuda`). I would like to add support for Intel Gaudi 2 hardware and…
tiran updated
2 months ago
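One way to structure support for an additional backend like Gaudi is to probe for each vendor's Python package before committing to a device. A minimal sketch (the `habana_frameworks` package name is real; the probe order and returned labels are illustrative, not InstructLab's actual logic):

```python
import importlib.util

def detect_backend() -> str:
    """Pick an accelerator backend by checking which packages are installed.

    find_spec returns None when a package is absent, so this is a cheap
    availability probe that never imports the heavy frameworks themselves.
    """
    if importlib.util.find_spec("habana_frameworks") is not None:
        return "hpu"            # Intel Gaudi via habana_frameworks.torch
    if importlib.util.find_spec("torch") is not None:
        return "torch_device"   # defer the cuda/mps/cpu choice to torch
    return "cpu"

print(detect_backend())
```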
-
### The Feature
We support this for completions; it also needs to be supported for async completions so it works through the proxy.
### Motivation, pitch
A user faced an issue trying to make calls to mixtral on vllm using us
###…
-
### Your current environment
```text
Collecting environment information...
/data/miniconda3_new/envs/vllm/lib/python3.10/site-packages/transformers/utils/hub.py:124: FutureWarning: Using `TRANSFORM…
-
### System Info
ubuntu22
conda
python3.11
nvidia-cudnn-cu12
torch 2.3.0
vllm 0.5.0.post1
vllm-flash-attn 2.5.9…
-
### 🚀 The feature, motivation and pitch
I'm using a newer version of `outlines` than v0.0.34, and my application needs the fixes implemented in newer versions of that package. It would be great if …
-
Hello, could you do something for the open-mixtral-8x7b model to fix the truncation bug on "la plateforme"?
Here they explain what they did to solve it on a vllm server (with spacing between ` an…
-
### 🚀 The feature, motivation and pitch
They claim major improvements over vllm. Unfortunately there is no code, only the paper.
arxiv.org/abs/2405.04437
### Alternatives
_No response_
### Additional context
…