-
### Your current environment
```text
The output of `python collect_env.py`
```
### How you are installing vllm
I install vLLM from source.
```shell
pip install -e .
```
but encounter…
-
### Your current environment
PyTorch version: 2.1.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (U…
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
### Your current environment
Collecting environment information...
PyTorch version: 2.2.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubunt…
-
### Your current environment
vLLM 0.5.0, A100, CUDA 12.1
### 🐛 Describe the bug
1.
CUDA_VISIBLE_DEVICES=1 python -m vllm.entrypoints.openai.api_server \
--model /home/Qwen1.5-1.8B-Chat \
…
-
# ENV
```
GPU: 2080Ti * 4 (12G mem * 4)
Mem: 128G
CUDA: 12.2
PyTorch: 2.1.0
Transformers: 4.31.0
TensorRT: 9.1.0.post12.dev4
TensorRT-LLM: 0.5.0
Triton-trt-llm-backend: 0.5.0
Triton: 23.10
VLLM:0.2.…
-
### Your current environment
```text
The output of `python collect_env.py`
```
### 🐛 Describe the bug
Recently, we have seen reports of `AsyncEngineDeadError`, including:
- [ ] #5060
…
-
### Your current environment
Referring to issue #5181, "The Offline Inference Embedding Example Fails": the method LLM.encode() only works for embedding models. Is there any way to get the ou…
-
https://github.com/Alpha-VLLM/Lumina-T2X
This looks like a promising variation on text-to-anything generation. It'd be nice to get support for it, as at the moment it's only available through Gradio demos or Python code.
-
Now that many newer Hugging Face models ship with a chat template in their tokenizer, FastChat should use it as the primary way to build conversations, falling back to `conversation.py` when a template…
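The fallback order described above can be sketched as follows. This is a minimal sketch, not FastChat's actual code: `apply_chat_template` and the `chat_template` attribute are the real Hugging Face tokenizer API, while `build_prompt` and the `fallback_template` callable (standing in for a `conversation.py` template) are hypothetical names introduced here for illustration.

```python
def build_prompt(tokenizer, messages, fallback_template):
    """Render a chat prompt, preferring the tokenizer's own template.

    `messages` is a list of {"role": ..., "content": ...} dicts, as used by
    Hugging Face chat templates. `fallback_template` is any callable that
    turns the same message list into a prompt string (hypothetical stand-in
    for a conversation.py template).
    """
    # Prefer the tokenizer's built-in Jinja chat template when one is defined.
    if getattr(tokenizer, "chat_template", None):
        return tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
    # Otherwise fall back to the hand-written template.
    return fallback_template(messages)
```

A usage note: `tokenize=False` makes `apply_chat_template` return the rendered prompt string rather than token ids, and `add_generation_prompt=True` appends the assistant-turn header so the model continues as the assistant.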