-
### System Info
- `transformers` version: 4.44.0
- Platform: Linux-6.5.0-44-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.24.5
- Safetensors version: 0.4.…
-
I was benchmarking chatglm-6b with vllm 0.5.0. When first running vLLM I hit "AttributeError: 'ChatGLMTokenizer' object has no attribute 'tokenizer'". After replacing tokenization_chatglm.py in chatglm-6b, running vLLM's benchmark_throughput.py raised the following error instead; how…
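For context, an error of this shape usually means the model's remote tokenizer code reads an attribute that the installed transformers version no longer initializes before use. A minimal, self-contained illustration of that failure mode (all names are hypothetical stand-ins, not the actual ChatGLM code):

```python
class ChatGLMTokenizerLike:
    """Toy stand-in for a tokenizer whose methods assume an inner
    backend was assigned in __init__ (hypothetical names)."""

    def __init__(self, init_inner: bool):
        if init_inner:
            # the real class builds an inner SentencePiece-style backend here
            self.tokenizer = object()

    def inner(self):
        # raises AttributeError when __init__ skipped the assignment,
        # mirroring "'ChatGLMTokenizer' object has no attribute 'tokenizer'"
        return self.tokenizer

broken = ChatGLMTokenizerLike(init_inner=False)
try:
    broken.inner()
except AttributeError as exc:
    print(exc)
```

Replacing tokenization_chatglm.py with a version that assigns the attribute (as the reporter did) removes this particular error, which is why the failure then moves to a later stage.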
-
### System Info
xinference v0.13.2
The bundled vLLM does not support batched inference; sending OpenAI-style batched prompts returns a 500 error.
Why not follow the approach in
https://github.com/vllm-project/vllm/blob/main/vllm/entrypoints/openai/serving_…
bstr9 updated 1 month ago
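For reference, "batching prompts" here means the legacy OpenAI `/v1/completions` endpoint, whose `prompt` field may be a list of strings (one completion per string). A minimal sketch of such a request body, which a server lacking batch support would reject with an error like the 500 above (the model name is a placeholder):

```python
import json

# Request body for an OpenAI-compatible /v1/completions call with a
# batched prompt: the legacy endpoint accepts a list of strings.
payload = {
    "model": "my-model",            # placeholder model name
    "prompt": ["What is vLLM?",     # batched prompts: one completion
               "What is xinference?"],  # is returned per list entry
    "max_tokens": 32,
}
body = json.dumps(payload)
print(body)
```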
-
Hi, when I follow the default steps to set up the environment:
pip install vllm
it automatically installs vllm 0.5.0.post1, which requires transformers>=4.40.0.
When installing SPPO ( transformer…
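To see whether two requirements can coexist, it helps to compare the pinned version against the floor directly. A small stdlib-only sketch: the `4.40.0` floor comes from the text above, while the pinned version is hypothetical, since the actual SPPO pin is truncated in the snippet:

```python
def version_tuple(v: str) -> tuple:
    """Parse a simple 'X.Y.Z' version string into a comparable tuple."""
    return tuple(int(part) for part in v.split("."))

VLLM_FLOOR = "4.40.0"  # vllm 0.5.0.post1 requires transformers>=4.40.0
pinned = "4.36.2"      # hypothetical older pin another package might carry

# A pin below the floor means pip cannot satisfy both requirements at once.
conflict = version_tuple(pinned) < version_tuple(VLLM_FLOOR)
print(conflict)  # True: the hypothetical pin conflicts with vllm's floor
```

(This naive parser ignores suffixes like `.post1`; for real resolution, pip compares full PEP 440 versions.)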
-
### Motivation.
Nowadays, many new applications including multi-turn conversations, multi-modality and multi-agent, require a significant amount of KV cache. Such applications generally have a shared…
-
I have fine-tuned the model and am now trying to run inference on it with vLLM, but the results are very bad. Any idea why that is?
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
Solid idea and an ingenious code implementation; great work!
Have you considered implementing KV compression operations on the KV cache in the vLLM framework?
-
### Search before asking
- [X] I had searched in the [issues](https://github.com/eosphoros-ai/DB-GPT/issues?q=is%3Aissue) and found no similar issues.
### Operating system information
Linux
### P…
-
### Your current environment
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.2 LTS (x86_64)
GCC version: (U…