-
Same as https://github.com/vllm-project/vllm/issues/182#issuecomment-1627176207
-
I'm getting great qualitative results from Falcon fine-tuned with AdaptersV2.
Inference quality is better than what I get with huggingface/peft and LoRA, but it is still too slow to scale up.
Could the idea…
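One common way to close the LoRA inference gap is to merge the adapter into the base weights before serving (peft exposes this as `merge_and_unload()` on a `PeftModel`), so the per-token adapter matmuls disappear. A minimal numpy sketch of the underlying algebra, with made-up shapes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes for illustration: d_out x d_in base weight, rank-r adapter.
d_out, d_in, r, alpha = 8, 8, 2, 4

W = rng.standard_normal((d_out, d_in))   # frozen base weight
A = rng.standard_normal((r, d_in))       # LoRA down-projection
B = rng.standard_normal((d_out, r))      # LoRA up-projection
x = rng.standard_normal(d_in)

# Unmerged inference: one extra matmul pair on every forward pass.
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))

# Merged inference: fold the adapter into the base weight once, ahead of time.
W_merged = W + (alpha / r) * (B @ A)
y_merged = W_merged @ x

assert np.allclose(y_adapter, y_merged)
```

The merged model computes exactly the same function, so this trades adapter hot-swapping for plain base-model inference speed.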
-
**Is your feature request related to a problem? Please describe.**
Many models are now becoming multi-modal, that is, they can accept images, videos, or audio during inference. The llama.cpp projec…
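For reference, OpenAI-compatible chat endpoints typically take multi-modal inputs as a list of content parts, mixing text and image entries in one user message. A minimal sketch of such a request payload (field names follow the OpenAI chat format; the model name and URL are placeholders):

```python
# Text and image parts travel together in one "content" list.
request = {
    "model": "some-multimodal-model",  # placeholder model name
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/cat.png"},
                },
            ],
        }
    ],
}

# Each part declares its own type, so a server can route text to the
# tokenizer and images to the vision encoder.
for part in request["messages"][0]["content"]:
    print(part["type"])
```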
-
### 🚀 The feature, motivation and pitch
_No response_
### Alternatives
_No response_
### Additional context
_No response_
-
I used the following code to SFT Llama 3:
```python
import os
import wandb
os.environ["WANDB_PROJECT"] = "unsloth-mimic-20240814" # name your W&B project
os.environ["WANDB_LOG_MODEL"] = "checkpoint" …
-
### Checklist
- [x] 1. I have searched related issues but cannot get the expected help.
- [x] 2. The bug has not been fixed in the latest version.
### Describe the bug
I tried to benchmark t…
-
### Your current environment
```text
vllm=0.5.4
```
```python
llm = LLM(
    model=MODEL_NAME,
    trust_remote_code=True,
    gpu_memory_utilization=0.5,
    max_model_len=2048,
    tensor_paralle…
```
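With `gpu_memory_utilization=0.5`, half the GPU must hold both the weights and the KV cache, so it is worth sanity-checking the cache budget. A back-of-envelope sketch for a full `max_model_len=2048` sequence (the layer/head/dim numbers are illustrative assumptions for a hypothetical 7B-class model, not the actual model from this report):

```python
# Illustrative model shape assumptions.
n_layers, n_kv_heads, head_dim = 32, 32, 128
bytes_per_elem = 2                      # fp16/bf16

# 2 tensors (K and V) per layer, per token.
kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem

max_model_len = 2048
kv_bytes_per_seq = kv_bytes_per_token * max_model_len

print(f"{kv_bytes_per_token / 1024:.0f} KiB per token, "
      f"{kv_bytes_per_seq / 1024**3:.2f} GiB per full-length sequence")
# → 512 KiB per token, 1.00 GiB per full-length sequence
```

Under these assumptions, each concurrent full-length sequence costs about 1 GiB of cache, which bounds the batch size that fits in the reserved half of the GPU.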
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.1.2+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
[data.zip](https://github.com/user-attachments/files/15604248/data.zip)
This is the zip file that triggered the error.
-
### Your current environment
The output of `python collect_env.py`
```text
Your output of `python collect_env.py` here
```
### 🐛 Describe the bug
```console
INFO: 10.244.239.34:38…