runpod-workers worker-vllm issues

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.

MIT License

220 stars, 85 forks

issues

[Update] Docs, bug fix.
#109 pandyamarut closed 1 week ago (0 comments)

[Bug]fix oai completion api error
#108 pandyamarut closed 1 week ago (0 comments)

Update README.md
#107 pandyamarut closed 2 weeks ago (0 comments)

update vllm version 0.5.5
#106 pandyamarut closed 2 weeks ago (0 comments)

0.5.5 is out
#105 the-xentropy opened 2 weeks ago (1 comment)

'NoneType' object has no attribute 'headers' (completions endpoint)
#104 Permafacture opened 3 weeks ago (8 comments)

Correct way to implement RAG with vllm
#103 Hel1zor opened 3 weeks ago (0 comments)

Add human readable worker-config.json
#102 carlson-svg closed 2 weeks ago (1 comment)

Documentation incorrect regarding boolean
#101 scriptcoded opened 3 weeks ago (0 comments)

MODEL_REVISION & TOKENIZER_REVISION: Both are needed to configure the revision
#100 TimPietrusky opened 3 weeks ago (0 comments)

Bitsandbytes support
#99 ilyalasy opened 3 weeks ago (0 comments)

Support GGUF models
#98 vladfaust opened 1 month ago (0 comments)

Meta-Llama-3.1-8B support
#97 klipach opened 1 month ago (2 comments)

update vllm version 0.5.4
#96 pandyamarut closed 1 month ago (0 comments)

Update runpod package version
#95 github-actions[bot] closed 1 month ago (0 comments)

feat: align version with vllm
#94 wwydmanski opened 1 month ago (3 comments)

Update README.md
#93 pandyamarut closed 1 month ago (0 comments)

Optimize Dockerfile with UV for faster dependency installation
#92 rachfop opened 1 month ago (0 comments)

trust_remote_code Setting Not Applied in runpod/worker-v1-vllm:stable-cuda12.1.0
#91 Juhong-Namgung closed 4 weeks ago (5 comments)

Update README.md
#90 pandyamarut closed 1 month ago (0 comments)

ValueError: rope_scaling must be a dictionary with two fields, type and factor
#89 omar93939 opened 1 month ago (2 comments)

Unable to deploy mistralai/Mistral-Nemo-Instruct-2407
#88 TheMindExpansionNetwork opened 1 month ago (5 comments)

[feat] ability to set max_num_seqs
#87 kalocide opened 1 month ago (1 comment)

Fix runtime error for dict chat templates
#86 nkruglikov closed 1 month ago (1 comment)

Support for tools / tool_choice="auto" in OpenAI-compatible API
#85 TimPietrusky opened 1 month ago (23 comments)

A new version of VLLM has been released
#84 d4rk6un opened 2 months ago (1 comment)

Issue: Update VLLM to Version .5.0++, and a few suggestions
#83 nerdylive123 opened 2 months ago (13 comments)

Allow any vLLM engine args as env vars, Update vLLM, refactor
#82 alpayariyak closed 1 month ago (4 comments)

Gemma-2 is not available in this docker image.
#81 codingchild2424 opened 2 months ago (0 comments)

Update to vllm 0.5
#80 Sapessii closed 1 month ago (0 comments)

Using mistral 0.3
#79 Sapessii closed 1 month ago (0 comments)

OOM on second request
#78 Permafacture closed 2 months ago (2 comments)

Add RoPE config to support that
#77 nerdylive123 closed 2 months ago (1 comment)

Slow streaming
#76 motorbike158 closed 3 months ago (1 comment)

Incorrect path_or_model_id
#75 Sapessii closed 3 months ago (13 comments)

Only generates 16 tokens
#74 lawrenceztang closed 3 months ago (1 comment)

ImportError prepare_hf_model_weights method
#73 ArtyoMKos closed 3 months ago (2 comments)

Got some deprecation notice, might update these
#72 nerdylive123 closed 3 months ago (1 comment)

Building Docker with model built in
#71 KDercksen closed 3 months ago (6 comments)

GGUF compatibility
#70 adam-clarey closed 4 months ago (2 comments)

Update documentation to note support for extra parameters
#69 bryankruman opened 4 months ago (1 comment)

Runpod serverless vLLM with Llama 3 70B on 40GB GPU
#68 EdwardTheLegend closed 3 months ago (8 comments)

Fixed MODEL_REVISION environment variable
#67 mikljohansson closed 4 months ago (0 comments)

How can i update to vLLM v0.4.1 for llama3 support ?
#66 Lhemamou closed 3 months ago (6 comments)

BadRequestError on runsync route, or what is the correct method to hit handler.py's locally run API?
#65 dpkirchner closed 4 months ago (1 comment)

Best way to record data
#64 aodhan-domhnaill closed 4 months ago (1 comment)

Cannot load Tokenizers for some Models.
#63 Mr-Nobody1 closed 3 months ago (2 comments)

vLLM 0.3.3 -> 0.4.0 -> 0.4.2
#62 alpayariyak closed 4 months ago (3 comments)

Serverless generator can not handle errors properly
#61 dendik closed 5 months ago (0 comments)

Multi-LoRA
#60 joaomsimoes opened 5 months ago (0 comments)