runpod-workers / worker-vllm
The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
MIT License · 195 stars · 65 forks
issues
| # | Title | Author | State | Age | Comments |
|---|-------|--------|-------|-----|----------|
| #84 | A new version of VLLM has been released | d4rk6un | opened | 1 week ago | 1 |
| #83 | Issue: Update VLLM to Version .5.0++, and a few suggestions | nerdylive123 | opened | 2 weeks ago | 4 |
| #82 | Allow any vLLM engine args as env vars, refactor | alpayariyak | opened | 2 weeks ago | 1 |
| #81 | Gemma-2 is not available in this docker image. | codingchild2424 | opened | 2 weeks ago | 0 |
| #80 | Update to vllm 0.5 | Sapessii | opened | 2 weeks ago | 0 |
| #79 | Using mistral 0.3 | Sapessii | opened | 2 weeks ago | 0 |
| #78 | OOM on second request | Permafacture | closed | 3 weeks ago | 2 |
| #77 | Add RoPE config to support that | nerdylive123 | closed | 2 weeks ago | 1 |
| #76 | Slow streaming | motorbike158 | closed | 1 month ago | 1 |
| #75 | Incorrect path_or_model_id | Sapessii | closed | 1 month ago | 13 |
| #74 | Only generates 16 tokens | lawrenceztang | closed | 1 month ago | 1 |
| #73 | ImportError prepare_hf_model_weights method | ArtyoMKos | closed | 1 month ago | 2 |
| #72 | Got some deprecation notice, might update these | nerdylive123 | closed | 1 month ago | 1 |
| #71 | Building Docker with model built in | KDercksen | closed | 1 month ago | 6 |
| #70 | GGUF compatibility | adam-clarey | closed | 2 months ago | 2 |
| #69 | Update documentation to note support for extra parameters | bryankruman | opened | 2 months ago | 1 |
| #68 | Runpod serverless vLLM with Llama 3 70B on 40GB GPU | EdwardTheLegend | closed | 1 month ago | 8 |
| #67 | Fixed MODEL_REVISION environment variable | mikljohansson | closed | 2 months ago | 0 |
| #66 | How can i update to vLLM v0.4.1 for llama3 support ? | Lhemamou | closed | 1 month ago | 6 |
| #65 | BadRequestError on runsync route, or what is the correct method to hit handler.py's locally run API? | dpkirchner | closed | 2 months ago | 1 |
| #64 | Best way to record data | aodhan-domhnaill | closed | 2 months ago | 1 |
| #63 | Cannot load Tokenizers for some Models. | Mr-Nobody1 | closed | 1 month ago | 2 |
| #62 | vLLM 0.3.3 -> 0.4.0 -> 0.4.2 | alpayariyak | closed | 2 months ago | 3 |
| #61 | Serverless generator can not handle errors properly | dendik | closed | 3 months ago | 0 |
| #60 | Multi-LoRA | joaomsimoes | opened | 3 months ago | 0 |
| #59 | chore: update `REVISION` to `MODEL_REVISION` in dockerfile | joennlae | opened | 3 months ago | 0 |
| #58 | OpenAI Error: Not returning full output | Mr-Nobody1 | closed | 4 months ago | 1 |
| #57 | OpenAI API: API errors have wrong HTTP code | lucasavila00 | opened | 4 months ago | 3 |
| #56 | 0.3.2 | alpayariyak | closed | 4 months ago | 0 |
| #55 | weird output when using a custom model and ChatAPI does not work | Mr-Nobody1 | closed | 4 months ago | 7 |
| #54 | Support for GPT3 based models | letajmal | closed | 4 months ago | 1 |
| #53 | MODEL_REVISION not read | Sapessii | closed | 4 months ago | 2 |
| #52 | [WIP] Testing Suite | alpayariyak | opened | 4 months ago | 0 |
| #51 | Do the new images work? | dannysemi | closed | 4 months ago | 13 |
| #50 | feat: auto build cuda version | justinmerrell | closed | 4 months ago | 0 |
| #49 | Cannot run Mixtral 8x7B Instruct AWQ | ddemillard | closed | 4 months ago | 6 |
| #48 | Fixes import statement | rachfop | closed | 4 months ago | 0 |
| #47 | v0.3.0: OpenAI Compatibility, Dynamic Stream Batching, Refactor, Error Responses, more | alpayariyak | closed | 4 months ago | 0 |
| #46 | Huggingface is down and my worker is looping | dannysemi | closed | 4 months ago | 1 |
| #45 | fix: build error if no `TOKENIZER_NAME` provided | willsamu | closed | 5 months ago | 1 |
| #44 | Update runpod package version | github-actions[bot] | closed | 4 months ago | 0 |
| #43 | trust_remote_code not recognized | dannysemi | closed | 5 months ago | 1 |
| #42 | Error after tokenizer commit | StableFluffy | closed | 5 months ago | 1 |
| #41 | Support for mistralai/Mixtral-8x7B-Instruct-v0.1 | ilkersigirci | closed | 5 months ago | 1 |
| #40 | enforce_eager flag | dannysemi | closed | 5 months ago | 5 |
| #39 | Download additional files for build | casper-hansen | closed | 5 months ago | 0 |
| #38 | Update runpod package version | github-actions[bot] | closed | 5 months ago | 0 |
| #37 | Docker image is taking too much time to build | hiennef | closed | 5 months ago | 7 |
| #36 | `MAX_CONCURRENCY` parameter doesn't work | antonioglass | closed | 6 months ago | 1 |
| #35 | feat: add `max_model_length` setup key | willsamu | closed | 6 months ago | 2 |