runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
MIT License

MODEL_REVISION not read #53

Closed by Sapessii 6 months ago

Sapessii commented 6 months ago

Hi,

I'm running runpod/worker-vllm:0.3.0-cuda12.1.0 in serverless with the env variable MODEL_REVISION set, but the variable is not read and the worker keeps downloading the main branch.

Any suggestions? Thank you.

Sapessii commented 6 months ago

Solved: the env variable is MODEL_NAME_REVISION, not MODEL_REVISION.

alpayariyak commented 6 months ago

Just fixed this to use MODEL_REVISION and TOKENIZER_REVISION, thanks for pointing it out!
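
For anyone who hits the same mismatch on an older image, here is a minimal sketch of reading the revision from either variable name. It is not the worker's actual code: the helper `resolve_revision` is hypothetical, and the fallback order and the "main" default are assumptions based on the behavior described in this thread.

```python
import os


def resolve_revision(env=None,
                     primary="MODEL_REVISION",
                     legacy="MODEL_NAME_REVISION",
                     default="main"):
    """Return the model revision to download.

    Hypothetical helper, not part of worker-vllm. It checks the
    current variable name first, then the older name, and falls
    back to `default` when neither is set or both are empty.
    """
    # Use the real process environment unless a mapping is passed in
    # (passing a dict makes the helper easy to test).
    env = os.environ if env is None else env
    for name in (primary, legacy):
        value = env.get(name)
        if value:  # skip unset and empty-string values
            return value
    return default
```

Calling `resolve_revision()` with no arguments reads from `os.environ`; passing a plain dict lets you check the lookup order without touching the real environment.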