runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
MIT License

Cannot load Tokenizers for some Models. #63

Closed Mr-Nobody1 closed 2 months ago

Mr-Nobody1 commented 4 months ago

I am getting an error saying it cannot load the tokenizer for some models, such as the Yarn Mistral/Llama-2 models. Is there a reason why?

alpayariyak commented 3 months ago

Please try loading the model in vLLM directly first. If it works there but not with the worker, let us know.
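
One way to run that isolation check is to load only the tokenizer outside the worker and record the exact error. The sketch below is an editorial illustration, not code from the worker or from vLLM. The helper name `check_tokenizer` is hypothetical, and it assumes the tokenizer is loaded through Hugging Face `transformers` (`AutoTokenizer.from_pretrained`), which may not match every vLLM code path.

```python
from typing import Any, Callable, Optional, Tuple


def check_tokenizer(
    model_id: str,
    loader: Optional[Callable[..., Any]] = None,
    **kwargs: Any,
) -> Tuple[bool, str]:
    """Try to load a tokenizer and report success or the exact error.

    `check_tokenizer` is a hypothetical helper written for this example.
    `loader` defaults to transformers' AutoTokenizer.from_pretrained and
    can be swapped for any callable with the same calling convention.
    """
    if loader is None:
        # Imported lazily so the helper can be used with a custom loader
        # even when transformers is not installed.
        from transformers import AutoTokenizer

        loader = AutoTokenizer.from_pretrained

    try:
        tokenizer = loader(model_id, **kwargs)
    except Exception as exc:  # report any failure instead of crashing
        return False, f"{type(exc).__name__}: {exc}"
    return True, type(tokenizer).__name__
```

For example, `check_tokenizer("<model id>", trust_remote_code=True)` returns `(True, <tokenizer class name>)` on success, or `(False, <error text>)` with the message to include in a bug report. Whether `trust_remote_code=True` is relevant to the models in this issue is an assumption; the thread does not say what caused the failure.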

alpayariyak commented 2 months ago

Closing for now. Let me know if you run into this again.