runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.
MIT License

fix: build error if no `TOKENIZER_NAME` provided #45

Closed: willsamu closed this 6 months ago

willsamu commented 6 months ago

Building an image with a model baked in fails with the following error, after all the weights have been downloaded, if no `TOKENIZER_NAME` argument is provided:

105.2 Traceback (most recent call last):
105.2   File "/download_model.py", line 55, in <module>
105.2     tokenizer_folder = download_extras_or_tokenizer(tokenizer, download_dir, revisions["tokenizer"])
105.2   File "/download_model.py", line 11, in download_extras_or_tokenizer
105.2     folder = snapshot_download(
105.2   File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 110, in _inner_fn
105.2     validate_repo_id(arg_value)
105.2   File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 164, in validate_repo_id
105.2     raise HFValidationError(
105.2 huggingface_hub.utils._validators.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: ''.

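This happens because `TOKENIZER_NAME` defaults to an empty string in the Dockerfile, which in turn sets the tokenizer value in line 30 to an empty string as well. That empty string is then passed to `snapshot_download` as the repo id, and `huggingface_hub` rejects it.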
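
For illustration, here is a minimal sketch of the kind of guard that avoids the error. It is not the actual diff of this PR. The helper name `resolve_tokenizer_name` and its parameters are hypothetical, and it assumes the desired behavior is to fall back to the model repo id whenever the tokenizer name is unset or blank.

```python
from typing import Optional


def resolve_tokenizer_name(model_name: str, tokenizer_name: Optional[str]) -> str:
    """Pick the repo id to download the tokenizer from.

    Hypothetical helper (not taken from download_model.py): treats None,
    an empty string, and a whitespace-only string as "not provided" and
    falls back to the model repo id in those cases.
    """
    if tokenizer_name is None or not tokenizer_name.strip():
        # An empty build arg arrives as "", which is not a valid repo id,
        # so reuse the model repo id instead.
        return model_name
    return tokenizer_name.strip()
```

With a guard like this, an empty `TOKENIZER_NAME` build argument resolves to the model name before the value is passed on as a repo id, so the validation in `huggingface_hub` never sees an empty string.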

alpayariyak commented 6 months ago

Solves #42