bentoml / OpenLLM

Run any open-source LLM, such as Llama or Gemma, as an OpenAI-compatible API endpoint in the cloud.
https://bentoml.com
Apache License 2.0

bug: Cannot Download Model Files with Extension '.bin' #481

Closed by ryan-minato 1 year ago

ryan-minato commented 1 year ago

Describe the bug

When attempting to use OpenLLM to run a fine-tuned model, it fails to download any files with a '.bin' extension (usually model weights).

This issue results in missing files, causing errors when trying to use the fine-tuned weights.

The problem persists across different configurations: changing the backend, switching OS (Windows, WSL2, Linux), installing via pipx, pip, or conda, and using quantized or non-quantized models.

To reproduce

Run the following command:

openllm start llama --model-id ziqingyang/chinese-alpaca-2-7b --backend pt

Logs

File "user\.local\pipx\venvs\openllm\lib\site-packages\transformers\modeling_utils.py", line 487, in load_state_dict
    with open(checkpoint_file) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'user\\bentoml\\models\\pt-ziqingyang--chinese-alpaca-2-7b\\xxxxxxxxxx\\pytorch_model-00001-of-00002.bin'

[ERROR] [runner:llm-llama-runner:1] Application startup failed. Exiting.

Environment

openllm, 0.3.6 (compiled: no)
Python (CPython) 3.9.16

System information (Optional)

No response

aarnphm commented 1 year ago

For models that have legacy serialisation, you need to add --serialisation legacy
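Applied to the reproduction command above, the suggested flag would look like this (a sketch based on the maintainer's reply; the rest of the command is unchanged from the issue):

```shell
# Tell OpenLLM to load the legacy .bin (pickle-based) weight files
# instead of expecting safetensors serialisation.
openllm start llama --model-id ziqingyang/chinese-alpaca-2-7b --backend pt --serialisation legacy
```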

ryan-minato commented 1 year ago

> For models that have legacy serialisation, you need to add --serialisation legacy

Thank you for your response. It worked.