huggingface / text-generation-inference

Large Language Model Text Generation Inference
http://hf.co/docs/text-generation-inference
Apache License 2.0

Failing to unpickle the model #2464

Open ksajan opened 2 months ago

ksajan commented 2 months ago

System Info

cargo version: cargo 1.80.1 (376290515 2024-07-16). I haven't been able to run the Docker image to get more details. I am trying to run the container on CPU.

Reproduction

I ran the command

model=lmsys/vicuna-7b-v1.3
volume=$PWD/data

docker run --rm --privileged -e HF_HUB_ENABLE_HF_TRANSFER="false"\
    --ipc=host --shm-size 1g -p 8080:80 -v $volume:/data \
    ghcr.io/huggingface/text-generation-inference:latest-intel-cpu \
    --model-id $model --cuda-graphs 0

Expected behavior

I expected it to download the image and run the container so that I could train the model I was working with.

ErikKaum commented 2 months ago

Hi @ksajan!

Could you provide a bit more context on the error that you're seeing from this?

ksajan commented 2 months ago

@ErikKaum Give me some time. I deleted the folder and was testing other parts of it, so it will take a little while to run this again.

ksajan commented 2 months ago

@ErikKaum These are the entire logs:

2024-08-29T15:18:55.265433Z  INFO hf_hub: Token file not found "/root/.cache/huggingface/token"
2024-08-29T15:18:55.975886Z  INFO text_generation_launcher: Default `max_input_tokens` to 2047
2024-08-29T15:18:55.975910Z  INFO text_generation_launcher: Default `max_total_tokens` to 2048
2024-08-29T15:18:55.975913Z  INFO text_generation_launcher: Default `max_batch_prefill_tokens` to 2097
2024-08-29T15:18:55.976040Z  INFO download: text_generation_launcher: Starting check and download process for lmsys/vicuna-7b-v1.3
2024-08-29T15:19:06.113813Z  WARN text_generation_launcher: No safetensors weights found for model lmsys/vicuna-7b-v1.3 at revision None. Downloading PyTorch weights.
2024-08-29T15:19:06.352633Z  INFO text_generation_launcher: Download file: pytorch_model-00001-of-00002.bin

2024-08-29T15:46:52.583906Z  INFO text_generation_launcher: Downloaded /data/models--lmsys--vicuna-7b-v1.3/snapshots/236eeeab96f0dc2e463f2bebb7bb49809279c6d6/pytorch_model-00001-of-00002.bin in 0:27:46.
2024-08-29T15:46:52.585304Z  INFO text_generation_launcher: Download: [1/2] -- ETA: 0:27:46
2024-08-29T15:46:52.585314Z  INFO text_generation_launcher: Download file: pytorch_model-00002-of-00002.bin
2024-08-29T15:56:35.235042Z  INFO text_generation_launcher: Downloaded /data/models--lmsys--vicuna-7b-v1.3/snapshots/236eeeab96f0dc2e463f2bebb7bb49809279c6d6/pytorch_model-00002-of-00002.bin in 0:09:42.
2024-08-29T15:56:35.235107Z  INFO text_generation_launcher: Download: [2/2] -- ETA: 0
2024-08-29T15:56:35.239907Z  WARN text_generation_launcher: 🚨🚨BREAKING CHANGE in 2.0🚨🚨: Safetensors conversion is disabled without `--trust-remote-code` because Pickle files are unsafe and can essentially contain remote code execution!Please check for more information here: https://huggingface.co/docs/text-generation-inference/basic_tutorials/safety
2024-08-29T15:56:35.239965Z  WARN text_generation_launcher: No safetensors weights found for model lmsys/vicuna-7b-v1.3 at revision None. Converting PyTorch weights to safetensors.
2024-08-29T16:05:06.484589Z ERROR download: text_generation_launcher: Download process was signaled to shutdown with signal 9:
2024-08-29 15:19:02.076 | INFO     | text_generation_server.utils.import_utils:<module>:75 - Detected system ipex
/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/sgmv.py:18: UserWarning: Could not import SGMV kernel from Punica, falling back to loop.
  warnings.warn("Could not import SGMV kernel from Punica, falling back to loop.")
Error: DownloadError

ErikKaum commented 1 month ago

Thanks a lot for providing this info 👍

So one thing that's worth mentioning is that Vicuna isn't a supported model. We try to support other models by falling back to the transformers implementation, but it's not as smooth and there's a performance hit.

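However, the error here seems to be "Download process was signaled to shutdown with signal 9", and signal 9 is the Unix SIGKILL signal. So there's definitely something going wrong in getting the model weights. Note that both weight files finished downloading according to the log; the kill arrived after the conversion to safetensors had started.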
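One common cause of a SIGKILL during a memory-heavy step is the Linux out-of-memory killer, though nothing in the log above confirms that this is what happened here. The following is a minimal sketch for checking, assuming a Linux host where `dmesg` and `free` are available; the `signal_name` helper is a hypothetical name used only for illustration.

```shell
# Map a signal number to its name; signal 9 is KILL (SIGKILL).
signal_name() {
    kill -l "$1"
}
signal_name 9   # prints KILL

# Assumption: on a Linux host the kernel log records OOM kills.
# Reading it may require root; if it is unreadable, grep finds nothing.
dmesg 2>/dev/null | grep -i -E 'out of memory|oom-kill|killed process' \
    || echo "no OOM entries found (or kernel log not readable)"

# Show how much RAM and swap the host has, if `free` is installed.
if command -v free >/dev/null 2>&1; then
    free -h
fi
```

If the kernel log does show a killed Python process around the time of the error, the host most likely ran out of memory while converting the weights.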

Similarly, this log line from above

2024-08-29T15:56:35.239907Z  WARN text_generation_launcher: 🚨🚨BREAKING CHANGE in 2.0🚨🚨: Safetensors conversion is disabled without `--trust-remote-code` because Pickle files are unsafe and can essentially contain remote code execution!Please check for more information here: https://huggingface.co/docs/text-generation-inference/basic_tutorials/safety

indicates that you probably should pass the `--trust-remote-code` flag, since you're getting weights that aren't in a safe format.
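As a rough sketch, the command from the original report with the flag appended might look like the following. It reuses the image tag and options from the report unchanged; the `tgi_command` wrapper is a hypothetical helper that only prints the command so it can be reviewed first. As the warning says, pickle files can execute code, so this only makes sense for a model repository you trust.

```shell
model=lmsys/vicuna-7b-v1.3
volume=$PWD/data

# Hypothetical helper: prints the docker command instead of running it.
# Remove the leading `echo` to actually launch the container.
tgi_command() {
    echo docker run --rm --privileged -e HF_HUB_ENABLE_HF_TRANSFER=false \
        --ipc=host --shm-size 1g -p 8080:80 -v "$volume:/data" \
        ghcr.io/huggingface/text-generation-inference:latest-intel-cpu \
        --model-id "$model" --cuda-graphs 0 --trust-remote-code
}

tgi_command
```

The flag goes after the image name because it is an argument to the launcher inside the container, not to `docker run` itself.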

And, as in the other issue: without a GPU, running this model might be a bit tough.

Hopefully these help 👍