huggingface / text-generation-inference

Large Language Model Text Generation Inference
http://hf.co/docs/text-generation-inference
Apache License 2.0

HF_TRANSFER is not working for the model CalderaAI/30B-Lazarus #461

Closed ArnaudHureaux closed 11 months ago

ArnaudHureaux commented 1 year ago

System Info

I'm on an Ubuntu server from https://console.paperspace.com/ with two A100 GPUs, but when I run the model CalderaAI/30B-Lazarus I cannot use HF Transfer, even with "--net=host" (a solution that works with other models, found in another issue).

Full description of my GPUs:

paperspace@pse55xf0v:~/text-generation-inference$ nvidia-smi
Thu Jun 15 10:00:25 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.105.01   Driver Version: 515.105.01   CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  Off  | 00000000:00:05.0 Off |                    0 |
| N/A   29C    P0    53W / 400W |    184MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM...  Off  | 00000000:00:06.0 Off |                    0 |
| N/A   26C    P0    53W / 400W |      4MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1497      G   /usr/lib/xorg/Xorg                 74MiB |
|    0   N/A  N/A      2389      G   /usr/bin/gnome-shell              102MiB |
+-----------------------------------------------------------------------------+

I run:

sudo docker run --net=host --gpus all --shm-size 1g -p 8080:80 -v $PWD/data:/data -e HF_HUB_ENABLE_HF_TRANSFER=1 ghcr.io/huggingface/text-generation-inference:0.8 --model-id CalderaAI/30B-Lazarus --num-shard 1 --env --disable-custom-kernels

My error:

HF_TRANSFER for Lazarus model

2023-06-15T10:56:36.180341Z ERROR download: text_generation_launcher: An error occurred while downloading using `hf_transfer`. Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling.

2023-06-15T10:56:36.180378Z  INFO download: text_generation_launcher: Retry 4/4 

2023-06-15T10:56:36.180458Z  INFO download: text_generation_launcher: Download file: pytorch_model-00003-of-00007.bin

2023-06-15T10:57:00.401367Z ERROR text_generation_launcher: Download encountered an error: Traceback (most recent call last):

  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 486, in http_get

    download(url, temp_file.name, max_files, chunk_size, headers=headers)       

Exception: Error while downloading: Os { code: 28, kind: StorageFull, message: "No space left on device" }

The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File "/opt/conda/bin/text-generation-server", line 8, in <module>

    sys.exit(app())

  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/cli.py", line 137, in download_weights

    local_pt_files = utils.download_weights(pt_filenames, model_id, revision)   

  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 167, in download_weights

    file = download_file(filename)

  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 159, in download_file

    raise e

  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/utils/hub.py", line 147, in download_file

    local_file = hf_hub_download(

  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/utils/_validators.py", line 120, in _inner_fn

    return fn(*args, **kwargs)

  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 1347, in hf_hub_download

    http_get(

  File "/opt/conda/lib/python3.9/site-packages/huggingface_hub/file_download.py", line 495, in http_get

    raise RuntimeError(

RuntimeError: An error occurred while downloading using `hf_transfer`. Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling.

Error: DownloadError


Reproduction

Run:

sudo docker run --net=host --gpus all --shm-size 1g -p 8080:80 -v $PWD/data:/data -e HF_HUB_ENABLE_HF_TRANSFER=1 ghcr.io/huggingface/text-generation-inference:0.8 --model-id CalderaAI/30B-Lazarus --num-shard 1 --env --disable-custom-kernels

On a Paperspace Ubuntu server with two A100 GPUs.

Expected behavior

The CalderaAI/30B-Lazarus model downloads successfully with HF Transfer enabled.

Narsil commented 1 year ago

Try disabling it? It should still download the model, just a bit slower.

hf_transfer is really barebones, and any flaky network might trigger issues for you (or, because you are using much more bandwidth, other parts of the infrastructure, not necessarily yours, might start to lag, making the network flaky overall). The raw Python download is definitely recommended for stable downloading.
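If you want to sanity-check the plain-Python path outside the launcher, here is a minimal sketch (the repo id and shard name are taken from the logs above; the flag has to be set before huggingface_hub is imported, since it is read at import time). Run this way, the underlying failure is reported directly; note that the log above already shows a root cause: Os { code: 28, kind: StorageFull }, i.e. the download ran out of disk space.

import os

# Disable hf_transfer before huggingface_hub is imported; the flag is read at import time
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "0"

from huggingface_hub import hf_hub_download

# Shard name taken from the error log above
local_path = hf_hub_download(
    repo_id="CalderaAI/30B-Lazarus",
    filename="pytorch_model-00003-of-00007.bin",
)
print(local_path)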

ksingh-scogo commented 12 months ago

This works for me

docker run --shm-size 1g --net=host -p 8080:80 -v $PWD/data:/data -e HUGGING_FACE_HUB_TOKEN=$token -e HF_HUB_ENABLE_HF_TRANSFER=0 ghcr.io/huggingface/text-generation-inference:latest  --model-id TheBloke/Llama-2-13B-chat-GGML --quantize bitsandbytes

majidbhatti commented 12 months ago

How can I avoid this error? I am using AWS SageMaker.

Narsil commented 11 months ago

Isn't there a way for you to provide environment variables?

HF_HUB_ENABLE_HF_TRANSFER=0

Is what you are looking for.

anastasia-enot commented 11 months ago

Isn't there a way for you to provide environment variables?

HF_HUB_ENABLE_HF_TRANSFER=0

Is what you are looking for.

Hello Narsil, I have a similar problem to majidbhatti's. I am on a SageMaker notebook and I get the same error. I tried your solution by setting os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "0", but it did not change anything; I still get the exact same error: "RuntimeError: An error occurred while downloading using hf_transfer. Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling." Do you have any suggestions?

majidbhatti commented 11 months ago

I avoided this error using:

from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

# environment variable values should be strings, hence '0' rather than 0
hub = {'HF_HUB_ENABLE_HF_TRANSFER': '0'}

huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"),
    env=hub,
)
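This works where os.environ does not, because env is injected into the endpoint container that actually runs the download; setting the variable in the notebook kernel never reaches that container. A fuller sketch of the deployment, where the execution role, HF_MODEL_ID variable, and instance type are placeholders I added, not values from this thread:

import sagemaker
from sagemaker.huggingface import HuggingFaceModel, get_huggingface_llm_image_uri

role = sagemaker.get_execution_role()  # assumes you run inside a SageMaker notebook

# env is passed through to the endpoint container; values must be strings
hub = {
    "HF_HUB_ENABLE_HF_TRANSFER": "0",
    "HF_MODEL_ID": "CalderaAI/30B-Lazarus",  # model id from this thread
}

huggingface_model = HuggingFaceModel(
    image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"),
    env=hub,
    role=role,
)

# instance type is a placeholder; pick one with enough GPU memory for your model
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.12xlarge",
)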

anastasia-enot commented 11 months ago

I avoided this error using hub = { 'HF_HUB_ENABLE_HF_TRANSFER': '0' }

huggingface_model = HuggingFaceModel( image_uri=get_huggingface_llm_image_uri("huggingface", version="0.8.2"), env=hub )

Thank you so much!! It worked for me.

Narsil commented 11 months ago

Thanks for sharing the solution!

Closing this, then.

chintanckg commented 11 months ago

You can also add HF_HUB_ENABLE_HF_TRANSFER=0 to the docker command:

docker run --shm-size 1g --env HF_HUB_ENABLE_HF_TRANSFER=0 .......