ArnaudHureaux closed this issue 11 months ago.
Try disabling it? It should still download the model, just a bit slower.
hf_transfer
is really barebones, and any flaky network might trigger issues for you (or, because you're sometimes using many more resources, other parts of the infra, not necessarily yours, might start to lag and cause a flaky network overall). Using the raw Python download is definitely more recommended for stable downloading.
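For reference, the "raw Python download" could be sketched like this with huggingface_hub (an illustration, not a confirmed snippet from this thread; the repo id is just the model mentioned below, and note the env var has to be set before the download starts):

```python
import os

# Disable hf_transfer before huggingface_hub starts downloading, so the
# plain (slower but more robust) Python downloader is used instead.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "0"

from huggingface_hub import snapshot_download

# Download the whole repo into the local HF cache and print its path.
local_path = snapshot_download(repo_id="TheBloke/Llama-2-13B-chat-GGML")
print(local_path)
```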
This works for me:
docker run --shm-size 1g --net=host -p 8080:80 -v $PWD/data:/data -e HUGGING_FACE_HUB_TOKEN=$token -e HF_HUB_ENABLE_HF_TRANSFER=0 ghcr.io/huggingface/text-generation-inference:latest --model-id TheBloke/Llama-2-13B-chat-GGML --quantize bitsandbytes
How can I avoid this error? I am using AWS SageMaker.
Isn't there a way for you to provide environment variables?
HF_HUB_ENABLE_HF_TRANSFER=0
Is what you are looking for.
Hello Narsil,
I have a similar problem to majidbhatti. I am on a SageMaker notebook and I get the same error. I tried your solution by setting os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "0", but it did not change anything; I still get the exact same error: "RuntimeError: An error occurred while downloading using hf_transfer. Consider disabling HF_HUB_ENABLE_HF_TRANSFER for better error handling."
Do you have any suggestions?
I avoided this error using
hub = { 'HF_HUB_ENABLE_HF_TRANSFER': '0' }
huggingface_model = HuggingFaceModel( image_uri=get_huggingface_llm_image_uri("huggingface",version="0.8.2"), env=hub )
Thank you so much!! It worked for me.
Thanks for sharing the solution!
Closing this then.
You can also add HF_HUB_ENABLE_HF_TRANSFER=0 to the docker command:
docker run --shm-size 1g --env HF_HUB_ENABLE_HF_TRANSFER=0 .......
System Info
I'm on an Ubuntu server from https://console.paperspace.com/ with 2 A100 GPUs, but when I run the model CalderaAI/30B-Lazarus I cannot use HF Transfer, even with "--net=host" (a solution that works with other models, found in another issue).
Full description of my GPUs:

```
paperspace@pse55xf0v:~/text-generation-inference$ nvidia-smi
Thu Jun 15 10:00:25 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.105.01   Driver Version: 515.105.01   CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  Off  | 00000000:00:05.0 Off |                    0 |
| N/A   29C    P0    53W / 400W |    184MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA A100-SXM...  Off  | 00000000:00:06.0 Off |                    0 |
| N/A   26C    P0    53W / 400W |      4MiB / 81920MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1497      G   /usr/lib/xorg/Xorg                 74MiB |
|    0   N/A  N/A      2389      G   /usr/bin/gnome-shell              102MiB |
```
I run:
My error:
Information
Tasks
Reproduction
Run:
On a Paperspace server on Ubuntu with 2 A100 GPUs.
Expected behavior
To download the Lazarus model with HF Transfer.