huggingface / optimum

🚀 Accelerate training and inference of 🤗 Transformers and 🤗 Diffusers with easy to use hardware optimization tools
https://huggingface.co/docs/optimum/main/
Apache License 2.0

onnx exporter does NOT use cache_dir with task=auto #1104

Closed escorciav closed 1 year ago

escorciav commented 1 year ago

System Info

Python 3.10.11
optimum==1.8.6
Ubuntu x86_64

Who can help?

@michaelbenayoun

Information

Tasks

Reproduction

  1. Disable or block any internet connection (e.g., disconnect Wi-Fi, or block outbound connections via Docker)
  2. optimum-cli export onnx --model t5-base checkpoints/t5-base_onnx/ --framework pt --optimize O3 --batch_size 1 --sequence_length 512 --atol 0.0001 --cache_dir ~/.cache/huggingface/hub

With task='auto' (the default), the code gets stuck here, sending a request to the HF servers to infer the task from the online model catalog.
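A minimal sketch of where that online dependency sits (assuming only that optimum and huggingface_hub are installed; the call chain matches the traceback further down in this thread): the task auto-detection goes through TasksManager and queries the Hub API, so it cannot succeed offline even when the model is fully cached.

from optimum.exporters.tasks import TasksManager

# task="auto" resolves through infer_task_from_model, which calls
# huggingface_hub.model_info, i.e. a request to
# https://huggingface.co/api/models/<model>, regardless of the local cache.
task = TasksManager.infer_task_from_model("t5-base")
print(task)  # a task name (e.g. a text2text-generation variant) when online; raises when offline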

IMHO, a patch is needed.

Feel free to close the issue if it is out of scope. More details are provided here

Expected behavior

The script should run without checking the record on the HF server, especially since HF blocks IPs.

fxmarty commented 1 year ago

Hi, this should be fixed in https://github.com/huggingface/optimum/pull/1109, which adds better error messages. You indeed need to pass --task when exporting offline.
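As a hedged side note (the helper below and the "t5" model_type are not mentioned in this thread; get_supported_tasks_for_model_type is part of optimum's TasksManager and reads a local registry, so it works offline): it can be used to see which task names are valid for --task with a given architecture.

from optimum.exporters.tasks import TasksManager

# List the ONNX export tasks registered for the "t5" architecture (the
# model_type of t5-base); no network access is needed for this lookup.
supported = TasksManager.get_supported_tasks_for_model_type("t5", exporter="onnx")
print(list(supported.keys()))  # pick one of these to pass as --task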

escorciav commented 1 year ago

Thanks for the fix. Not passing --task still produces an unclear error message, IMHO.

I used a fresh conda environment with Python 3.11 and pip:

$ pip install onnx
$ pip install git+https://github.com/huggingface/optimum.git@main
$ optimum-cli export onnx --model t5-base favor-hf-delete-me --framework pt --optimize O3 --batch_size 1 --sequence_length 512 --atol 0.0001 --cache_dir ~/.cache/huggingface/hub
Traceback (most recent call last):
  File "/apps/install/bin/miniconda3/envs/hf-optimum/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 259, in hf_raise_for_status
    response.raise_for_status()
  File "/apps/install/bin/miniconda3/envs/hf-optimum/lib/python3.11/site-packages/requests/models.py", line 1021, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://huggingface.co/api/models/t5-base

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/apps/install/bin/miniconda3/envs/hf-optimum/bin/optimum-cli", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/apps/install/bin/miniconda3/envs/hf-optimum/lib/python3.11/site-packages/optimum/commands/optimum_cli.py", line 163, in main
    service.run()
  File "/apps/install/bin/miniconda3/envs/hf-optimum/lib/python3.11/site-packages/optimum/commands/export/onnx.py", line 219, in run
    main_export(
  File "/apps/install/bin/miniconda3/envs/hf-optimum/lib/python3.11/site-packages/optimum/exporters/onnx/__main__.py", line 173, in main_export
    task = TasksManager.infer_task_from_model(model_name_or_path)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/install/bin/miniconda3/envs/hf-optimum/lib/python3.11/site-packages/optimum/exporters/tasks.py", line 1363, in infer_task_from_model
    task = cls._infer_task_from_model_name_or_path(model, subfolder=subfolder, revision=revision)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/install/bin/miniconda3/envs/hf-optimum/lib/python3.11/site-packages/optimum/exporters/tasks.py", line 1300, in _infer_task_from_model_name_or_path
    model_info = huggingface_hub.model_info(model_name_or_path, revision=revision)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/install/bin/miniconda3/envs/hf-optimum/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/apps/install/bin/miniconda3/envs/hf-optimum/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 1676, in model_info
    hf_raise_for_status(r)
  File "/apps/install/bin/miniconda3/envs/hf-optimum/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 301, in hf_raise_for_status
    raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 429 Client Error: Too Many Requests for url: https://huggingface.co/api/models/t5-base

fxmarty commented 1 year ago

Hi, maybe try pip uninstall optimum && pip install git+https://github.com/huggingface/optimum.git@main.

Could you tell me how to make it clearer? Currently, offline, from

optimum-cli export onnx --model fxmarty/tiny-llama-fast-tokenizer llama_onnx

we get

ConnectionError: The task could not be automatically inferred as this is available only for models hosted on the Hugging Face Hub. Please provide the 
argument --task with the relevant task from conversational, feature-extraction, fill-mask, text-generation, text2text-generation, text-classification, 
token-classification, multiple-choice, object-detection, question-answering, image-classification, image-segmentation, mask-generation, masked-im, 
semantic-segmentation, automatic-speech-recognition, audio-classification, audio-frame-classification, audio-xvector, image-to-text, stable-diffusion, 
zero-shot-image-classification, zero-shot-object-detection. Detailed error: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with 
url: /api/models/fxmarty/tiny-llama-fast-tokenizer (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f01166d82b0>: Failed to 
establish a new connection: [Errno -3] Temporary failure in name resolution'))

Specifying the task,

optimum-cli export onnx --model fxmarty/tiny-llama-fast-tokenizer llama_onnx --task text-generation-with-past

the export runs smoothly (i.e., the cache dir is used, the default being ~/.cache/huggingface/hub).
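For completeness, a sketch of the programmatic equivalent (main_export is the function the CLI dispatches to, as the traceback above shows; the output/task keyword names follow its documented signature): with the task given explicitly, no Hub lookup is attempted and the default cache under ~/.cache/huggingface/hub is used.

from optimum.exporters.onnx import main_export

# Same export as the CLI call above, with an explicit task so that no online
# task inference is attempted.
main_export(
    model_name_or_path="fxmarty/tiny-llama-fast-tokenizer",
    output="llama_onnx",
    task="text-generation-with-past",
)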

And if I pass a wrong cache dir,

optimum-cli export onnx --model fxmarty/tiny-llama-fast-tokenizer llama_onnx --task text-generation-with-past --cache_dir dummydir

I rightfully get an error:

OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like 
fxmarty/tiny-llama-fast-tokenizer is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
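A small diagnostic sketch that may help here (try_to_load_from_cache is a huggingface_hub utility, not something used by the exporter in this thread): it checks, without any network request, whether config.json for the model is already present under a given cache directory.

from huggingface_hub import try_to_load_from_cache

# Returns the cached file path when config.json is present in the given cache
# directory, and a "not found" result otherwise; no network request is made.
cached = try_to_load_from_cache(
    "fxmarty/tiny-llama-fast-tokenizer", "config.json", cache_dir="dummydir"
)
print(cached)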