huggingface / huggingface_hub

The official Python client for the Huggingface Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

`huggingface-cli` fails to download model because of missing `.incomplete` file #2374

Closed. mlinke-ai closed this issue 4 months ago

mlinke-ai commented 4 months ago

Describe the bug

huggingface-cli fails to download the microsoft/Phi-3-mini-4k-instruct-onnx model because the .incomplete file for the .onnx data file is missing.

I assume the file should be created during download to track progress and later resume aborted downloads.
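To make that assumption concrete, here is a minimal sketch of a resumable download that tracks progress in an `.incomplete` file. It is not the huggingface_hub implementation: `download_with_resume` and the `chunks_from(offset)` callback (a stand-in for an HTTP range request) are hypothetical names used only for illustration.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def download_with_resume(
    chunks_from: Callable[[int], Iterable[bytes]], destination: Path
) -> Path:
    """Write data to '<destination>.incomplete', then move it into place.

    chunks_from(offset) is a hypothetical callback that yields the remaining
    bytes of the file, starting at byte `offset`.
    """
    incomplete_path = destination.with_name(destination.name + ".incomplete")
    incomplete_path.parent.mkdir(parents=True, exist_ok=True)

    # Append mode creates the file if it is missing and keeps partial data.
    with incomplete_path.open("ab") as f:
        f.seek(0, os.SEEK_END)
        resume_from = f.tell()  # bytes already on disk from an earlier attempt
        for chunk in chunks_from(resume_from):
            f.write(chunk)

    # Only a fully written file ever appears under the final name.
    incomplete_path.replace(destination)
    return destination


# Usage: simulate an aborted download that left 4 bytes behind.
PAYLOAD = b"0123456789"

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "model.onnx"
    (Path(tmp) / "model.onnx.incomplete").write_bytes(PAYLOAD[:4])
    download_with_resume(lambda offset: [PAYLOAD[offset:]], target)
    resumed_content = target.read_bytes()
    leftover_exists = (Path(tmp) / "model.onnx.incomplete").exists()
```

Writing to a temporary name and renaming at the end means an interrupted run never leaves a truncated file under the final name.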

Reproduction

Run the following command:

huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --include "cpu_and_mobile/cpu-int4-rtn-block-32/*" --local-dir models/phi-3-mini-4k-instruct-onnx

Logs

Traceback (most recent call last):
  File "C:\Program Files\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Program Files\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\Scripts\huggingface-cli.exe\__main__.py", line 7, in <module>
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\huggingface_hub\commands\huggingface_cli.py", line 51, in main
    service.run()
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\huggingface_hub\commands\download.py", line 146, in run
    print(self._download())  # Print path to downloaded files
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\huggingface_hub\commands\download.py", line 180, in _download
    return snapshot_download(
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\huggingface_hub\_snapshot_download.py", line 294, in snapshot_download
    thread_map(
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\tqdm\contrib\concurrent.py", line 69, in thread_map
    return _executor_map(ThreadPoolExecutor, fn, *iterables, **tqdm_kwargs)
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\tqdm\contrib\concurrent.py", line 51, in _executor_map
    return list(tqdm_class(ex.map(fn, *iterables, chunksize=chunksize), **kwargs))
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\tqdm\std.py", line 1181, in __iter__
    for obj in iterable:
  File "C:\Program Files\Python310\lib\concurrent\futures\_base.py", line 621, in result_iterator
    yield _result_or_cancel(fs.pop())
  File "C:\Program Files\Python310\lib\concurrent\futures\_base.py", line 319, in _result_or_cancel
    return fut.result(timeout)
  File "C:\Program Files\Python310\lib\concurrent\futures\_base.py", line 451, in result
    return self.__get_result()
  File "C:\Program Files\Python310\lib\concurrent\futures\_base.py", line 403, in __get_result
    raise self._exception
  File "C:\Program Files\Python310\lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\huggingface_hub\_snapshot_download.py", line 268, in _inner_hf_hub_download
    return hf_hub_download(
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\huggingface_hub\file_download.py", line 1202, in hf_hub_download
    return _hf_hub_download_to_local_dir(
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\huggingface_hub\file_download.py", line 1487, in _hf_hub_download_to_local_dir
    _download_to_tmp_and_move(
  File "C:\Users\mlinke\AppData\Roaming\Python\Python310\site-packages\huggingface_hub\file_download.py", line 1872, in _download_to_tmp_and_move
    with incomplete_path.open("ab") as f:
  File "C:\Program Files\Python310\lib\pathlib.py", line 1119, in open
    return self._accessor.open(self, mode, buffering, encoding, errors,
FileNotFoundError: [Errno 2] No such file or directory: 'models\\phi-3-mini-4k-instruct-onnx\\.huggingface\\download\\cpu_and_mobile\\cpu-int4-rtn-block-32\\phi3-mini-4k-instruct-cpu-int4-rtn-block-32.onnx.1e1faf7ea6930f63caab12412f4a82c329eaddf6cce365e45c3cd00bb0547be8.incomplete'

System info

- huggingface_hub version: 0.23.4
- Platform: Windows-10-10.0.22631-SP0
- Python version: 3.10.8
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: C:\Users\mlinke\.cache\huggingface\token
- Has saved token ?: False
- Configured git credential helpers: manager
- FastAI: N/A
- Tensorflow: N/A
- Torch: 2.3.1
- Jinja2: 3.1.4
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: 10.4.0
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: 1.26.4
- pydantic: 2.8.2
- aiohttp: 3.9.5
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: C:\Users\mlinke\.cache\huggingface\hub
- HF_ASSETS_CACHE: C:\Users\mlinke\.cache\huggingface\assets
- HF_TOKEN_PATH: C:\Users\mlinke\.cache\huggingface\token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10

Wauplin commented 4 months ago

Hi @mlinke-ai, sorry you're facing this issue. Could you check that the folder models\\phi-3-mini-4k-instruct-onnx\\.huggingface\\download\\cpu_and_mobile\\cpu-int4-rtn-block-32\\ exists on your disk? If it does, I'm very surprised that incomplete_path cannot be opened, given that opening it in "ab" mode should create the file if it is missing. If the folder does not exist, I would also be surprised, but for a different reason: the code shouldn't be able to reach this point without creating the folder.
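The two cases described above can be reproduced in isolation. This is a small sketch using only the standard library; `try_append_open` is a hypothetical helper name, and the temporary directory stands in for the real download folder.

```python
import tempfile
from pathlib import Path


def try_append_open(path: Path) -> str:
    """Open `path` in binary append mode and report what happened."""
    try:
        with path.open("ab"):
            pass
    except FileNotFoundError:
        return "FileNotFoundError"
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Parent folder exists, file does not: "ab" creates the file.
    missing_file_result = try_append_open(root / "new.incomplete")
    # Parent folder is missing: "ab" does not create folders, so opening fails.
    missing_dir_result = try_append_open(root / "no_such_dir" / "new.incomplete")
```

If the folder exists and the open call still raises `FileNotFoundError`, the cause has to be something other than a missing file or folder.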

mlinke-ai commented 4 months ago

Yes, the folder exists. It also contains all the .metadata files of the configs and tokenizer.

Wauplin commented 4 months ago

@mlinke-ai what happens if you run this snippet:

from pathlib import Path
path = Path("models\\phi-3-mini-4k-instruct-onnx\\.huggingface\\download\\cpu_and_mobile\\cpu-int4-rtn-block-32\\phi3-mini-4k-instruct-cpu-int4-rtn-block-32.onnx.1e1faf7ea6930f63caab12412f4a82c329eaddf6cce365e45c3cd00bb0547be8.incomplete")

with path.open("ab") as f:
    pass

Do you also get the error?

mlinke-ai commented 4 months ago

Yes, the problem still occurs.

I experimented with different paths and file names, and the problem seems to be the total path length in characters. Purely for testing, I shortened the hex section of the file name from 64 to 58 characters, and it worked. With 59 or more characters in this section, the error comes back.
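One way to test the path-length hypothesis is to compare the length of the absolute path against the classic Windows limit of 260 characters (MAX_PATH, which includes the terminating NUL). The sketch below assumes that limit applies. `exceeds_max_path` and the base directory in the usage lines are hypothetical and are not taken from the report.

```python
import ntpath

# Classic Windows path limit (MAX_PATH); the count includes the terminating NUL.
WINDOWS_MAX_PATH = 260


def exceeds_max_path(
    base_dir: str, relative_path: str, limit: int = WINDOWS_MAX_PATH
) -> bool:
    """Return True if base_dir joined with relative_path is too long.

    ntpath is used so that the Windows joining rules apply on any platform.
    """
    full_path = ntpath.join(base_dir, relative_path)
    return len(full_path) + 1 > limit  # +1 for the terminating NUL


# Usage with a hypothetical 46-character working directory.
base = "C:\\" + "a" * 43
fits = not exceeds_max_path(base, "b" * 212)
too_long = exceeds_max_path(base, "b" * 213)
```

A sharp threshold of one character, as observed above, is what a fixed length limit would produce.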

Wauplin commented 4 months ago

Interesting, it looks like you are hitting the Windows maximum path length limit (MAX_PATH, 260 characters; see the Windows docs). I just remembered we've seen this issue before. It can be fixed by prepending "\\?\" to the absolute path. Would you like to open a PR to fix this for the _hf_hub_download_to_local_dir path?
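The workaround can be sketched as a small helper that adds the extended-length prefix to an absolute Windows path. This is only an illustration and not the code merged into huggingface_hub: `to_extended_length_path` is a hypothetical name, and the helper expects an absolute path because the prefix disables the usual path normalization.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"  # the four characters \\?\


def to_extended_length_path(abs_path: str) -> str:
    """Return abs_path with the Windows extended-length prefix added.

    Hypothetical helper: expects an absolute Windows path such as
    C:\\Users\\me\\file or a UNC path such as \\\\server\\share\\file.
    """
    if abs_path.startswith(EXTENDED_PREFIX):
        return abs_path  # already prefixed
    if not ntpath.isabs(abs_path):
        raise ValueError("an absolute path is required")

    # Prefixed paths are not normalized by Windows, so resolve ".." and
    # forward slashes up front.
    normalized = ntpath.normpath(abs_path)

    if normalized.startswith("\\\\"):
        # UNC form: \\server\share\... becomes \\?\UNC\server\share\...
        return EXTENDED_PREFIX + "UNC" + normalized[1:]
    return EXTENDED_PREFIX + normalized


# Usage with hypothetical paths.
local_example = to_extended_length_path("C:/Users/me/models/x.onnx.incomplete")
unc_example = to_extended_length_path("\\\\server\\share\\models\\x.incomplete")
```

On a real system the absolute path would typically come from something like `os.path.abspath` before the prefix is applied.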

mlinke-ai commented 4 months ago

Thanks, I didn't think about the maximum path length at all.

I will fix the problem and create a PR in the next few days.

Have a great day!

Wauplin commented 4 months ago

Thanks @mlinke-ai! Have a nice day too :)

Wauplin commented 4 months ago

Closed by https://github.com/huggingface/huggingface_hub/pull/2378 thanks to @mlinke-ai!

Leili commented 2 months ago

Hi, I'm also facing this issue on a cluster. I downloaded "whisper-medium" using huggingface-cli, and when loading the model it looks for a missing ".incomplete" file. I've also tried downloading it with "git clone" on my Ubuntu machine (on the Desktop, to have a shorter path) and then copying it over, but the error occurs regardless. Any help would be really appreciated!

Wauplin commented 2 months ago

@Leili sorry for the inconvenience. Can you open a new issue describing:

With more information it'll be easier to investigate the issue.