holma91 closed this issue 5 months ago
Hi, did you try this more than once? I just tried it and it ran fine, and the error message does suggest it might be a transient issue, perhaps on the Hugging Face side. FWIW that config just uses
base_model: mistralai/Mistral-7B-v0.1
Which AFAIK hasn't gone anywhere :)
Tried it again now and I'm still getting it. Pretty sure my secrets are set up correctly as well:
Here's the complete trace:
CUDA Version 12.1.0
Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.
WARNING: The NVIDIA Driver was not detected. GPU functionality will not be available.
Use the NVIDIA Container Toolkit to start this container with GPU support; see
https://docs.nvidia.com/datacenter/cloud-native/ .
*************************
** DEPRECATION NOTICE! **
*************************
THIS IMAGE IS DEPRECATED and is scheduled for DELETION.
https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/support-policy.md
Downloading mistralai/Mistral-7B-v0.1 ...
Traceback (most recent call last):
File "/root/src/train.py", line 90, in launch
snapshot_download(model_name, local_files_only=True)
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/_snapshot_download.py", line 235, in snapshot_download
raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and outgoing traffic has been disabled. To enable repo look-ups and downloads online, pass 'local_files_only=False' as input.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 286, in hf_raise_for_status
response.raise_for_status()
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/requests/models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/api/models/mistralai/Mistral-7B-v0.1/revision/main
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/_snapshot_download.py", line 179, in snapshot_download
repo_info = api.repo_info(repo_id=repo_id, repo_type=repo_type, revision=revision, token=token)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 2275, in repo_info
return method(
^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 2085, in model_info
hf_raise_for_status(r)
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py", line 333, in hf_raise_for_status
raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/api/models/mistralai/Mistral-7B-v0.1/revision/main (Request ID: Root=1-6666ee5f-399cf03367d0ba5c0195c2f8;ade0d03f-5211-4dee-b4ae-fc1a9942aaf6)
Authorization error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/pkg/modal/_container_io_manager.py", line 492, in handle_input_exception
yield
File "/pkg/modal/_container_entrypoint.py", line 378, in run_input_sync
res = finalized_function.callable(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/src/train.py", line 94, in launch
snapshot_download(model_name)
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/_snapshot_download.py", line 251, in snapshot_download
raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the files on the Hub and we cannot find the appropriate snapshot folder for the specified revision on the local disk. Please check your internet connection and try again.
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/lapuerta/dev/llm-finetuning/src/train.py:152 in main │
│ │
│ 151 │ with open(config, "r") as cfg, open(data, "r") as dat: │
│ ❱ 152 │ │ run_name, launch_handle = launch.remote( │
│ 153 │ │ │ cfg.read(), dat.read(), run_to_resume, preproc_only │
│ │
│ /Users/lapuerta/dev/llm-finetuning/env/lib/python3.11/site-packages/modal/object.py:230 in │
│ wrapped │
│ │
│ 229 │ │ await self.resolve() │
│ ❱ 230 │ │ return await method(self, *args, **kwargs) │
│ 231 │
│ │
│ /Users/lapuerta/dev/llm-finetuning/env/lib/python3.11/site-packages/modal/functions.py:987 in │
│ remote │
│ │
│ 986 │ │ │
│ ❱ 987 │ │ return await self._call_function(args, kwargs) │
│ 988 │
│ │
│ /Users/lapuerta/dev/llm-finetuning/env/lib/python3.11/site-packages/modal/functions.py:949 in │
│ _call_function │
│ │
│ 948 │ │ try: │
│ ❱ 949 │ │ │ return await invocation.run_function() │
│ 950 │ │ except asyncio.CancelledError: │
│ │
│ /Users/lapuerta/dev/llm-finetuning/env/lib/python3.11/site-packages/modal/functions.py:170 in │
│ run_function │
│ │
│ 169 │ │ assert not item.result.gen_status │
│ ❱ 170 │ │ return await _process_result(item.result, item.data_format, self.stub, self.clie │
│ 171 │
│ │
│ /Users/lapuerta/dev/llm-finetuning/env/lib/python3.11/site-packages/modal/_utils/function_utils. │
│ py:375 in _process_result │
│ │
│ 374 │ │ │ except Exception as deser_exc: │
│ ❱ 375 │ │ │ │ raise ExecutionError( │
│ 376 │ │ │ │ │ "Could not deserialize remote exception due to local error:\n" │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ExecutionError: Could not deserialize remote exception due to local error:
Deserialization failed because the 'huggingface_hub' module is not available in the local environment.
This can happen if your local environment does not have the remote exception definitions.
Here is the remote traceback:
Traceback (most recent call last):
File "/root/src/train.py", line 90, in launch
snapshot_download(model_name, local_files_only=True)
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py",
line 118, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File
"/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/_snapshot_download.py", line
235, in snapshot_download
raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find an appropriate cached snapshot folder
for the specified revision on the local disk and outgoing traffic has been disabled. To enable repo
look-ups and downloads online, pass 'local_files_only=False' as input.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py",
line 286, in hf_raise_for_status
response.raise_for_status()
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/requests/models.py", line 1024, in
raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url:
https://huggingface.co/api/models/mistralai/Mistral-7B-v0.1/revision/main
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File
"/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/_snapshot_download.py", line
179, in snapshot_download
repo_info = api.repo_info(repo_id=repo_id, repo_type=repo_type, revision=revision, token=token)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py",
line 118, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 2275,
in repo_info
return method(
^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py",
line 118, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/hf_api.py", line 2085,
in model_info
hf_raise_for_status(r)
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_errors.py",
line 333, in hf_raise_for_status
raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 403 Client Error: Forbidden for url:
https://huggingface.co/api/models/mistralai/Mistral-7B-v0.1/revision/main (Request ID:
Root=1-6666ee5f-399cf03367d0ba5c0195c2f8;ade0d03f-5211-4dee-b4ae-fc1a9942aaf6)
Authorization error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/pkg/modal/_container_io_manager.py", line 492, in handle_input_exception
yield
File "/pkg/modal/_container_entrypoint.py", line 378, in run_input_sync
res = finalized_function.callable(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/src/train.py", line 94, in launch
snapshot_download(model_name)
File "/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py",
line 118, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File
"/root/miniconda3/envs/py3.11/lib/python3.11/site-packages/huggingface_hub/_snapshot_download.py", line
251, in snapshot_download
raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the
files on the Hub and we cannot find the appropriate snapshot folder for the specified revision on the
local disk. Please check your internet connection and try again.
My bad, the problem was the token permissions on Hugging Face. I had to enable "Read access to contents of all public gated repos you can access" on the token.
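For anyone hitting the same 403, here is a minimal sketch of a pre-flight check that can be run locally before launching the job. It uses only the standard library and requests the same metadata URL that appears in the traceback, sending the token as a Bearer header. The function name check_repo_access and the injectable urlopen parameter are hypothetical, chosen for illustration, and are not part of the llm-finetuning repo.

```python
import urllib.error
import urllib.request


def check_repo_access(repo_id, token, urlopen=urllib.request.urlopen):
    """Return (ok, message) describing whether `token` can read `repo_id`.

    Hypothetical helper: it asks the Hub for the repo metadata at revision
    "main", the same request that failed with 403 in the traceback.
    `urlopen` is injectable so the logic can be exercised without a network.
    """
    url = f"https://huggingface.co/api/models/{repo_id}/revision/main"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    try:
        with urlopen(request, timeout=10) as response:
            return True, f"HTTP {response.status}: token can read {repo_id}"
    except urllib.error.HTTPError as err:
        if err.code in (401, 403):
            # Rejected credentials: either the token is invalid or it lacks
            # the permission needed for this repo.
            return False, (
                f"HTTP {err.code}: token was rejected for {repo_id}; "
                "check the token's permissions"
            )
        return False, f"HTTP {err.code}: unexpected response for {repo_id}"
    except urllib.error.URLError as err:
        # No HTTP response at all, so this is a connectivity problem.
        return False, f"network error: {err.reason}"


# Example call (assumes the token is exported as HF_TOKEN in your shell):
#   import os
#   print(check_repo_access("mistralai/Mistral-7B-v0.1", os.environ["HF_TOKEN"]))
```

A 401 or 403 here points at the token rather than at a transient Hub issue, which is worth knowing because the final LocalEntryNotFoundError in the trace only says to check your internet connection.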
Thanks for following up!
Steps to reproduce:
git clone https://github.com/modal-labs/llm-finetuning.git
python3 -m venv env && . env/bin/activate && pip install modal
modal run --detach src.train --config=config/mistral-memorize.yml --data=data/sqlqa.subsample.jsonl
Results in:
If I use a config other than config/mistral-memorize.yml, such as config/llama-3.yml, everything works fine.
I don't think it should matter, but I'm using Python 3.11 on a MacBook M1.