exo-explore / exo

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
GNU General Public License v3.0

Can I manually download model files if my server cannot access huggingface.co? #80

Open artistlu opened 1 month ago

artistlu commented 1 month ago

I'm encountering an issue when trying to use the exo project. It appears that my server is unable to access the huggingface.co domain, which prevents me from downloading the required model files.

Is there a way for me to manually download the model files and then place them in the appropriate directory on my server? If so, could you please provide the steps or the specific directory path where I should put the downloaded model files?

Alternatively, do you have any other suggestions on how I can work around this issue and successfully use exo on my server?

Trying AutoTokenizer for llama3-8b-sfr
Traceback (most recent call last):
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/urllib3/connection.py", line 196, in _new_conn
    sock = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/urllib3/util/connection.py", line 85, in 
create_connection
    raise err
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/urllib3/util/connection.py", line 73, in 
create_connection
    sock.connect(sa)
TimeoutError: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/urllib3/connectionpool.py", line 490, in 
_make_request
    raise new_e
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/urllib3/connectionpool.py", line 466, in 
_make_request
    self._validate_conn(conn)
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/urllib3/connectionpool.py", line 1095, in 
_validate_conn
    conn.connect()
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/urllib3/connection.py", line 615, in connect
    self.sock = sock = self._new_conn()
                       ^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/urllib3/connection.py", line 205, in _new_conn
    raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x7f898a24e0>, 'Connection to
huggingface.co timed out. (connect timeout=10)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/urllib3/util/retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with 
url: /llama3-8b-sfr/resolve/main/config.json (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection 
object at 0x7f898a24e0>, 'Connection to huggingface.co timed out. (connect timeout=10)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1722, in 
_get_metadata_or_catch_error
    metadata = get_hf_file_metadata(url=url, proxies=proxies, timeout=etag_timeout, headers=headers)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in 
_inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1645, in 
get_hf_file_metadata
    r = _request_wrapper(
        ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 372, in 
_request_wrapper
    response = _request_wrapper(
               ^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 395, in 
_request_wrapper
    response = get_session().request(method=method, url=url, **params)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/utils/_http.py", line 66, in send
    return super().send(request, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/requests/adapters.py", line 688, in send
    raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max 
retries exceeded with url: /llama3-8b-sfr/resolve/main/config.json (Caused by 
ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f898a24e0>, 'Connection to huggingface.co timed
out. (connect timeout=10)'))"), '(Request ID: a365ba44-4989-431d-ae60-d87bce245017)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/utils/hub.py", line 399, in cached_file
    resolved_file = hf_hub_download(
                    ^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in 
_inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1221, in 
hf_hub_download
    return _hf_hub_download_to_cache_dir(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1325, in 
_hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1826, in 
_raise_on_head_call_error
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub 
and we cannot find the requested files in the local cache. Please check your connection and try again or make sure 
your Internet connection is on.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/zhanglu/code/exo/exo/api/chatgpt_api.py", line 57, in resolve_tokenizer
    return AutoTokenizer.from_pretrained(model_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py", line 
837, in from_pretrained
    config = AutoConfig.from_pretrained(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/models/auto/configuration_auto.py", line
934, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/configuration_utils.py", line 632, in 
get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/configuration_utils.py", line 689, in 
_get_config_dict
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/utils/hub.py", line 442, in cached_file
    raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and
it looks like llama3-8b-sfr is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 
'https://huggingface.co/docs/transformers/installation#offline-mode'.

Failed to load tokenizer for llama3-8b-sfr. Falling back to tinygrad tokenizer
Trying tinygrad tokenizer for llama3-8b-sfr
stephanj commented 1 month ago

Here are some suggestions to work around this issue:

  1. Manual Download and Placement: You can manually download the model files from Hugging Face when you have access to a network that can reach huggingface.co. Then, you can transfer these files to your server. The typical directory structure for pretrained models is:

    ~/.cache/huggingface/hub/

    Within this directory, create a folder structure that mirrors the model's name on Hugging Face. For example:

    ~/.cache/huggingface/hub/models--llama3-8b-sfr/

    Place the downloaded files (like config.json, model.safetensors, tokenizer.json, etc.) in this directory. One way to produce this layout automatically is sketched after this list.

  2. Use Offline Mode: If you've manually placed the files as described above, you can use Transformers' offline mode. Set the following environment variable before running your script:

    export TRANSFORMERS_OFFLINE=1

    This tells the library to only look for local files and not try to download anything.

  3. Local Model: If you have the model files in a local directory, you can specify the path directly instead of using the model ID:

    from transformers import AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained("/path/to/your/local/model/directory")
  4. Use a Mirror: If your server can access other domains, you might be able to use a Hugging Face mirror. Note that from_pretrained has no endpoint argument; instead, point huggingface_hub at the mirror by setting the HF_ENDPOINT environment variable before starting your script:

    export HF_ENDPOINT=https://your-mirror-url.com

    The usual call then resolves files against the mirror:

    from transformers import AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained("llama3-8b-sfr")
  5. Proxy Configuration: If your server requires a proxy to access external resources, you can configure it:

    import os
    os.environ['HTTP_PROXY'] = 'http://your-proxy:your-port'
    os.environ['HTTPS_PROXY'] = 'https://your-proxy:your-port'

    Set these before running your script.

  6. Network Troubleshooting: Ensure that your server's firewall isn't blocking outgoing connections to huggingface.co. You might need to whitelist this domain in your network settings.
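For suggestion 1, here is a minimal sketch of the pre-download step, assuming you have some other machine that can reach huggingface.co: huggingface_hub's snapshot_download writes the blobs/refs/snapshots layout that offline cache lookups expect, so the resulting folder can be copied to the server as-is. The repo id below is the tinygrad fallback repo from your traceback (TriAiExperiments/SFR-Iterative-DPO-LLaMA-3-8B-R); substitute whichever repo your inference engine actually requests.

    # Sketch only: run this on a machine WITH internet access, not on the offline server.
    from huggingface_hub import snapshot_download

    # Downloads the whole repo into ~/.cache/huggingface/hub/models--TriAiExperiments--SFR-Iterative-DPO-LLaMA-3-8B-R
    # using the blobs/refs/snapshots layout that offline lookups expect.
    local_path = snapshot_download(repo_id="TriAiExperiments/SFR-Iterative-DPO-LLaMA-3-8B-R")
    print(local_path)

    # Then copy that folder to the offline server, e.g.:
    #   rsync -a ~/.cache/huggingface/hub/models--TriAiExperiments--SFR-Iterative-DPO-LLaMA-3-8B-R \
    #         your-server:~/.cache/huggingface/hub/
    # and start exo there with TRANSFORMERS_OFFLINE=1 and HF_HUB_OFFLINE=1 set.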

artistlu commented 1 month ago

Based on my understanding, I've placed the model files in the ~/.cache/huggingface/hub/ directory, and I've also set up some symbolic links. However, I'm still getting an error.

The model download link is: https://www.modelscope.cn/models/LLM-Research/Meta-Llama-3-8B

(base) root@linaro-alip:~/.cache/huggingface/hub# pwd
/root/.cache/huggingface/hub
(base) root@linaro-alip:~/.cache/huggingface/hub# ls -alh
drwxr-xr-x 2 root root 4.0K  7月 25 22:19 .
drwxr-xr-x 3 root root 4.0K  7月 25 09:53 ..
lrwxrwxrwx 1 root root   32  7月 25 21:59 llama3-8b-sfr -> /nasroot/modules/Meta-Llama-3-8B
lrwxrwxrwx 1 root root   32  7月 25 21:43 Meta-Llama-3-8B -> /nasroot/modules/Meta-Llama-3-8B
lrwxrwxrwx 1 root root   32  7月 25 21:54 models--llama3-8b-sfr -> /nasroot/modules/Meta-Llama-3-8B
lrwxrwxrwx 1 root root   32  7月 25 22:19 models--Meta-Llama-3-8B -> /nasroot/modules/Meta-Llama-3-8B
-rw-r--r-- 1 root root    1  7月 25 09:53 version.txt
Trying AutoTokenizer for llama3-8b-sfr
/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: 
`resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you 
want to force a new download, use `force_download=True`.
  warnings.warn(
Traceback (most recent call last):
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/utils/hub.py", line 399, in cached_file
    resolved_file = hf_hub_download(
                    ^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in 
_inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1221, in 
hf_hub_download
    return _hf_hub_download_to_cache_dir(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1325, in 
_hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1817, in 
_raise_on_head_call_error
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find the requested files in the disk cache and 
outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/zhanglu/code/exo/exo/api/chatgpt_api.py", line 57, in resolve_tokenizer
    return AutoTokenizer.from_pretrained(model_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py", line 
837, in from_pretrained
    config = AutoConfig.from_pretrained(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/models/auto/configuration_auto.py", line
934, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/configuration_utils.py", line 632, in 
get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/configuration_utils.py", line 689, in 
_get_config_dict
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/utils/hub.py", line 442, in cached_file
    raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and
it looks like llama3-8b-sfr is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 
'https://huggingface.co/docs/transformers/installation#offline-mode'.

Failed to load tokenizer for llama3-8b-sfr. Falling back to tinygrad tokenizer
Trying tinygrad tokenizer for llama3-8b-sfr
Traceback (most recent call last):
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/utils/hub.py", line 399, in cached_file
    resolved_file = hf_hub_download(
                    ^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in 
_inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1221, in 
hf_hub_download
    return _hf_hub_download_to_cache_dir(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1325, in 
_hf_hub_download_to_cache_dir
    _raise_on_head_call_error(head_call_error, force_download, local_files_only)
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1817, in 
_raise_on_head_call_error
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find the requested files in the disk cache and 
outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/zhanglu/code/exo/exo/api/chatgpt_api.py", line 65, in resolve_tokenizer
    return resolve_tinygrad_tokenizer(model_id)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zhanglu/code/exo/exo/api/chatgpt_api.py", line 48, in resolve_tinygrad_tokenizer
    return AutoTokenizer.from_pretrained("TriAiExperiments/SFR-Iterative-DPO-LLaMA-3-8B-R")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py", line 
837, in from_pretrained
    config = AutoConfig.from_pretrained(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/models/auto/configuration_auto.py", line
934, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/configuration_utils.py", line 632, in 
get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/configuration_utils.py", line 689, in 
_get_config_dict
    resolved_config_file = cached_file(
                           ^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/transformers/utils/hub.py", line 442, in cached_file
    raise EnvironmentError(
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and
it looks like TriAiExperiments/SFR-Iterative-DPO-LLaMA-3-8B-R is not the path to a directory containing a file named
config.json.
Checkout your internet connection or see how to run the library in offline mode at 
'https://huggingface.co/docs/transformers/installation#offline-mode'.

Failed again to load tokenizer for llama3-8b-sfr. Falling back to mlx tokenizer
Trying mlx tokenizer for llama3-8b-sfr
Error handling request
Traceback (most recent call last):
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/_snapshot_download.py", line 164, in 
snapshot_download
    repo_info = api.repo_info(repo_id=repo_id, repo_type=repo_type, revision=revision, token=token)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in 
_inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/hf_api.py", line 2491, in repo_info
    return method(
           ^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in 
_inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/hf_api.py", line 2300, in model_info
    r = get_session().get(path, headers=headers, timeout=timeout, params=params)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/utils/_http.py", line 77, in send
    raise OfflineModeIsEnabled(
huggingface_hub.errors.OfflineModeIsEnabled: Cannot reach 
https://huggingface.co/api/models/llama3-8b-sfr/revision/main: offline mode is enabled. To disable it, please unset 
the `HF_HUB_OFFLINE` environment variable.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/aiohttp/web_protocol.py", line 452, in 
_handle_request
    resp = await request_handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/aiohttp/web_app.py", line 543, in _handle
    resp = await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zhanglu/code/exo/exo/api/chatgpt_api.py", line 175, in handle_post_chat_completions
    tokenizer = await resolve_tokenizer(shard.model_id)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zhanglu/code/exo/exo/api/chatgpt_api.py", line 73, in resolve_tokenizer
    return load_tokenizer(await get_model_path(model_id))
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/zhanglu/code/exo/exo/inference/mlx/sharded_utils.py", line 176, in get_model_path
    await snapshot_download_async(
  File "/home/zhanglu/code/exo/exo/inference/mlx/sharded_utils.py", line 158, in snapshot_download_async
    return await asyncio.get_event_loop().run_in_executor(None, func)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in 
_inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/miniconda3/envs/exo/lib/python3.12/site-packages/huggingface_hub/_snapshot_download.py", line 226, in 
snapshot_download
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find an appropriate cached snapshot folder for the 
specified revision on the local disk and outgoing traffic has been disabled. To enable repo look-ups and downloads 
online, set 'HF_HUB_OFFLINE=0' as environment variable.

@stephanj

stephanj commented 1 month ago

Looks like you're missing some sub-directories: the error says "Cannot find an appropriate cached snapshot folder".

Should be the following directory structure:

models--mlx-community--Meta-Llama-3.1-8B-Instruct-4bit
    blobs
    refs
    snapshots
        efc01dc1fd006f88344400c099cda5b3e8e524ef
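Since your files came from ModelScope rather than from snapshot_download, that snapshots/refs layout has to be created by hand; plain symlinks at the top level of hub/ (as in your ls output) won't be found. Here is a rough, unofficial sketch of what that could look like, assuming the raw files live in /nasroot/modules/Meta-Llama-3-8B and that exo requests the repo id llama3-8b-sfr as in your traceback:

    # Sketch: build the Hugging Face cache layout by hand for files obtained outside the Hub.
    import os

    src = "/nasroot/modules/Meta-Llama-3-8B"  # folder with config.json, *.safetensors, tokenizer.json, ...
    repo_dir = os.path.expanduser("~/.cache/huggingface/hub/models--llama3-8b-sfr")
    revision = "0" * 40  # placeholder commit hash; any name works as long as refs/main matches it

    snapshot_dir = os.path.join(repo_dir, "snapshots", revision)
    os.makedirs(snapshot_dir, exist_ok=True)
    os.makedirs(os.path.join(repo_dir, "refs"), exist_ok=True)

    # refs/main maps revision "main" to the snapshots/<revision> folder during offline lookups.
    with open(os.path.join(repo_dir, "refs", "main"), "w") as f:
        f.write(revision)

    # Link every model file into the snapshot folder (copies would work too).
    for name in os.listdir(src):
        dst = os.path.join(snapshot_dir, name)
        if not os.path.exists(dst):
            os.symlink(os.path.join(src, name), dst)

With HF_HUB_OFFLINE=1 and TRANSFORMERS_OFFLINE=1 set, the cache lookup should then resolve llama3-8b-sfr locally instead of trying to reach huggingface.co.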