huggingface / huggingface_hub

The official Python client for the Hugging Face Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

HF Spaces inference failing with error: huggingface_hub.utils._errors.LocalEntryNotFoundError #2290

Open rileybolen opened 1 month ago

rileybolen commented 1 month ago

Describe the bug

I have set up a basic HF Space from an AutoTrain object detection model. The model is based on facebook/detr-resnet-101. The Space builds and loads properly, but when I submit an image for inference it fails with the error included in the logs below.

This is my app.py file, and I have not modified or added any other files.

import os
import gradio as gr

gr.load("models/rileybol/autotrain-1hkeo-o33ms", hf_token=os.environ.get('HF_TOKEN')).launch()

Reproduction

No response

Logs

===== Application Startup at 2024-05-24 17:49:05 =====

Fetching model from: https://huggingface.co/rileybol/autotrain-1hkeo-o33ms
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
    response.raise_for_status()
  File "/usr/local/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://api-inference.huggingface.co/models/rileybol/autotrain-1hkeo-o33ms

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 528, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 270, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1908, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1485, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2144, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 808, in wrapper
    response = f(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/gradio/external.py", line 371, in query_huggingface_inference_endpoints
    data = fn(*data)  # type: ignore
  File "/usr/local/lib/python3.10/site-packages/gradio/external_utils.py", line 174, in object_detection_inner
    annotations = client.object_detection(input)
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/inference/_client.py", line 1314, in object_detection
    response = self.post(data=image, model=model, task="object-detection")
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/inference/_client.py", line 273, in post
    hf_raise_for_status(response)
  File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 371, in hf_raise_for_status
    raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 500 Server Error: Internal Server Error for url: https://api-inference.huggingface.co/models/rileybol/autotrain-1hkeo-o33ms (Request ID: 7vFuJSJB1qsRU2RQusCE7)

Could not load model rileybol/autotrain-1hkeo-o33ms with any of the following classes: (<class 'transformers.models.detr.modeling_detr.DetrForObjectDetection'>,). See the original errors:

while loading with DetrForObjectDetection, an error is thrown:
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1238, in hf_hub_download
    metadata = get_hf_file_metadata(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1631, in get_hf_file_metadata
    r = _request_wrapper(
        ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 385, in _request_wrapper
    response = _request_wrapper(
               ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 408, in _request_wrapper
    response = get_session().request(method=method, url=url, **params)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/utils/_http.py", line 78, in send
    raise OfflineModeIsEnabled(
huggingface_hub.utils._http.OfflineModeIsEnabled: Cannot reach https://huggingface.co/timm/resnet101.a1h_in1k/resolve/main/pytorch_model.bin: offline mode is enabled. To disable it, please unset the `HF_HUB_OFFLINE` environment variable.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/transformers/pipelines/base.py", line 283, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/modeling_utils.py", line 3550, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/models/detr/modeling_detr.py", line 1479, in __init__
    self.model = DetrModel(config)
                 ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/models/detr/modeling_detr.py", line 1311, in __init__
    backbone = DetrConvEncoder(config)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/transformers/models/detr/modeling_detr.py", line 353, in __init__
    backbone = create_model(
               ^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/timm/models/_factory.py", line 117, in create_model
    model = create_fn(
            ^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/timm/models/resnet.py", line 1362, in resnet101
    return _create_resnet('resnet101', pretrained, **dict(model_args, **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/timm/models/resnet.py", line 584, in _create_resnet
    return build_model_with_cfg(ResNet, variant, pretrained, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/timm/models/_builder.py", line 410, in build_model_with_cfg
    load_pretrained(
  File "/usr/local/lib/python3.11/site-packages/timm/models/_builder.py", line 190, in load_pretrained
    state_dict = load_state_dict_from_hf(pretrained_loc)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/timm/models/_hub.py", line 188, in load_state_dict_from_hf
    cached_file = hf_hub_download(hf_model_id, filename=filename, revision=hf_********)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/huggingface_hub/file_download.py", line 1371, in hf_hub_download
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
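The root cause visible in the logs is the OfflineModeIsEnabled failure: the inference backend has HF_HUB_OFFLINE set, so timm cannot fetch the ResNet-101 backbone weights and the download falls through to LocalEntryNotFoundError. As a rough illustration (function names here are mine, not the library's), huggingface_hub parses such boolean-ish environment variables along these lines:

```python
import os

def _is_true(value):
    # Unset/None counts as False; "1", "ON", "YES", "TRUE"
    # (case-insensitive) count as True. Anything else is False.
    if value is None:
        return False
    return value.upper() in {"1", "ON", "YES", "TRUE"}

def offline_mode_enabled(environ=os.environ):
    # Sketch of how the HF_HUB_OFFLINE flag is read from the environment.
    return _is_true(environ.get("HF_HUB_OFFLINE"))
```

With offline mode active, any uncached file lookup (like the timm backbone weights here) raises instead of downloading.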

System info

This is running in Hugging Face Spaces, on the default CPU hardware.
rileybolen commented 1 month ago

I forgot to mention: I also tried the Serverless Inference API, both through the model card widget and locally on my computer, and both methods gave me the same error as above.
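For anyone debugging similar serverless failures: the API can also return transient 500s while a model is cold-starting, so it is worth retrying before concluding the model itself is broken (here the error persisted across clients, so it clearly was). A minimal, generic retry helper, demonstrated with a stub rather than a real API call:

```python
import time

def call_with_retries(fn, attempts=3, delay=0.0):
    """Call fn(); retry up to `attempts` times, re-raising the last error."""
    last_exc = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:  # in practice, catch HfHubHTTPError
            last_exc = exc
            time.sleep(delay)
    raise last_exc

# Stubbed "flaky endpoint": fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("500 Server Error")
    return "ok"
```

A persistent failure (like the one in this issue) exhausts the attempts and re-raises, which distinguishes it from a cold-start hiccup.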

Wauplin commented 1 month ago

Hi @rileybolen, would it be ok to make this model public and retry? https://huggingface.co/rileybol/autotrain-1hkeo-o33ms I want to confirm if it is a problem related to permissions or not. Also having it public would help me reproduce the error. If it's too confidential, let me know and we will figure out something.

rileybolen commented 1 month ago

> Hi @rileybolen, would it be ok to make this model public and retry? https://huggingface.co/rileybol/autotrain-1hkeo-o33ms I want to confirm if it is a problem related to permissions or not. Also having it public would help me reproduce the error. If it's too confidential, let me know and we will figure out something.

Hi, sorry I missed this last week. I have now made the model public for your debugging. I think the issue is specific to DETR/ResNet: I trained a new model with hustvl/yolos-tiny instead of facebook/detr-resnet-101 and that one works well for me, so I no longer need this issue solved myself. Still, let me know if I can help with anything!
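That observation fits the traceback: DETR's config requests a timm backbone at model-load time (DetrConfig's `use_timm_backbone` defaults to True), while YOLOS does not, so YOLOS never hits the offline download path. A quick, hypothetical check on a model's config.json (field names per DetrConfig; the sample values are illustrative):

```python
import json

def uses_timm_backbone(config: dict) -> bool:
    # When `use_timm_backbone` is set, instantiating the model downloads
    # the timm backbone weights, which fails under HF_HUB_OFFLINE if
    # those weights are not already cached.
    return bool(config.get("use_timm_backbone", False))

# Illustrative config.json fragment for a DETR-style model:
raw = '{"model_type": "detr", "use_timm_backbone": true, "backbone": "resnet101"}'
cfg = json.loads(raw)
```

Models whose check comes back True depend on an extra Hub download at load time, which is exactly the step that failed here.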

Wauplin commented 1 month ago

Looks like making it public also made the Inference API work properly. I just tried the "Palace" example, for instance, and it worked on the serverless API.