huggingface / huggingface_hub

The official Python client for the Huggingface Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

super slow from_pretrained #1893

Closed rvorias closed 9 months ago

rvorias commented 9 months ago

Describe the bug

I have a script that sets up a couple of models for a pipeline.

            vae = AutoencoderKL.from_pretrained(
                "madebyollin/sdxl-vae-fp16-fix",  # this makes sure we don't have to upcast the VAE
                torch_dtype=torch.float16,  # always fp16
                use_safetensors=True,
                cache_dir=self.cache_dir,
            )
...
            self.pipeline.load_lora_weights(
                "latent-consistency/lcm-lora-sdxl",
                adapter_name="lcm",
                cache_dir=self.cache_dir,
            )

And it takes >1 min to load the VAE.

When I interrupt the process with Ctrl-C, it seems that it is hanging on some connection for validating the metadata.

I've tried setting os.environ["HF_HUB_OFFLINE"] = "1" at the beginning of my script. Then the VAE loads fast, but it hangs on the LoRA weights.

I am 100% sure everything is cached in my cache dir, and normally loading is pretty fast.

There is also nothing wrong with my internet connection, as I can download from the Hugging Face Hub just fine.

Reproduction

import torch
from diffusers import AutoencoderKL, DiffusionPipeline

cache_dir = "/data/huggingface_cache"

vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix",
    torch_dtype=torch.float16,
    use_safetensors=True,
    cache_dir=cache_dir,
)

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    vae=vae,
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
    cache_dir=cache_dir,
).to("cuda")

pipeline.load_lora_weights(
    "latent-consistency/lcm-lora-sdxl",
    adapter_name="lcm",
    cache_dir=cache_dir,
)

Logs

^CTraceback (most recent call last):
  File "/home/vd/projects/models/scripts/inference/run_inference_engine.py", line 18, in <module>
    with engine as engine:
  File "/home/vd/projects/models/src/inference_engine.py", line 393, in __enter__
    vae = AutoencoderKL.from_pretrained(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 704, in from_pretrained
    config, unused_kwargs, commit_hash = cls.load_config(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/diffusers/configuration_utils.py", line 370, in load_config
    config_file = hf_hub_download(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1247, in hf_hub_download
    metadata = get_hf_file_metadata(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1624, in get_hf_file_metadata
    r = _request_wrapper(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 402, in _request_wrapper
    response = _request_wrapper(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 425, in _request_wrapper
    response = get_session().request(method=method, url=url, **params)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 63, in send
    return super().send(request, *args, **kwargs)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1096, in _validate_conn
    conn.connect()
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/connection.py", line 611, in connect
    self.sock = sock = self._new_conn()
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/connection.py", line 203, in _new_conn
    sock = connection.create_connection(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
KeyboardInterrupt
make: *** [Makefile:91: run] Interrupt

System info

- huggingface_hub version: 0.19.4
- Platform: Linux-6.2.0-37-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: /home/vd/.cache/huggingface/token
- Has saved token ?: False
- Configured git credential helpers: 
- FastAI: N/A
- Tensorflow: N/A
- Torch: 2.0.1+cu118
- Jinja2: 3.1.2
- Graphviz: N/A
- Pydot: N/A
- Pillow: 10.1.0
- hf_transfer: N/A
- gradio: 3.50.2
- tensorboard: N/A
- numpy: 1.26.2
- pydantic: 1.10.13
- aiohttp: 3.8.6
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /home/vd/.cache/huggingface/hub
- HF_ASSETS_CACHE: /home/vd/.cache/huggingface/assets
- HF_TOKEN_PATH: /home/vd/.cache/huggingface/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10

Wauplin commented 9 months ago

Hi @rvorias. I'm not sure why your script hangs when trying to start a connection. A few things we can try:

  1. Could you set the HF_HUB_OFFLINE environment variable before launching the script, in case it is currently set too late? (HF_HUB_OFFLINE is evaluated only once, when diffusers/huggingface_hub is imported.) Something like this should work:

    HF_HUB_OFFLINE=1 python my_script.py

  2. If that doesn't work, can you try setting local_files_only=True in your from_pretrained and load_lora_weights calls? This should be strictly equivalent to setting the environment variable, but let's make sure that's the case.

  3. Independently of your script and your cache, could you check that fetching metadata for a file works correctly?

>>> from huggingface_hub import get_hf_file_metadata
>>> get_hf_file_metadata("https://huggingface.co/latent-consistency/lcm-lora-sdxl/resolve/main/pytorch_lora_weights.safetensors")
HfFileMetadata(commit_hash='a18548dd4956b174ec5b0d78d340c8dae0a129cd', etag='a764e6859b6e04047cd761c08ff0cee96413a8e004c9f07707530cd776b19141', location='https://cdn-lfs-us-1.huggingface.co/repos/1e/36/1e36aa70192fc07d27740eadda7a85e4fbc5175f93b73e5573844ab61a045dca/a764e6859b6e04047cd761c08ff0cee96413a8e004c9f07707530cd776b19141?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27pytorch_lora_weights.safetensors%3B+filename%3D%22pytorch_lora_weights.safetensors%22%3B&Expires=1702200295&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcwMjIwMDI5NX19LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzFlLzM2LzFlMzZhYTcwMTkyZmMwN2QyNzc0MGVhZGRhN2E4NWU0ZmJjNTE3NWY5M2I3M2U1NTczODQ0YWI2MWEwNDVkY2EvYTc2NGU2ODU5YjZlMDQwNDdjZDc2MWMwOGZmMGNlZTk2NDEzYThlMDA0YzlmMDc3MDc1MzBjZDc3NmIxOTE0MT9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSoifV19&Signature=DY6mOtKcM-d%7EGZuHbKKmn0LGFy9WkWEPH432MCMfM5b7x7MhyRNfVzzLWOl9%7EATsQw%7Em3PSeHiXYY2TszuTMP7bZKqrrDMUCe-kzwq72TeLUGySwLzEp0lBXoafuftxf5I87e8hhsoeSrEBSkF2-fZYE3p%7ECrxOI9ijUIdsq3bT6L5WT6%7ErejPJoS-Pe9lDtUWWvx5xWC%7Evw1-2e2ucVyREsN9w1A8VFw4aOpcNogTYdHDRIesBC2e9HAe8qvjdG7XRLmOhmPa1V6D6it5AGKqHo84mjfRfA5tXzE1M58jakqYQrysdSec65d4Hb7oR-JT8MbNboe47ZEfvVFjv1fw__&Key-Pair-Id=KCD77M1F0VK2B', size=393855224)
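Since HF_HUB_OFFLINE is only read at import time, one way to be sure it is set early enough is to put it in the environment of the Python process itself, which is what the shell one-liner in step 1 does. Below is a minimal sketch of the same idea from Python; `run_with_offline_mode` is a hypothetical helper name for illustration, not part of huggingface_hub.

```python
import os
import subprocess
import sys


def run_with_offline_mode(args, offline=True):
    """Launch `python <args>` with HF_HUB_OFFLINE already set in the child's environment.

    Hypothetical helper: because the variable exists before the child interpreter
    starts, it is visible no matter when the child imports diffusers/huggingface_hub.
    """
    env = os.environ.copy()
    env["HF_HUB_OFFLINE"] = "1" if offline else "0"
    return subprocess.run(
        [sys.executable, *args],
        env=env,
        capture_output=True,
        text=True,
        check=False,
    )


# Usage (assumes a script named my_script.py exists):
# result = run_with_offline_mode(["my_script.py"])
# print(result.stdout)
```

Setting `os.environ["HF_HUB_OFFLINE"]` inside the script can also work, but only if it runs before the first import of diffusers or huggingface_hub.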

rvorias commented 9 months ago

  1. HF_HUB_OFFLINE=1 python my_script.py -> same thing: the VAE loads super fast, then it hangs on the LoRA weights. I get the same traceback on Ctrl-C.

  2. Other combinations:
     - HF_HUB_OFFLINE=1 python my_script.py + local force lora -> fast VAE, hang on lora
     - python my_script.py + local force VAE -> fast VAE, hang on lora
     - python my_script.py + local force VAE + local force lora -> fast VAE, hang on lora

  3. Weird, I almost didn't have the patience to wait for this function to return: it took >1 min! I tried another file as well, same problem.

E.g.: get_hf_file_metadata("https://huggingface.co/stabilityai/sdxl-turbo/resolve/main/sd_xl_turbo_1.0.safetensors")

I will try it on another computer in my network.

Other computer: fresh pyenv on a pretty fresh OS: same long waiting time!

So I cycled my Ethernet connection (on-off-on): the first try was fast, the second try was slow again.

So it seems to be network related. Could it be something on the Cloudflare side? I do have Tailscale on my system, but I have disabled it for the time being.
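The traceback above stops in `sock.connect(sa)`, so the time is being spent opening the TCP connection rather than inside huggingface_hub. A rough way to check this outside of requests is to time a plain TCP connect to each address the host resolves to. This is only a diagnostic sketch using the standard library; `time_tcp_connects` is a name made up for this example, and the 5-second timeout is an arbitrary assumption.

```python
import socket
import time


def time_tcp_connects(host, port=443, timeout=5.0):
    """Time a TCP connect to every address `host` resolves to.

    Returns a list of (address_family, ip, seconds, error) tuples, where
    `error` is None when the connect succeeded.
    """
    results = []
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        error = None
        start = time.perf_counter()
        try:
            sock.connect(sockaddr)
        except OSError as exc:  # includes timeouts and refused connections
            error = repr(exc)
        finally:
            sock.close()
        elapsed = time.perf_counter() - start
        results.append((family.name, sockaddr[0], round(elapsed, 3), error))
    return results


# Usage (performs real network connections):
# for row in time_tcp_connects("huggingface.co"):
#     print(row)
```

If some addresses connect quickly and others only after a long delay or a timeout, that points at the network path rather than at the Python libraries.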

Wauplin commented 9 months ago

Thanks for trying it out. Results are weird though :confused:

Can you try:

  1. Run

    import requests

    requests.head("https://huggingface.co/latent-consistency/lcm-lora-sdxl/resolve/main/pytorch_lora_weights.safetensors")

     and

    from huggingface_hub.utils import get_session

    response_1 = get_session().head("https://huggingface.co/latent-consistency/lcm-lora-sdxl/resolve/main/pytorch_lora_weights.safetensors")
    response_2 = get_session().head("https://huggingface.co/latent-consistency/lcm-lora-sdxl/resolve/main/pytorch_lora_weights.safetensors")

     and

    curl --head https://huggingface.co/latent-consistency/lcm-lora-sdxl/resolve/main/pytorch_lora_weights.safetensors

     to check if at least the HEAD calls work.

  2. Can you copy-paste the full stacktrace when it runs indefinitely on the lora step?

Thanks in advance. We will figure it out somehow :)

rvorias commented 9 months ago

Thanks for the help!

BTW I do have to clarify it does not run indefinitely, just > 1 min.

First script:

^CTraceback (most recent call last):
  File "/home/raph/projects/glif/hub-test/test2.py", line 3, in <module>
    requests.head("https://huggingface.co/latent-consistency/lcm-lora-sdxl/resolve/main/pytorch_lora_weights.safetensors")
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/requests/api.py", line 100, in head
    return request("head", url, **kwargs)
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1096, in _validate_conn
    conn.connect()
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/connection.py", line 611, in connect
    self.sock = sock = self._new_conn()
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/connection.py", line 203, in _new_conn
    sock = connection.create_connection(
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
KeyboardInterrupt

Second script:

^CTraceback (most recent call last):
  File "/home/raph/projects/glif/hub-test/test3.py", line 3, in <module>
    response_1 = get_session().head("https://huggingface.co/latent-consistency/lcm-lora-sdxl/resolve/main/pytorch_lora_weights.safetensors")
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/requests/sessions.py", line 624, in head
    return self.request("HEAD", url, **kwargs)
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 63, in send
    return super().send(request, *args, **kwargs)
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1096, in _validate_conn
    conn.connect()
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/connection.py", line 611, in connect
    self.sock = sock = self._new_conn()
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/connection.py", line 203, in _new_conn
    sock = connection.create_connection(
  File "/home/raph/.cache/pypoetry/virtualenvs/hub-test-0Ii_uCV--py3.10/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
KeyboardInterrupt

Third script: (resolves immediately!)

curl --head https://huggingface.co/latent-consistency/lcm-lora-sdxl/resolve/main/pytorch_lora_weights.safetensors

HTTP/2 302 
content-type: text/plain; charset=utf-8
content-length: 1164
location: https://cdn-lfs-us-1.huggingface.co/repos/1e/36/1e36aa70192fc07d27740eadda7a85e4fbc5175f93b73e5573844ab61a045dca/a764e6859b6e04047cd761c08ff0cee96413a8e004c9f07707530cd776b19141?response-content-disposition=attachment%3B+filename*%3DUTF-8%27%27pytorch_lora_weights.safetensors%3B+filename%3D%22pytorch_lora_weights.safetensors%22%3B&Expires=1702203637&Policy=eyJTdGF0ZW1lbnQiOlt7IkNvbmRpdGlvbiI6eyJEYXRlTGVzc1RoYW4iOnsiQVdTOkVwb2NoVGltZSI6MTcwMjIwMzYzN319LCJSZXNvdXJjZSI6Imh0dHBzOi8vY2RuLWxmcy11cy0xLmh1Z2dpbmdmYWNlLmNvL3JlcG9zLzFlLzM2LzFlMzZhYTcwMTkyZmMwN2QyNzc0MGVhZGRhN2E4NWU0ZmJjNTE3NWY5M2I3M2U1NTczODQ0YWI2MWEwNDVkY2EvYTc2NGU2ODU5YjZlMDQwNDdjZDc2MWMwOGZmMGNlZTk2NDEzYThlMDA0YzlmMDc3MDc1MzBjZDc3NmIxOTE0MT9yZXNwb25zZS1jb250ZW50LWRpc3Bvc2l0aW9uPSoifV19&Signature=mXtcoTAfwCQtnE8ZP3VFT-vWUadcwg-NrY3L9-8Kgtw4so0KrYWn0TBDJlbd9bHaIbgZjQj726pKJ9I0u1gu-xEpO3WvJK9fIRdfs-jPaMc0lK4mob%7EctEYtthwR7aJj31kglTMOsn8mQlOBkc59F2u3AfdB2HP9LEgQtnF8n54FuCYSdQiBaRCJKWUjeOUzFJIBftlcKNpfmplQHbcG15hDUYaM9Ver7OpufMd3LQWDSmJLGv%7EME60vU9jOPMBx2j4agr57DDtQTLhSKSsmUxbQU8CzmlWAqoeLvkhKUayW3Rz0QZtErXHtNV9TD4o%7Eb1BZWZYti4wUhbBnEMb1Qg__&Key-Pair-Id=KCD77M1F0VK2B
date: Thu, 07 Dec 2023 10:39:31 GMT
x-powered-by: huggingface-moon
x-request-id: Root=1-6571a0e2-1943a44a0abb54415766954c
access-control-allow-origin: https://huggingface.co
vary: Origin, Accept
access-control-expose-headers: X-Repo-Commit,X-Request-Id,X-Error-Code,X-Error-Message,ETag,Link,Accept-Ranges,Content-Range
x-repo-commit: a18548dd4956b174ec5b0d78d340c8dae0a129cd
accept-ranges: bytes
x-linked-size: 393855224
x-linked-etag: "a764e6859b6e04047cd761c08ff0cee96413a8e004c9f07707530cd776b19141"
x-cache: Miss from cloudfront
via: 1.1 fb48b5d9efb59feb57513ac91c796648.cloudfront.net (CloudFront)
x-amz-cf-pop: BRU50-C1
x-amz-cf-id: ReE3lSzbZyxSGFwf6_pRIeqno1t9dU6-UHkkZXaNsOmOemxI5DRTVQ==

Wauplin commented 9 months ago

If

requests.head("https://huggingface.co/latent-consistency/lcm-lora-sdxl/resolve/main/pytorch_lora_weights.safetensors")

hangs for >1min but

curl --head https://huggingface.co/latent-consistency/lcm-lora-sdxl/resolve/main/pytorch_lora_weights.safetensors

resolves immediately (which is expected), then it is not really an issue with huggingface_hub or diffusers. I suspect it is a configuration issue, either in the network or in the environment variables used by requests. What's weird though is that it still resolves after >1min :confused:

Wauplin commented 9 months ago

Also, a HEAD request should not be made when loading LoRA weights if HF_HUB_OFFLINE=1 / local_files_only=True is set. Could you share the stacktrace in this case? This is a separate issue from the "connection hangs forever" problem, and one that has to be fixed in diffusers/huggingface_hub.

rvorias commented 9 months ago

This is the stacktrace from HF_HUB_OFFLINE=1 and local_files_only=True on

pipeline.load_lora_weights(
  "latent-consistency/lcm-lora-sdxl",
  adapter_name="lcm",
  cache_dir=cache_dir,
  local_files_only=True,
)

^CTraceback (most recent call last):
  File "/home/vd/projects/models/scripts/inference/run_inference_engine.py", line 18, in <module>
    with engine as engine:
  File "/home/vd/projects/models/src/inference_engine.py", line 440, in __enter__
    self.pipeline.load_lora_weights(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/diffusers/loaders.py", line 3224, in load_lora_weights
    state_dict, network_alphas = self.lora_state_dict(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/diffusers/loaders.py", line 1325, in lora_state_dict
    weight_name = cls._best_guess_weight_name(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/diffusers/loaders.py", line 1401, in _best_guess_weight_name
    files_in_repo = model_info(pretrained_model_name_or_path_or_dict).siblings
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 1921, in model_info
    r = get_session().get(path, headers=headers, timeout=timeout, params=params)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 63, in send
    return super().send(request, *args, **kwargs)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
    resp = conn.urlopen(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 790, in urlopen
    response = self._make_request(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1096, in _validate_conn
    conn.connect()
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/connection.py", line 611, in connect
    self.sock = sock = self._new_conn()
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/connection.py", line 203, in _new_conn
    sock = connection.create_connection(
  File "/home/vd/.cache/pypoetry/virtualenvs/models-C365dXtZ-py3.10/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
KeyboardInterrupt

I'm going to try the tcpdump route, will report back any findings

Wauplin commented 9 months ago

Thanks for the stacktrace @rvorias, it helped me understand where offline mode is not respected!

I created an issue on the diffusers repo: https://github.com/huggingface/diffusers/issues/6089. Would it be OK to close this issue now? I feel there are 2 problems:

  1. your local requests setup => let's hope you find the cause
  2. the fact that offline mode is not respected => should be discussed in the other issue

But both are unrelated to huggingface_hub itself.
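In the stacktrace above, the network call comes from `model_info(...)` inside `_best_guess_weight_name`, which is reached even with local_files_only=True. The sketch below shows the general shape of a guard that would avoid that call in offline mode. It is an illustration only, not the diffusers implementation or its eventual fix: `guess_weight_name`, `list_remote_files` and `list_cached_files` are hypothetical names, and the "first matching file in sorted order" rule is an assumption.

```python
def guess_weight_name(
    repo_id,
    local_files_only,
    list_remote_files,
    list_cached_files,
    extension=".safetensors",
):
    """Pick a weight file name without touching the network in offline mode.

    Hypothetical sketch: `list_remote_files` stands in for a Hub lookup and
    `list_cached_files` for a scan of the local cache. Both take a repo id
    and return an iterable of file names.
    """
    if local_files_only:
        # Offline: only consult what is already on disk.
        candidates = list_cached_files(repo_id)
    else:
        # Online: asking the Hub for the file list is acceptable here.
        candidates = list_remote_files(repo_id)

    matches = sorted(name for name in candidates if name.endswith(extension))
    return matches[0] if matches else None
```

The point of injecting the two listing functions is that the offline branch can be tested with a remote lookup that raises, which proves no network access is attempted.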

rvorias commented 9 months ago

Sure! Thanks for the help!

rvorias commented 9 months ago

Wireshark shows a lot of TCP retransmissions :thinking: (red boxes in the graph below). [image] Eventually it goes through, but it is still very slow.

It's just so weird that this started yesterday, and I can't recall making any changes to my network.

rvorias commented 9 months ago

Damn, no luck trying out a ton of stuff.

It seems my urllib3 sessions are screwed on both of my PCs, plus there are lots of TCP retransmissions.

Wauplin commented 9 months ago

Maybe it's worth trying to make a HEAD call with httpx to see if it hangs? It wouldn't be possible to use it directly in huggingface_hub, but at least you would know whether it's a urllib3 problem or not (since httpx is not based on urllib3 but on httpcore, as far as I know).

rvorias commented 9 months ago

Hm, just tried httpx, still the same. Then it's my network?! :crossed_swords:

Update: I feel so silly, had to reset my modem. Seems to work now. :skull: :skull: :skull: :skull: :skull: :skull: :skull: :skull: :skull:

Wauplin commented 9 months ago

Glad to know you solved your issue!