huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

The load_lora_weights is not working offline #6110

Closed rupeshs closed 9 months ago

rupeshs commented 10 months ago

Describe the bug

The load_lora_weights is not working offline.

Reproduction

Sample code to reproduce this issue (turn off the internet first):

from diffusers import DiffusionPipeline, LCMScheduler

pipeline = DiffusionPipeline.from_pretrained(
    "Lykon/dreamshaper-8",
    local_files_only=True,
)
pipeline.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

Logs

Loading pipeline components...: 100%|████████████████████████████████████████████████████| 7/7 [00:00<00:00, 13.51it/s]
Traceback (most recent call last):
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\urllib3\connection.py", line 203, in _new_conn
    sock = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\urllib3\util\connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Rupesh\AppData\Local\Programs\Python\Python311\Lib\socket.py", line 961, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno 11001] getaddrinfo failed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\urllib3\connectionpool.py", line 790, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\urllib3\connectionpool.py", line 491, in _make_request
    raise new_e
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\urllib3\connectionpool.py", line 467, in _make_request
    self._validate_conn(conn)
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\urllib3\connectionpool.py", line 1096, in _validate_conn
    conn.connect()
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\urllib3\connection.py", line 611, in connect
    self.sock = sock = self._new_conn()
                       ^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\urllib3\connection.py", line 210, in _new_conn
    raise NameResolutionError(self.host, self, e) from e
urllib3.exceptions.NameResolutionError: <urllib3.connection.HTTPSConnection object at 0x00000198E0086790>: Failed to resolve 'huggingface.co' ([Errno 11001] getaddrinfo failed)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\requests\adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\urllib3\connectionpool.py", line 844, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\urllib3\util\retry.py", line 515, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/latent-consistency/lcm-lora-sdv1-5 (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x00000198E0086790>: Failed to resolve 'huggingface.co' ([Errno 11001] getaddrinfo failed)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "F:\dev\push\faster\unify\fastsdcpu\lora_test.py", line 7, in <module>
    pipeline.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\diffusers\loaders.py", line 1200, in load_lora_weights
    state_dict, network_alphas = self.lora_state_dict(pretrained_model_name_or_path_or_dict, **kwargs)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\diffusers\loaders.py", line 1351, in lora_state_dict
    weight_name = cls._best_guess_weight_name(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\diffusers\loaders.py", line 1401, in _best_guess_weight_name
    files_in_repo = model_info(pretrained_model_name_or_path_or_dict).siblings
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\huggingface_hub\utils\_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\huggingface_hub\hf_api.py", line 1697, in model_info
    r = get_session().get(path, headers=headers, timeout=timeout, params=params)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\requests\sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\requests\sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\requests\sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\huggingface_hub\utils\_http.py", line 63, in send
    return super().send(request, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\dev\push\faster\unify\fastsdcpu\env\Lib\site-packages\requests\adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError('HTTPSConnectionPool(host=\'huggingface.co\', port=443): Max retries exceeded with url: /api/models/latent-consistency/lcm-lora-sdv1-5 (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x00000198E0086790>: Failed to resolve \'huggingface.co\' ([Errno 11001] getaddrinfo failed)"))'), '(Request ID: 50f553fe-20fc-4790-a449-39f185542a17)')

System Info

Who can help?

@sayakpaul @patrickvonplaten

sayakpaul commented 10 months ago

Please download the weights first from latent-consistency/lcm-lora-sdv1-5. Put them in a directory and then run your code:

from diffusers import DiffusionPipeline, LCMScheduler

pipeline = DiffusionPipeline.from_pretrained(
    "Lykon/dreamshaper-8",
    local_files_only=True,
)
pipeline.load_lora_weights("your-dir-where-the-weights-are-located")
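For completeness, the suggested workaround can be sketched end-to-end. This is a minimal sketch, assuming both the base model and the LCM-LoRA weights are already in the local Hub cache; the helper names `split_weight_path` and `load_pipeline_with_lora_offline` are mine, not part of diffusers:

```python
import os


def split_weight_path(ckpt_path):
    """Split a cached checkpoint path into (directory, weight_name) --
    the two pieces load_lora_weights needs to skip any Hub lookup."""
    return os.path.split(ckpt_path)


def load_pipeline_with_lora_offline():
    # Imports are local so this sketch stays importable without diffusers installed.
    from huggingface_hub import hf_hub_download
    from diffusers import DiffusionPipeline

    # local_files_only=True makes hf_hub_download resolve the file from the
    # on-disk cache without touching the network.
    ckpt_path = hf_hub_download(
        repo_id="latent-consistency/lcm-lora-sdv1-5",
        filename="pytorch_lora_weights.safetensors",
        local_files_only=True,
    )
    lora_dir, weight_name = split_weight_path(ckpt_path)

    pipeline = DiffusionPipeline.from_pretrained(
        "Lykon/dreamshaper-8", local_files_only=True
    )
    # Passing weight_name explicitly avoids the Hub query that fails offline.
    pipeline.load_lora_weights(lora_dir, weight_name=weight_name)
    return pipeline
```

Because the concrete file path is resolved locally and the weight name is passed explicitly, no network call should be needed.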
rupeshs commented 10 months ago

@sayakpaul This model is already cached but it is still not working. It seems like lora_state_dict is not considering the local_files_only argument. I also tried pipeline.load_lora_weights("latent-consistency/lcm-lora-sdv1-5", local_files_only=True)

sayakpaul commented 10 months ago

Is the model cached in latent-consistency/lcm-lora-sdv1-5?

rupeshs commented 10 months ago

yes

sayakpaul commented 10 months ago

When you specify latent-consistency/lcm-lora-sdv1-5, the internal norm is to look for that location on the Hugging Face Hub.

This is why I think you should download the weights to a local directory first, as suggested above.

rupeshs commented 10 months ago

@sayakpaul Then I'm wondering what the purpose of the local_files_only argument is in the load_lora_weights function. It should load the weights from the cache folder (not manually downloaded ones), right?

rupeshs commented 10 months ago

Please note that DiffusionPipeline works fine with the local_files_only argument and can load already-cached weights. It seems like there is some problem with load_lora_weights.

sayakpaul commented 10 months ago

Hmm, I am unable to reproduce this.

This is what I did.

I downloaded a LoRA checkpoint like so (with internet turned on):

from huggingface_hub import hf_hub_download

repo_id = "sayakpaul/new-lora-check-v15"
lora_id = "pytorch_lora_weights.safetensors"
ckpt_path = hf_hub_download(repo_id=repo_id, filename=lora_id)

I then turned my internet off and ran:

from huggingface_hub import hf_hub_download

repo_id = "sayakpaul/new-lora-check-v15"
lora_id = "pytorch_lora_weights.safetensors"
ckpt_path = hf_hub_download(repo_id=repo_id, filename=lora_id, local_files_only=True)

It worked fine.

I am showing hf_hub_download because that is what we use inside of load_lora_weights(). Relevant call sites (ordered):

  1. https://github.com/huggingface/diffusers/blob/2a111bc9febb6121bc270830c0afa302b3337490/src/diffusers/loaders/lora.py#L105
  2. https://github.com/huggingface/diffusers/blob/2a111bc9febb6121bc270830c0afa302b3337490/src/diffusers/loaders/lora.py#L234
  3. https://github.com/huggingface/diffusers/blob/2a111bc9febb6121bc270830c0afa302b3337490/src/diffusers/utils/hub_utils.py#L283
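The failing path in the traceback is step 2 above: when no weight_name is given, _best_guess_weight_name queries the Hub via model_info, which cannot work offline. A minimal sketch of the kind of offline guard that would avoid this — the function below is illustrative only, not the actual diffusers code:

```python
import os


def best_guess_weight_name(local_dir, local_files_only=False):
    """Illustrative stand-in for diffusers' _best_guess_weight_name:
    when offline, guess the weight file from a local directory listing
    instead of calling model_info() on the Hub."""
    offline = local_files_only or os.environ.get("HF_HUB_OFFLINE", "0") == "1"
    if offline:
        candidates = [
            f for f in os.listdir(local_dir) if f.endswith(".safetensors")
        ]
        if not candidates:
            raise ValueError(
                "Offline mode: pass weight_name explicitly; no .safetensors "
                f"file found in {local_dir!r}."
            )
        # Deterministic pick when several files match.
        return sorted(candidates)[0]
    # Online path (not sketched): query the Hub via model_info(...).siblings.
    raise NotImplementedError
```

The point of the sketch is only that the offline branch never needs a network call; the real fix landed as better error messages telling the user to pass weight_name.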

Cc: @Wauplin. Anything I am missing here?

rupeshs commented 10 months ago

@sayakpaul I just integrated diffusers with FastSD CPU, while testing I found this issue.

Current status of offline workflows with FastSD CPU. LCM - Working (Diffusion pipeline) LCM LoRA - Not working (Diffusion pipeline +Load lora ) LCM OpenVINO - Working (OV pipeline)

Refer : https://github.com/rupeshs/fastsdcpu/blob/main/src/backend/pipelines/lcm_lora.py

rupeshs commented 10 months ago

Hmm, I am unable to reproduce this. […]

Can you try with the exact code I attached to the issue?

sayakpaul commented 10 months ago

Can you try with the exact code I attached to the issue?

Yeah, I tried with local_files_only set to True for load_lora_weights(). It didn't work without an internet connection.

Wauplin commented 10 months ago

I haven't investigated further, but this looks like a duplicate of #6089, no?

spezialspezial commented 9 months ago

Maybe try:

kwargs = {"local_files_only": True, "weight_name": "pytorch_lora_weights.safetensors"}
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5", **kwargs)

This should pick up a previously downloaded LCM-LoRA from the local Hub cache on disk while offline, i.e. with HF_HUB_OFFLINE set or with sockets guarded (HF_HUB_OFFLINE is ignored here and there, ahem). At least it works here. Note the parameter name weight_name instead of filename.

Wauplin commented 9 months ago

HF_HUB_OFFLINE is ignored here and there, ahem

Yes indeed. A PR to fix this is in progress: https://github.com/huggingface/huggingface_hub/pull/1899. That way, any network calls are explicitly blocked in offline mode.
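The behaviour that PR enforces can be sketched as a guard that runs before any HTTP request. This is an illustrative sketch only — the class and function names below are mine, not the huggingface_hub implementation:

```python
import os


class OfflineModeError(RuntimeError):
    """Illustrative counterpart of huggingface_hub's offline-mode error."""


def guard_network_call(url):
    """Raise before performing any HTTP request when HF_HUB_OFFLINE is set,
    mirroring the explicit blocking the linked PR aims for."""
    if os.environ.get("HF_HUB_OFFLINE", "0").lower() in ("1", "on", "yes", "true"):
        raise OfflineModeError(
            f"Blocked network call to {url!r}: HF_HUB_OFFLINE is enabled."
        )
    # ...perform the real request here...
```

With such a guard, an offline load_lora_weights would fail fast with a clear message instead of a DNS resolution traceback like the one in this issue.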

sayakpaul commented 9 months ago

@rupeshs could you update your installations of huggingface_hub and diffusers to be from source (i.e., install them from the main branch) and see if the error messages are better and help you resolve the problem?

rupeshs commented 9 months ago

@sayakpaul Yes, I tried. [screenshot of the new error message]

sayakpaul commented 9 months ago

That is exactly what is expected here. You must specify the weight name, as indicated in the error message.

rupeshs commented 9 months ago

Thanks for the proper error message.

sayakpaul commented 9 months ago

Feel free to close the issue if you'd like.

rupeshs commented 9 months ago

Thanks @sayakpaul for the great support.