huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Loading model and LoRA in Python causes tensor shape mismatch #7770

Open levoz92 opened 5 months ago

levoz92 commented 5 months ago

Describe the bug

Hi.

I am trying to run inpainting using the realisticStockPhoto model with the SDXL_FILM_PHOTOGRAPHY_STYLE LoRA.

device = "cuda"
model_path = "realisticStockPhoto_v20.safetensors"

pipe = StableDiffusionInpaintPipeline.from_single_file(model_path,torch_dtype=torch.float16,safetensors=True).to(device)
pipeline = AutoPipelineForText2Image.from_single_file("SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors", torch_dtype=torch.float16).to(device)

I have already tested this model and LoRA in Gradio and was very satisfied with the results. However, this is the error I am getting:

ValueError                                Traceback (most recent call last)
<ipython-input-11-ec0fbfd88bf5> in <cell line: 5>()
      3 model_path = "realisticStockPhoto_v20.safetensors"
      4 
----> 5 pipe = StableDiffusionInpaintPipeline.from_single_file(model_path,torch_dtype=torch.float16,safetensors=True).to(device)
      6 pipeline = AutoPipelineForText2Image.from_single_file("SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors", torch_dtype=torch.float16).to(device)

4 frames
/usr/local/lib/python3.10/dist-packages/diffusers/models/modeling_utils.py in load_model_dict_into_meta(model, state_dict, device, dtype, model_name_or_path)
    150         if empty_state_dict[param_name].shape != param.shape:
    151             model_name_or_path_str = f"{model_name_or_path} " if model_name_or_path is not None else ""
--> 152             raise ValueError(
    153                 f"Cannot load {model_name_or_path_str}because {param_name} expected shape {empty_state_dict[param_name]}, but got {param.shape}. If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example."
    154             )

ValueError: Cannot load because conv_in.weight expected shape tensor(..., device='meta', size=(320, 9, 3, 3)), but got torch.Size([320, 4, 3, 3]). If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example.

Where am I going wrong?

Reproduction

The error message suggests two parameters, low_cpu_mem_usage and ignore_mismatched_sizes. I tried passing both, but the error stayed the same.

Logs

No response

System Info

Colab

Python 3.10.12
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.3 LTS
Release:    22.04
Codename:   jammy
Wed Apr 24 18:19:59 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA L4                      Off | 00000000:00:03.0 Off |                    0 |
| N/A   33C    P8              11W /  72W |      1MiB / 23034MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Tue_Aug_15_22:02:13_PDT_2023
Cuda compilation tools, release 12.2, V12.2.140
Build cuda_12.2.r12.2/compiler.33191640_0

Who can help?

No response

asomoza commented 5 months ago

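Hi, this happens because you're combining from_single_file and StableDiffusionInpaintPipeline with a regular (non-inpainting) checkpoint. The inpainting UNet expects 9 input channels, while a regular checkpoint provides 4, so you'll need to add num_in_channels=4: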

pipeline = StableDiffusionInpaintPipeline.from_single_file(
    model_path, 
    torch_dtype=torch.float16, 
    num_in_channels=4).to("cuda")

You can read more about it in this discussion https://github.com/huggingface/diffusers/discussions/7163#discussioncomment-8661046
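
For readers hitting the same mismatch, one way to check which case applies before loading is to read the checkpoint's header. The sketch below is not part of diffusers: it parses the safetensors header with the standard library only and reports the input-channel count of the UNet's first convolution. The helper names and the key `model.diffusion_model.input_blocks.0.0.weight` are assumptions (that key is the usual name of conv_in in original-format Stable Diffusion checkpoints and may differ in other files).

```python
import json
import struct

# Assumed key of the UNet's first convolution in original-format checkpoints.
CONV_IN_KEY = "model.diffusion_model.input_blocks.0.0.weight"


def read_safetensors_shapes(path):
    """Return {tensor_name: shape} by parsing only the safetensors header."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian unsigned header length,
        # followed by that many bytes of UTF-8 JSON describing each tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return {
        name: tuple(info["shape"])
        for name, info in header.items()
        if name != "__metadata__"
    }


def unet_in_channels(path, key=CONV_IN_KEY):
    """Return the input-channel count of conv_in, or None if the key is absent."""
    shape = read_safetensors_shapes(path).get(key)
    return shape[1] if shape is not None else None
```

A result of 4 would point to a regular checkpoint (pass num_in_channels=4 as above), while 9 would point to a dedicated inpainting checkpoint (4 latent channels, 1 mask channel, and 4 masked-image latent channels).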

Also, you're not loading the LoRA correctly. A LoRA is not a standalone pipeline, so it should be attached to the existing pipeline like this:

pipeline.load_lora_weights(".", weight_name="SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors", adapter_name="film")
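
The call above takes a directory and a file name separately. As a small convenience, a wrapper can derive both from a single path. This is only a sketch: `load_lora` is a hypothetical helper name, and it assumes `pipe` is an already-loaded diffusers pipeline that exposes `load_lora_weights` with the `weight_name` and `adapter_name` keywords used in this thread.

```python
from pathlib import Path


def load_lora(pipe, lora_path, adapter_name):
    """Attach a LoRA file to an already-loaded pipeline."""
    lora_path = Path(lora_path)
    pipe.load_lora_weights(
        str(lora_path.parent),       # directory containing the LoRA file
        weight_name=lora_path.name,  # file name only
        adapter_name=adapter_name,   # label used to refer to this LoRA later
    )
    return pipe
```

With a file in the current directory, `load_lora(pipeline, "SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors", "film")` would issue the same call as the line above.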

levoz92 commented 5 months ago

@asomoza I'm getting this error now:

TypeError: StableDiffusionInpaintPipeline.__init__() got an unexpected keyword argument 'tokenizer_2'

This is my updated code:

device = "cuda"
model_path = "weights/realisticStockPhoto_v20.safetensors"

pipe = StableDiffusionInpaintPipeline.from_single_file(
    model_path, 
    torch_dtype=torch.float16, 
    num_in_channels=4).to("cuda")

pipe.load_lora_weights(".", weight_name="weights/SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors", adapter_name="film")

asomoza commented 5 months ago

Oh, I missed that: you're using an SDXL model, so you'll need to use StableDiffusionXLInpaintPipeline instead:

device = "cuda"
model_path = "weights/realisticStockPhoto_v20.safetensors"

pipe = StableDiffusionXLInpaintPipeline.from_single_file(
    model_path, 
    torch_dtype=torch.float16, 
    num_in_channels=4).to("cuda")

pipe.load_lora_weights(".", weight_name="weights/SDXL_FILM_PHOTOGRAPHY_STYLE_BetaV0.4.safetensors", adapter_name="film")
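
The tokenizer_2 error earlier in the thread comes from loading an SDXL checkpoint, which carries two text encoders, into the non-XL pipeline class. A rough pre-check on the checkpoint's tensor names can suggest which class to try. The sketch below is an illustration, not how diffusers itself decides: the function name and both key prefixes are assumptions based on the usual layout of original-format checkpoints, and the tensor names are expected to come from the caller (for example from `safetensors.safe_open(path, "pt").keys()`).

```python
# Assumption: SDXL checkpoints store a second text encoder under this prefix,
# while SD 1.x/2.x checkpoints keep their single text encoder under
# "cond_stage_model.".
SDXL_PREFIX = "conditioner.embedders.1."
SD_PREFIX = "cond_stage_model."


def guess_inpaint_pipeline(keys):
    """Return the name of the diffusers inpaint pipeline class that likely fits."""
    keys = list(keys)
    if any(k.startswith(SDXL_PREFIX) for k in keys):
        return "StableDiffusionXLInpaintPipeline"
    if any(k.startswith(SD_PREFIX) for k in keys):
        return "StableDiffusionInpaintPipeline"
    # Unknown layout: leave the decision to the caller.
    return None
```

The function returns a class name rather than importing diffusers, so it can run anywhere. A None result only means the heuristic did not recognise the file.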

levoz92 commented 5 months ago

Thanks. It works.

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.