LoRA (civitai format) with enable_model_cpu_offload

jpmerc commented 1 year ago

Describe the bug

Loading a LoRA (civitai format) into a ControlNet pipeline that has enable_model_cpu_offload enabled fails at inference with a device-mismatch RuntimeError (cuda:0 vs. cpu). I have not tested this with basic Stable Diffusion. See the code and logs below; there is also a notebook to reproduce the problem.

Reproduction

Colab notebook available here: https://colab.research.google.com/drive/1j-MEPv6gJyg16QfyjJdL9cE80J7qfSES?usp=sharing

!pip install -q diffusers==0.17.1 transformers xformers git+https://github.com/huggingface/accelerate.git
!pip install -q opencv-contrib-python
!pip install -q controlnet_aux

# Load the input image
from diffusers.utils import load_image

image = load_image("https://hf.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png")

# Build a 3-channel Canny edge map to use as the ControlNet conditioning image
import cv2
from PIL import Image
import numpy as np

image = np.array(image)

low_threshold = 100
high_threshold = 200

image = cv2.Canny(image, low_threshold, high_threshold)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
canny_image = Image.fromarray(image)

from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
import torch
from diffusers import UniPCMultistepScheduler

controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()
pipe.enable_xformers_memory_efficient_attention()

# Download a civitai-format LoRA and load it (note: CPU offload is already enabled at this point)
!wget https://civitai.com/api/download/models/15603 -O light_and_shadow.safetensors
pipe.load_lora_weights(".", weight_name="light_and_shadow.safetensors")

prompt = "rihanna, best quality, extremely detailed"
negative_prompt = "monochrome, lowres, bad anatomy, worst quality, low quality"
generator = torch.Generator(device="cpu").manual_seed(2)

image = pipe(prompt, canny_image, negative_prompt=negative_prompt, generator=generator, num_inference_steps=20).images[0]

Logs

RuntimeError                              Traceback (most recent call last)
<ipython-input-14-0687f72ed3ff> in <cell line: 5>()
      3 generator = torch.Generator(device="cpu").manual_seed(2)
      4 
----> 5 image = pipe(prompt, canny_image, negative_prompt=negative_prompt,
      6     generator=generator, num_inference_steps=20).images[0]
      7 

19 frames
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/linear.py in forward(self, input)
    112 
    113     def forward(self, input: Tensor) -> Tensor:
--> 114         return F.linear(input, self.weight, self.bias)
    115 
    116     def extra_repr(self) -> str:

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)
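
The error above says that one operand of a linear layer was on cuda:0 while another was still on cpu. One way to narrow down which weights were left behind is to group parameter names by device. The following is only a minimal diagnostic sketch, not part of diffusers: params_by_device is a hypothetical helper, FakeModule and its parameter names are stand-ins invented so the snippet runs without torch or a GPU, and the helper assumes only that the object exposes named_parameters() the way torch.nn.Module does.

```python
from collections import defaultdict
from types import SimpleNamespace


def params_by_device(module):
    """Group parameter names by the device they currently live on.

    Hypothetical helper: works with any object exposing named_parameters(),
    such as a torch.nn.Module.
    """
    groups = defaultdict(list)
    for name, param in module.named_parameters():
        groups[str(param.device)].append(name)
    return dict(groups)


# Stand-in for a real module so the sketch runs without torch or a GPU.
# The parameter names and placements below are invented for illustration.
class FakeModule:
    def __init__(self, placement):
        self._placement = placement  # {parameter name: device string}

    def named_parameters(self):
        for name, device in self._placement.items():
            yield name, SimpleNamespace(device=device)


fake_unet = FakeModule({
    "to_q.weight": "cuda:0",
    "to_q_lora.down.weight": "cpu",  # hypothetical LoRA weight left on the CPU
    "to_q_lora.up.weight": "cpu",
})

report = params_by_device(fake_unet)
print(report)
```

With the real pipeline, the equivalent call would be params_by_device(pipe.unet), run right after the failing pipe(...) call. Because model CPU offload moves whole components between devices over time, the result depends on when it is called, and weights that are not registered as submodule parameters would not show up at all.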

System Info

diffusers version: 0.17.1
Platform: Linux-5.15.107+-x86_64-with-glibc2.31
Python version: 3.10.12
PyTorch version (GPU?): 2.0.1+cu118 (True)
Huggingface_hub version: 0.16.2
Transformers version: 4.30.2
Accelerate version: 0.21.0.dev0
xFormers version: 0.0.20

Who can help?

@williamberman, @patrickvonplaten, and @sayakpaul

pcuenca commented 1 year ago

Possibly related: #3922.

sayakpaul commented 1 year ago

@jpmerc let's follow #3922 for this, as Pedro mentioned.

patrickvonplaten commented 1 year ago

Let's try to fix this in one go with #3922. Sayak, would you be interested in giving this a try?

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
