Meloneat opened 5 months ago
cc: @sayakpaul for visibility
Can I see the error trace?
@sayakpaul
should we pass down the `low_cpu_mem_usage` argument here?
https://github.com/huggingface/diffusers/blob/3a7e481611bc299416aaeed4207086d9ddca5852/src/diffusers/loaders/lora.py#L1243
@yiyixuxu good catch. We need to discuss how it should be handled with PEFT. @younesbelkada, could you advise here?
@sayakpaul this one might be related too https://github.com/huggingface/diffusers/issues/6560
May not be. We don't have a snippet there.
Hey everyone!
Per my understanding, `low_cpu_mem_usage` currently gets silently ignored with the PEFT backend: when working on the integration we found that using PEFT was as fast as old diffusers with `low_cpu_mem_usage` (I need to find the benchmarks we did again).
We could indeed try to implement `low_cpu_mem_usage` support for the PEFT backend. IMO it would simply require adding the `init_empty_weights` context manager when injecting the PEFT adapters (similarly to: https://github.com/huggingface/diffusers/blob/3a7e481611bc299416aaeed4207086d9ddca5852/src/diffusers/loaders/unet.py#L271), but we'll need to benchmark the benefit of that feature compared to the current approach.
I second what @sayakpaul said regarding #6560; I think this might be a different issue (possibly PEFT?). Happy to have a look once we have a reproducible snippet, if it turns out to be PEFT-related :D!
Let me know how everything sounds!
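To illustrate the idea, here is a minimal, pure-PyTorch sketch of the meta-device trick behind `init_empty_weights` (this is not diffusers/PEFT code; `init_on_meta` is a hypothetical helper, and it assumes torch >= 2.0, where `torch.device` works as a context manager). Modules built under a meta-device context allocate no real storage, so injecting adapters into a large model this way would cost essentially no host RAM until the weights are materialized.

```python
# Sketch of meta-device initialization (the mechanism used by
# accelerate's init_empty_weights). Parameters created under the
# meta device have shape/dtype metadata but no backing storage.
import torch
import torch.nn as nn

def init_on_meta(module_cls, *args, **kwargs):
    """Build a module whose parameters live on the meta device (no storage)."""
    with torch.device("meta"):
        return module_cls(*args, **kwargs)

# A 4096x4096 Linear would hold ~64 MiB of fp32 weights if materialized,
# but on the meta device it consumes essentially no memory.
layer = init_on_meta(nn.Linear, 4096, 4096)
print(layer.weight.device)  # meta
```

The real weights would then be loaded tensor-by-tensor from the checkpoint, skipping the throwaway random initialization entirely.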
Sounds good, Younes! Let's try to run a benchmark to make a decision here. Otherwise, there's no need to add it IMO.
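For the benchmark, a simple stdlib-only harness like the one below could compare wall time and peak host-RAM between the two loading paths (a sketch; the callable you pass in is a placeholder for the actual LoRA-loading function):

```python
# Stdlib-only benchmark harness: returns wall time and peak bytes
# allocated through Python's allocator for a single call to fn.
import time
import tracemalloc

def benchmark(fn, *args, **kwargs):
    """Return (elapsed_seconds, peak_bytes) for one call to fn."""
    tracemalloc.start()
    t0 = time.perf_counter()
    fn(*args, **kwargs)
    elapsed = time.perf_counter() - t0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak

# Example with a dummy "loader" that allocates ~8 MiB:
elapsed, peak = benchmark(lambda: bytearray(8 * 1024 * 1024))
print(f"{elapsed:.4f}s, peak {peak / 2**20:.1f} MiB")
```

Note that `tracemalloc` only sees allocations made through Python's allocator; for torch tensor storage, process RSS (e.g. via `psutil`) is a better proxy.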
@sayakpaul @younesbelkada thanks for looking into this!!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I guess this is not stale. @younesbelkada?
Yes indeed! I don't have the bandwidth to look into this :/ Also, I am not sure about the priority of this task; what do you think?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
What API design would you like to have changed or added to the library? Why?
I guess the `low_cpu_mem_usage` parameter needs to be added to the `load_lora_weights` method of `StableDiffusionXLLoraLoaderMixin`.
What use case would this enable or better enable? Can you give us a code example?
In my project, I have a mixed case between stabilityai/stable-diffusion-xl-refiner-1.0 and lcm-lora-sdxl, so I chose the `StableDiffusionXLImg2ImgPipeline` with the inherited `load_lora_weights` method for img2img. But it failed when my torch version was >= 1.9.0 and `low_cpu_mem_usage`'s default value was True. When I set `low_cpu_mem_usage` to False, it still didn't work.
like this: