huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

playground cuda error #8523

Open Honey-666 opened 5 months ago

Honey-666 commented 5 months ago

https://github.com/huggingface/diffusers/blob/896fb6d8d7c10001eb2a92568be7b4bd3d5ddea3/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_img2img.py#L728

When I initialize the pipeline without calling to('cuda'), the model stays on the CPU, so self.device at the line linked above resolves to the CPU. Should we change it to device?

error msg:

  init_latents = (init_latents - latents_mean) * self.vae.config.scaling_factor / latents_std
  RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

environment:

diffusers==0.28.2

this is my code:

import torch
from PIL import Image
from diffusers import StableDiffusionXLImg2ImgPipeline, EDMDPMSolverMultistepScheduler

sd_xl_path = '../models/Stable-diffusion/playground-v2.5-xl-aesthetic'
img_path = './image_1.png'

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(sd_xl_path,
                                                        torch_dtype=torch.float16,
                                                        add_watermarker=False,
                                                        )

pipe.enable_model_cpu_offload()
pipe.scheduler = EDMDPMSolverMultistepScheduler.from_config(pipe.scheduler.config, use_karras_sigmas=True)

image = Image.open(img_path)

images = pipe(prompt='a beautiful girl',
              image=image,
              guidance_scale=3.0,
              num_inference_steps=30,
              strength=0.5
              ).images
print(images)

tolgacangoz commented 5 months ago

Could you just upgrade diffusers to the latest stable version?

yiyixuxu commented 5 months ago

hi @Honey-666

I think it should be device instead of self.device here - do you want to give it a try and if it works, maybe open a PR for us? 🤗

 latents_mean = latents_mean.to(device=self.device, dtype=dtype)

Honey-666 commented 5 months ago

> Could you just upgrade diffusers to the latest stable version?

I upgraded to the latest version and still hit the same problem.

Honey-666 commented 5 months ago

> hi @Honey-666
>
> I think it should be device instead of self.device here - do you want to give it a try and if it works, maybe open a PR for us? 🤗
>
> latents_mean = latents_mean.to(device=self.device, dtype=dtype)

Yes, I tried this modification and it works fine. I'd love to open a PR for it.
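
For readers following along, the change discussed above might look roughly like the sketch below. It is not the actual diffusers source: `normalize_init_latents` is a hypothetical standalone helper and the numeric values are made up for illustration. The only point it shows is that the mean and std tensors are moved to the `device` argument supplied by the caller rather than to `self.device`.

```python
import torch


def normalize_init_latents(init_latents, latents_mean, latents_std,
                           scaling_factor, device, dtype):
    """Hypothetical helper mirroring the normalization line from the traceback."""
    # Build the statistics as (1, C, 1, 1) tensors so they broadcast over the latents.
    mean = torch.tensor(latents_mean).view(1, -1, 1, 1)
    std = torch.tensor(latents_std).view(1, -1, 1, 1)
    # Use the device passed in by the caller, not the device the weights sit on.
    mean = mean.to(device=device, dtype=dtype)
    std = std.to(device=device, dtype=dtype)
    init_latents = init_latents.to(device=device, dtype=dtype)
    return (init_latents - mean) * scaling_factor / std


# Falls back to the CPU when no GPU is available, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
init_latents = torch.ones(1, 4, 8, 8, device=device)
out = normalize_init_latents(init_latents, [0.5] * 4, [2.0] * 4,
                             scaling_factor=0.5, device=device, dtype=torch.float32)
print(out.shape, out.device)
```

With these made-up inputs every element is (1 - 0.5) * 0.5 / 2.0 = 0.125, and the result lives on the same device as the input latents.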

tolgacangoz commented 5 months ago

Where does playground-v2.5-xl-aesthetic come from? I couldn't reproduce with playgroundai/playground-v2.5-1024px-aesthetic.

- 🤗 Diffusers version: 0.28.2
- Platform: Ubuntu 20.04.6 LTS - Linux-5.15.133+-x86_64-with-glibc2.31
- Running on Kaggle Notebook
- Python version: 3.10.13
- PyTorch version (GPU?): 2.1.2 (True)
- Flax version (CPU?/GPU?/TPU?): 0.8.4 (gpu)
- Jax version: 0.4.26
- JaxLib version: 0.4.26.dev20240504
- Huggingface_hub version: 0.23.2
- Transformers version: 4.41.2
- Accelerate version: 0.30.1
- PEFT version: not installed
- Bitsandbytes version: not installed
- Safetensors version: 0.4.3
- xFormers version: not installed
- Accelerator: Tesla P100-PCIE-16GB, 16384 MiB VRAM
- Using GPU in script?
- Using distributed or parallel set-up in script?

haofanwang commented 5 months ago

Agree with @tolgacangoz, cannot reproduce this error with playgroundai/playground-v2.5-1024px-aesthetic.

Honey-666 commented 5 months ago

> Agree with @tolgacangoz, cannot reproduce this error with playgroundai/playground-v2.5-1024px-aesthetic.

Of course, I can't reproduce it every time either, but it always shows up when I least expect it. (screenshot: debug_image)

tolgacangoz commented 5 months ago

> Could you just upgrade diffusers to the latest stable version?

When I wrote this yesterday, I had reproduced the error with diffusers==0.28.2 but not with diffusers==0.29.0, so I simply suggested upgrading. Now, however, I am able to reproduce the issue with both versions 🤔.

yiyixuxu commented 5 months ago

Well, we should not use self.device here anyway, since it does not work with offloading.
self._execution_device does, and that is what is passed to this function as the device argument: https://github.com/huggingface/diffusers/blob/25d7bb3ea65bb6f79056252584e24eb545c5eb18/src/diffusers/pipelines/pipeline_utils.py#L985
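
To make the distinction above concrete, here is a toy model in plain Python. It is only a sketch of the idea and not the diffusers implementation: `ToyModule`, `ToyPipeline`, and the `execution_device` property are hypothetical stand-ins, and devices are represented as plain strings. The assumption it encodes is that, with offloading enabled, the weights rest on the CPU while a hook records the device each module will actually run on.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToyModule:
    """Stand-in for a model component; only tracks device bookkeeping."""
    weights_device: str = "cpu"
    # Set when offloading is enabled: where the module is moved right before it runs.
    hook_execution_device: Optional[str] = None


class ToyPipeline:
    def __init__(self, components: Dict[str, ToyModule]):
        self.components = components

    def to(self, device: str) -> "ToyPipeline":
        # Eagerly move every component, as an explicit .to('cuda') would.
        for module in self.components.values():
            module.weights_device = device
        return self

    def enable_model_cpu_offload(self, gpu: str = "cuda:0") -> None:
        # Weights stay on the CPU; only the hook knows the real compute device.
        for module in self.components.values():
            module.weights_device = "cpu"
            module.hook_execution_device = gpu

    @property
    def device(self) -> str:
        # Reports where the weights currently sit.
        for module in self.components.values():
            return module.weights_device
        return "cpu"

    @property
    def execution_device(self) -> str:
        # Prefers the hook's device when offloading is active.
        for module in self.components.values():
            if module.hook_execution_device is not None:
                return module.hook_execution_device
        return self.device


eager = ToyPipeline({"vae": ToyModule(), "unet": ToyModule()}).to("cuda:0")
offloaded = ToyPipeline({"vae": ToyModule(), "unet": ToyModule()})
offloaded.enable_model_cpu_offload()

print(eager.device, eager.execution_device)
print(offloaded.device, offloaded.execution_device)
```

In the eager case both properties agree, which may be why the bug is easy to miss. In the offloaded case `device` still reports the CPU while the inputs are already on the GPU, which matches the "cuda:0 and cpu" mismatch in the reported traceback.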

sayakpaul commented 5 months ago

Is this fixed, @yiyixuxu?

Honey-666 commented 4 months ago

The same problem seems to occur here: https://github.com/huggingface/diffusers/blob/e15a8e7f17529ebf4029eed07f982e14c7560546/src/diffusers/loaders/ip_adapter.py#L215

error msg:

`RuntimeError: Input type (torch.cuda.HalfTensor) and weight type (torch.HalfTensor) should be the same`

sayakpaul commented 4 months ago

Gently pinging @yiyixuxu here.

yiyixuxu commented 3 months ago

@Honey-666 I think the one with the load_ip_adapter is not so much of a problem because you need to run that before enable_model_cpu_offload anyway

Honey-666 commented 3 months ago

> @Honey-666 I think the one with the load_ip_adapter is not so much of a problem because you need to run that before enable_model_cpu_offload anyway

Thank you. Yes, I moved load_ip_adapter in front of enable_model_cpu_offload and it works fine. But that means I have to load the image_encoder when I initialize the pipe. If I initialize the pipe without it and only call load_ip_adapter later, once I see that ip_adapter_image is not empty in the __call__ arguments, I get the error above.
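
The call order described above can be sketched as a small helper. `load_ip_adapter_then_offload` and `RecordingPipe` are hypothetical names, and the path and file names are placeholders. The only library calls assumed are `pipe.load_ip_adapter(...)` with `subfolder` and `weight_name`, and `pipe.enable_model_cpu_offload()`. The stub exists just so the ordering can be checked without downloading a model.

```python
def load_ip_adapter_then_offload(pipe, repo_or_path, subfolder, weight_name):
    """Attach the IP-Adapter first, then enable offloading, in that order."""
    # 1. Load the adapter (and its image encoder) before any offloading is set up.
    pipe.load_ip_adapter(repo_or_path, subfolder=subfolder, weight_name=weight_name)
    # 2. Enable offloading last, once all components are already on the pipeline.
    pipe.enable_model_cpu_offload()
    return pipe


class RecordingPipe:
    """Hypothetical stand-in for a pipeline; it only records which methods ran."""

    def __init__(self):
        self.calls = []

    def load_ip_adapter(self, repo_or_path, subfolder, weight_name):
        self.calls.append("load_ip_adapter")

    def enable_model_cpu_offload(self):
        self.calls.append("enable_model_cpu_offload")


# Placeholder arguments; substitute a real adapter location when using a real pipeline.
dry_run = load_ip_adapter_then_offload(
    RecordingPipe(), "path/to/ip-adapter", "subfolder", "weights.bin"
)
print(dry_run.calls)
```

This does not address the lazy-loading case raised above, where the adapter is only attached after offloading is already enabled.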

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.