omriav / blended-latent-diffusion

Official implementation for "Blended Latent Diffusion" [SIGGRAPH 2023]
https://omriavrahami.com/blended-latent-diffusion-page/
MIT License

Mismatched tensors when using stable diffusion implementation #9

Closed (JianingF closed this issue 1 year ago)

JianingF commented 1 year ago

When running the code with text_editing_stable_diffusion.py, I am getting the following error when using the sample image and mask:

FutureWarning: Accessing config attribute in_channels directly via 'UNet2DConditionModel' object attribute is deprecated. Please access 'in_channels' over 'UNet2DConditionModel's config object instead, e.g. 'unet.config.in_channels'.
  (batch_size, self.unet.in_channels, height // 8, width // 8),
Traceback (most recent call last):
  File "/nfshomes/jianing/project_files/blended-latent-diffusion/scripts/text_editing_stable_diffusion.py", line 167, in <module>
    results = bld.edit_image(
              ^^^^^^^^^^^^^^^
  File "/nfshomes/jianing/CAAR/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/nfshomes/jianing/project_files/blended-latent-diffusion/scripts/text_editing_stable_diffusion.py", line 127, in edit_image
    noise_source_latents = self.scheduler.add_noise(
                           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nfshomes/jianing/CAAR/lib/python3.11/site-packages/diffusers/schedulers/scheduling_ddim.py", line 468, in add_noise
    noisy_samples = sqrt_alpha_prod * original_samples + sqrt_one_minus_alpha_prod * noise
RuntimeError: The size of tensor a (84) must match the size of tensor b (64) at non-singleton dimension 3

omriav commented 1 year ago

Hi,

Thanks for sharing. I think the problem is that you used an image resolution that is not compatible with Stable Diffusion (512x512 for the base model). I have added an explicit resize to the code to prevent this in the future, so you can pull the new changes.
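
The numbers in the error fit this diagnosis. The line quoted in the warning allocates noise of size height // 8 by width // 8, so a 512x512 target gives a latent side of 64, while a latent side of 84 would correspond to an input roughly 672 pixels wide (84 x 8; the actual image size is not stated in the thread). The sketch below only reproduces that arithmetic; `latent_hw` and `check_compatible` are hypothetical helper names and are not part of the repository.

```python
from typing import Tuple

# Assumption: the VAE downsamples each spatial side by 8, matching the
# "height // 8, width // 8" expression quoted in the warning.
VAE_DOWNSAMPLE = 8


def latent_hw(height: int, width: int) -> Tuple[int, int]:
    """Return the latent (height, width) for an image of the given pixel size."""
    return height // VAE_DOWNSAMPLE, width // VAE_DOWNSAMPLE


def check_compatible(image_hw: Tuple[int, int],
                     target_hw: Tuple[int, int] = (512, 512)) -> None:
    """Raise early if the image latents would not match the generated noise."""
    image_latent = latent_hw(*image_hw)
    noise_latent = latent_hw(*target_hw)
    if image_latent != noise_latent:
        raise ValueError(
            f"image latents {image_latent} do not match noise latents "
            f"{noise_latent}; resize the image to {target_hw} first"
        )


print(latent_hw(512, 512))  # latent size for the base resolution
print(latent_hw(512, 672))  # latent size for a hypothetical 672-pixel-wide image
```

A check like this fails with a readable message before the scheduler is reached, instead of surfacing as a broadcasting error inside add_noise.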

Thanks.

JianingF commented 1 year ago

I changed the image dimensions and there is no problem now. Thanks!
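
For readers who hit the same error, one way to change the image dimensions is to resize both the image and the mask before passing them to the script. This is a minimal sketch using Pillow, assuming the 512x512 base resolution mentioned above; `resize_for_sd` is a hypothetical helper and is not the resize that was added to the repository.

```python
from PIL import Image

# Assumption: base Stable Diffusion resolution, given as (width, height).
TARGET_SIZE = (512, 512)


def resize_for_sd(image, mask, size=TARGET_SIZE):
    """Resize an image and its mask to the same (width, height)."""
    # Bicubic interpolation gives smooth results for the photo.
    image = image.convert("RGB").resize(size, Image.BICUBIC)
    # Nearest-neighbour keeps the mask values hard instead of blurring the edge.
    mask = mask.convert("L").resize(size, Image.NEAREST)
    return image, mask


# Stand-in inputs; in practice these would come from Image.open(...).
example_image = Image.new("RGB", (672, 512))
example_mask = Image.new("L", (672, 512))
resized_image, resized_mask = resize_for_sd(example_image, example_mask)
print(resized_image.size, resized_mask.size)
```

Note that resizing a non-square image straight to 512x512 changes its aspect ratio; cropping to a square first is an alternative if distortion matters.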