boschresearch / Divide-and-Bind

Official implementation of "Divide & Bind Your Attention for Improved Generative Semantic Nursing" (BMVC 2023 Oral)
https://sites.google.com/view/divide-and-bind
GNU Affero General Public License v3.0

Loss calculation fails on MPS #7

Closed: jwooldridge234 closed this issue 10 months ago

jwooldridge234 commented 10 months ago

I'm not certain if this is a Mac-specific issue, as I don't have a different system to test on. I'm running pipeline_divide_and_bind_latest.py, and after the first step of the (while target_indicator < target) loop in _perform_iterative_refinement_step it starts producing tensors with NaN values. Ultimately, the loss finishes with a value of -inf.

I tried to do some debugging, but I can't pin down the exact point where it fails. It isn't the loss calculation itself (although that emits the warning I've listed below): the attention maps held in AttentionStore also contain NaN values after the first step, and that's what leads to the loss calculation failure.
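(For reference, the kind of check involved here is nothing more than torch.isnan on the intermediate tensors; the helper below is just a sketch with placeholder names, not code from the pipeline.)

import torch

def report_nan(name, tensor):
    # flag any intermediate tensor that contains NaNs while stepping through the pipeline
    if torch.isnan(tensor).any():
        print(f"NaN detected in {name}, shape {tuple(tensor.shape)}")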

Here's the code I'm using to call the pipeline:

import torch
from divide_and_bind.pipeline_divide_and_bind_latest import StableDiffusionDivideAndBindPipeline

pipe = StableDiffusionDivideAndBindPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("mps")

prompt = "a purple dog and a green bench on the street,snowy driving scene"

# use get_indices function to find out indices of the tokens you want to alter
#pipe.get_indices(prompt)

token_indices = [3, 7]  # indices of "dog" and "bench" in the prompt
color_indices = [2, 6]  # indices of "purple" and "green"
loss_mode = 'tv_bind'
seed = 5555
generator = torch.Generator("mps").manual_seed(seed)  # calling this on CPU makes no difference

images = pipe(
    prompt=prompt,
    token_indices=token_indices,
    color_indices=color_indices,
    guidance_scale=7.5,
    generator=generator,
    num_inference_steps=10,
    max_iter_to_alter=5,
    loss_mode=loss_mode
).images

image = images[0]

image.save(f"./images/{prompt}_{seed}.png")

Warning from the loss calculation: UserWarning: Using a target size (torch.Size([15, 16])) that is different to the input size (torch.Size([1, 16])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
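(That warning is just PyTorch flagging a shape mismatch between the loss input and target. A minimal standalone reproduction with the shapes from the message, using mse_loss purely for illustration since I haven't checked which loss the pipeline actually calls:)

import torch
import torch.nn.functional as F

inp = torch.zeros(1, 16)     # input size from the warning
tgt = torch.zeros(15, 16)    # target size from the warning
loss = F.mse_loss(inp, tgt)  # emits the same UserWarning and silently broadcasts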

I'm running the latest PyTorch nightly (2.3.0.dev20240117), but it also fails on the latest stable release.

Please let me know if you see any errors in my implementation.

Many thanks!

YumengLi007 commented 10 months ago

Hi @jwooldridge234, I don't see any obvious errors. Which version of Diffusers are you using? The provided updated pipeline has only been tested with Diffusers 0.21.4; versions higher than that may not work.
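A quick way to confirm the installed version:

import diffusers
print(diffusers.__version__)  # the updated pipeline was only tested against 0.21.4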

jwooldridge234 commented 10 months ago

@YumengLi007 Yeah, I just checked, and I'm running diffusers 0.21.4. EDIT: Just tested running it on CPU and it finishes with a loss of -0.08085955679416656, so it's definitely an MPS backend issue.

YumengLi007 commented 10 months ago

Good to know. Thanks for digging into the issue.

jwooldridge234 commented 10 months ago

Did a bit more digging and found the exact line where it fails (line 129 of the pipeline, in DivideBindAttnProcessor): attention_probs = attn.get_attention_scores(query, key, attention_mask)
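A simple NaN guard around that call is enough to confirm it (a sketch only; attn, query, key and attention_mask come from the surrounding DivideBindAttnProcessor code, and torch is already imported in the pipeline):

attention_probs = attn.get_attention_scores(query, key, attention_mask)
if torch.isnan(attention_probs).any():
    # debugging check only, not part of the pipeline
    print("NaN in attention_probs on MPS")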

Looks like there's a bug with diffusers.models.attention_processor on MPS. I'll raise this with them directly and close this issue. Thanks for being so responsive!

YumengLi007 commented 10 months ago

Hi @jwooldridge234, just saw this. Not sure if switching to torch.float32 could fix the issue 😅

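That would mean loading the pipeline in full precision, along these lines (untested on MPS on my side, so just a suggestion):

import torch
from divide_and_bind.pipeline_divide_and_bind_latest import StableDiffusionDivideAndBindPipeline

# same setup as above, but with float32 instead of float16
pipe = StableDiffusionDivideAndBindPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float32
).to("mps")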