huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

Tensor not on the same device for long prompts in `AuraFlowPipeline` #8935

Closed lmxyy closed 3 months ago

lmxyy commented 3 months ago

Describe the bug

Generating an image with a long prompt in `AuraFlowPipeline` raises the following error:

File "~/workspace/anaconda3/envs/diffusers/lib/python3.11/site-packages/diffusers/pipelines/aura_flow/pipeline_aura_flow.py", line 507, in __call__
    ) = self.encode_prompt(
        ^^^^^^^^^^^^^^^^^^^
  File "~/workspace/anaconda3/envs/diffusers/lib/python3.11/site-packages/diffusers/pipelines/aura_flow/pipeline_aura_flow.py", line 267, in encode_prompt
    if untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not torch.equal(
                                                                     ^^^^^^^^^^^^
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument other in method wrapper_CUDA__equal)

With a shorter prompt, generation works.

Reproduction

import torch
from diffusers import AuraFlowPipeline

pipeline = AuraFlowPipeline.from_pretrained("fal/AuraFlow", torch_dtype=torch.float16).to("cuda")

image = pipeline(
    prompt="Photography, cinematic, Amazon mythological creature, giving birth, mythology, bonnacon, twinheaded bullsnakehuman, animalistichumanoid creature giving birth to animal, hybrid, half animal, bull head, luscious, shapeshifter, trickster, snake skin, mycelium, mythology, transgender, travesti, shot on portra 160, brazilian, nonbinary, mycelium garments, fantastical, Brazil, dreamy, utopic, transgender, botanical, jungle, beautiful, amazing colors, mythology hybrid creature, wide angle, mythology, folclore, full body, full perspective, soft blue and green colors, purple skin, pastel colors, hybrid, 35mm film, shot on portra 160, mythical hybrid creature, mythology, wings, jungle, louvre, plants, full perspective, Ephemerality, full length, transience, fleeting, ominous, wistful, blowing away, dreamlike, deep perspective, Super  Resolution, Advanced, photography, ultrarealistic, photo realistic, 16k, hyper realistic, cinematic lighting, intricate, realism, maximalist detail, octane render, Artstation, extreme high render ",
    height=1024,
    width=1024,
    num_inference_steps=50,
    generator=torch.Generator().manual_seed(666),
    guidance_scale=3.5,
).images[0]

image.save("tmp.png")

If the prompt is short, for example:

import torch
from diffusers import AuraFlowPipeline

pipeline = AuraFlowPipeline.from_pretrained("fal/AuraFlow", torch_dtype=torch.float16).to("cuda")

image = pipeline(
    prompt="photo of a cat",
    height=1024,
    width=1024,
    num_inference_steps=50,
    generator=torch.Generator().manual_seed(666),
    guidance_scale=3.5,
).images[0]

then it works.
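
The traceback points at a check of the form `untruncated_ids.shape[-1] >= text_input_ids.shape[-1] and not torch.equal(...)`. The sketch below illustrates why only long prompts would reach `torch.equal`, and one way such a comparison can be made device-safe. It assumes (this is not confirmed in the thread) that the truncated IDs are padded to a fixed maximum length and moved to the GPU, while the untruncated IDs stay on the CPU. `ids_were_truncated` is a hypothetical helper, not part of diffusers, and this is not necessarily how the library resolves the issue.

```python
import torch


def ids_were_truncated(text_input_ids: torch.Tensor, untruncated_ids: torch.Tensor) -> bool:
    """Hypothetical helper: report whether truncation dropped any tokens."""
    # Short prompts: the untruncated sequence is shorter than the padded one,
    # so we return early and torch.equal is never called.
    if untruncated_ids.shape[-1] < text_input_ids.shape[-1]:
        return False
    # Long prompts reach torch.equal, which requires both tensors to be on
    # the same device. Moving one operand to the other's device avoids the
    # "Expected all tensors to be on the same device" RuntimeError.
    same_device_ids = untruncated_ids.to(text_input_ids.device)
    return not torch.equal(text_input_ids, same_device_ids)


# Toy CPU tensors standing in for tokenizer output (max length 5).
padded_short = torch.tensor([[5, 6, 7, 0, 0]])       # short prompt, padded
raw_short = torch.tensor([[5, 6, 7]])                # same prompt, unpadded
truncated_long = torch.tensor([[1, 2, 3, 4, 5]])     # long prompt, cut to 5
raw_long = torch.tensor([[1, 2, 3, 4, 5, 6, 7]])     # long prompt, uncut

print(ids_were_truncated(padded_short, raw_short))    # False
print(ids_were_truncated(truncated_long, raw_long))   # True
```

On a CUDA machine, the same helper can be called with `text_input_ids` on `cuda:0` and `untruncated_ids` on the CPU; the `.to(...)` call is a no-op when both tensors already share a device.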

System Info

Python 3.11.9
CUDA 12.2
torch 2.3.1
diffusers 0.30.0.dev0
Platform: Ubuntu 20.04.5 LTS

Who can help?

@yiyixuxu

sayakpaul commented 3 months ago

https://github.com/huggingface/diffusers/pull/8937 should fix this.