PixArt-alpha / PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
https://pixart-alpha.github.io/PixArt-sigma-project/
GNU Affero General Public License v3.0

Diffusers model loading Issues #76

Closed nbardy closed 1 month ago

nbardy commented 2 months ago

I'm trying to run inference and hitting an error when loading the model via diffusers:

T5EncoderModel
Traceback (most recent call last):
  File "test_d.py", line 19, in <module>
    pipe.to(device)
  File "/home/paperspace/miniconda3/envs/pytorch_gpu/lib/python3.8/site-packages/diffusers/pipelines/pipeline_utils.py", line 464, in to
    module.to(device, dtype)
  File "/home/paperspace/miniconda3/envs/pytorch_gpu/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2692, in to
    return super().to(*args, **kwargs)
  File "/home/paperspace/miniconda3/envs/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1152, in to
    return self._apply(convert)
  File "/home/paperspace/miniconda3/envs/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/home/paperspace/miniconda3/envs/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  File "/home/paperspace/miniconda3/envs/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 802, in _apply
    module._apply(fn)
  [Previous line repeated 3 more times]
  File "/home/paperspace/miniconda3/envs/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 825, in _apply
    param_applied = fn(param)
  File "/home/paperspace/miniconda3/envs/pytorch_gpu/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1150, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
NotImplementedError: Cannot copy out of meta tensor; no data!

From current main diffusers:

Version:

(pytorch_gpu) paperspace@psy0glj6t:~/git/diffusers/examples/text_to_image$ pip freeze | grep diffusers
diffusers @ git+https://github.com/huggingface/diffusers@9d16daaf640462a0580dd1d503e71d246809a09a
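
A note for anyone debugging the same traceback: the message "Cannot copy out of meta tensor; no data!" means at least one parameter or buffer is still on PyTorch's meta device, which records shape and dtype but holds no values, so `.to(device)` has nothing to copy. The thread does not establish why that happened here. The helper below is a minimal sketch for listing which tensors are affected; `find_meta_tensors` is a hypothetical name, not part of diffusers or transformers.

```python
from typing import List

import torch
from torch import nn


def find_meta_tensors(module: nn.Module) -> List[str]:
    """Return the names of parameters and buffers that are still on the meta device.

    Hypothetical helper for diagnosis only; it does not load any weights.
    """
    names = [name for name, param in module.named_parameters() if param.is_meta]
    names += [name for name, buf in module.named_buffers() if buf.is_meta]
    return names


# Small demonstration: a layer created directly on the meta device has no data.
meta_layer = nn.Linear(4, 2, device="meta")
real_layer = nn.Linear(4, 2)

print(find_meta_tensors(meta_layer))  # names of the data-less tensors
print(find_meta_tensors(real_layer))  # empty list when everything is materialized
```

Running it on each pipeline component before `pipe.to(device)` (for example `find_meta_tensors(pipe.text_encoder)`) should show whether the text encoder is the module whose weights were never materialized, which the `T5EncoderModel` line above the traceback suggests but does not prove.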

nbardy commented 2 months ago

Found a few more issues while trying to work around this.

I had to add a few patches to diffusers. I now get an image, but it's not converging.

Here is my current script:

import torch
from diffusers import Transformer2DModel, PixArtSigmaPipeline
from transformers import T5EncoderModel

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
weight_dtype = torch.float16

encoder = T5EncoderModel.from_pretrained(
    "PixArt-alpha/pixart_sigma_sdxlvae_T5_diffusers",
    subfolder="text_encoder",
    torch_dtype=weight_dtype,
    use_safetensors=True,
)

transformer = Transformer2DModel.from_pretrained(
    "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",
    subfolder="transformer",
    torch_dtype=weight_dtype,
    use_safetensors=True,
)
pipe = PixArtSigmaPipeline.from_pretrained(
    "PixArt-alpha/pixart_sigma_sdxlvae_T5_diffusers",
    transformer=transformer,
    text_encoder=encoder,
    torch_dtype=weight_dtype,
    use_safetensors=True,
)
pipe.to(device)

# Enable memory optimizations.
# pipe.enable_model_cpu_offload()

prompt = "A small cactus with a happy face in the Sahara desert."
with torch.cuda.amp.autocast():  # Enable mixed precision to optimize performance
    image = pipe(prompt).images[0]
image.save("./cactus.png")

[image attachment]

lawrence-cj commented 2 months ago

Remove this line:

with torch.cuda.amp.autocast():  # Enable mixed precision to optimize performance
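
For reference, here is a sketch of the script above with that line removed and the pipeline call dedented. Everything else is unchanged from the posted script, apart from wrapping it in a function (`generate` is a name chosen here for illustration) and importing diffusers and transformers inside it. The local diffusers patches mentioned earlier are not shown in the thread, so they are not reproduced, and the thread does not confirm that this change alone fixes the output.

```python
import torch

PROMPT = "A small cactus with a happy face in the Sahara desert."


def generate(prompt: str = PROMPT, out_path: str = "./cactus.png") -> str:
    """Load the pipeline as in the posted script and save one image.

    The only functional change is that the pipeline call is no longer
    wrapped in torch.cuda.amp.autocast().
    """
    # Imported here so that defining the function does not load the libraries.
    from diffusers import Transformer2DModel, PixArtSigmaPipeline
    from transformers import T5EncoderModel

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    weight_dtype = torch.float16

    encoder = T5EncoderModel.from_pretrained(
        "PixArt-alpha/pixart_sigma_sdxlvae_T5_diffusers",
        subfolder="text_encoder",
        torch_dtype=weight_dtype,
        use_safetensors=True,
    )
    transformer = Transformer2DModel.from_pretrained(
        "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS",
        subfolder="transformer",
        torch_dtype=weight_dtype,
        use_safetensors=True,
    )
    pipe = PixArtSigmaPipeline.from_pretrained(
        "PixArt-alpha/pixart_sigma_sdxlvae_T5_diffusers",
        transformer=transformer,
        text_encoder=encoder,
        torch_dtype=weight_dtype,
        use_safetensors=True,
    )
    pipe.to(device)

    # No autocast context: the weights are already loaded in weight_dtype.
    image = pipe(prompt).images[0]
    image.save(out_path)
    return out_path
```

Calling `generate()` downloads the checkpoints named in the original script and writes `./cactus.png`.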