PixArt-alpha / PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
https://pixart-alpha.github.io/PixArt-sigma-project/
GNU Affero General Public License v3.0

Unexpected results with `diffusers` #29

Closed chenxwh closed 2 months ago

chenxwh commented 3 months ago

Hi, thanks for the great work and the integration into diffusers!

However, when I run the snippets provided here exactly as written, I get the strange results below. Any idea what might have gone wrong?

(image: cactus)

Thank you!

Beinsezii commented 3 months ago

The official diffusers converted weights are broken. Use something like niklasku/PixArt-Sigma-XL-2-1024-MS instead.

lawrence-cj commented 3 months ago

Did you upgrade diffusers with `pip install git+https://github.com/huggingface/diffusers`?
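A quick way to answer that question is to print the version that Python actually imports. The sketch below uses only the standard library. The helper names `installed_version` and `looks_like_dev_build` are made up for illustration, and the idea that a source install reports a `.dev` suffix is an assumption about how the package is versioned, not something confirmed in this thread.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def looks_like_dev_build(ver: Optional[str]) -> bool:
    """Heuristic (assumption): builds installed from a git checkout carry a '.dev' suffix."""
    return ver is not None and ".dev" in ver


if __name__ == "__main__":
    ver = installed_version("diffusers")
    print("diffusers version:", ver)
    print("looks like a source/dev build:", looks_like_dev_build(ver))
```

If this prints a plain release number, the environment is probably still using an older wheel rather than the freshly pulled main branch.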

Beinsezii commented 3 months ago

"PixArt-alpha/PixArt-Sigma-XL-2-1024-MS" as the transformer: (image 00243)

"niklasku/PixArt-Sigma-XL-2-1024-MS" as the transformer, identical parameters otherwise: (image 00242)

I just pulled diffusers/main 5 minutes ago to make these images.

Using ./scripts/diffusers_patches.py with no modifications to the pipeline.

lawrence-cj commented 3 months ago

It's strange. If you load the file in "niklasku/PixArt-Sigma-XL-2-1024-MS":

from safetensors.torch import load_file
d = load_file('PixArt-Sigma-XL-2-1024-MS/diffusion_pytorch_model.safetensors')

Plenty of the layers are just all zeros? (image)
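For anyone who wants to repeat this check on their own copy of the weights, here is a minimal sketch that lists every tensor in a safetensors file whose elements are all zero. The function names `find_all_zero_tensors` and `report_zero_layers` are hypothetical helpers, not part of safetensors or diffusers. The check relies only on each tensor exposing `.any()`, which torch tensors do.

```python
from typing import List, Mapping


def find_all_zero_tensors(state_dict: Mapping[str, object]) -> List[str]:
    """Return the sorted names of tensors in which every element is zero.

    Works with any value exposing `.any()` (e.g. torch tensors, NumPy arrays).
    """
    zero_names = []
    for name, tensor in state_dict.items():
        # `.any()` is truthy as soon as a single element is non-zero.
        if not bool(tensor.any()):
            zero_names.append(name)
    return sorted(zero_names)


def report_zero_layers(path: str) -> List[str]:
    """Load a .safetensors file and print the names of all-zero tensors."""
    # Imported lazily so the helper above stays usable without safetensors installed.
    from safetensors.torch import load_file

    state_dict = load_file(path)  # dict: parameter name -> torch.Tensor
    zero_names = find_all_zero_tensors(state_dict)
    print(f"{len(zero_names)} of {len(state_dict)} tensors are all zeros")
    for name in zero_names:
        print("  ", name)
    return zero_names
```

Calling `report_zero_layers('PixArt-Sigma-XL-2-1024-MS/diffusion_pytorch_model.safetensors')` would run it on the file loaded above. An all-zero tensor is only a hint: some parameters, such as zero-initialized biases, can legitimately be all zeros, so it helps to compare the list against a checkpoint known to work.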

Beinsezii commented 3 months ago

Padding?

lawrence-cj commented 3 months ago

I re-uploaded a new safetensors file to Hugging Face: https://huggingface.co/PixArt-alpha/PixArt-Sigma-XL-2-1024-MS/tree/main. You can give it a try.

Beinsezii commented 3 months ago

Yes, that one works. @chenxwh should try again.

What ended up being the cause?

It might be worth investigating the 512px diffusers weights as well. My colleague said he was having issues with them too. Maybe they need the same fix?

lawrence-cj commented 3 months ago

Some layers in the previous checkpoint were broken. I have re-uploaded all of the 256px, 512px and 1024px checkpoints. Let me know if the 512px checkpoint is also working well now.