PixArt-alpha / PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
https://pixart-alpha.github.io/PixArt-sigma-project/
GNU Affero General Public License v3.0

Training failed? #142

Closed. alfredplpl closed this issue 1 month ago

alfredplpl commented 2 months ago

Thank you for your great code!

I'm currently trying to train a 0.6B PixArt-Sigma model from scratch on 30M images. After several hundred thousand training steps, cats still don't hold their shape. What could be causing this? Possible reasons include insufficient data, too few parameters, or a text encoder that is not capable enough. Which do you think is most likely?

Feynman1999 commented 2 months ago

Hello, may I ask which training script you are using? I tried training from scratch before, but I gave up because of limited resources.

alfredplpl commented 2 months ago

@Feynman1999 I modified the training code to cope with limited resources. For example, I train on 32 L4 GPUs because I cannot get A100s or H100s. The key point is this:

The model fits on an L4, which has 24 GB of VRAM.
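
As a rough check on why a 0.6B-parameter model can fit in 24 GB, the sketch below counts only the weights, gradients, and optimizer state. It assumes AdamW with everything held in fp32 (16 bytes per parameter). The thread does not state the optimizer or dtype, and activations, the text encoder, and the VAE are ignored here, so real usage will be higher.

```python
# Rough VRAM estimate for model weights, gradients, and optimizer state only.
# Assumptions (not from the thread): AdamW, all tensors stored in fp32.
PARAMS = 0.6e9   # 0.6B parameters, as stated in the thread
VRAM_GB = 24     # L4 VRAM, as stated in the thread

BYTES_PER_PARAM = {
    "weights (fp32)": 4,
    "gradients (fp32)": 4,
    "Adam first moment (fp32)": 4,
    "Adam second moment (fp32)": 4,
}


def training_state_gb(num_params, bytes_per_param):
    """Return GB (10^9 bytes) needed for weights, gradients, and optimizer state."""
    return num_params * sum(bytes_per_param.values()) / 1e9


state_gb = training_state_gb(PARAMS, BYTES_PER_PARAM)
headroom_gb = VRAM_GB - state_gb
print(f"training state: {state_gb:.1f} GB, headroom: {headroom_gb:.1f} GB")
```

Under these assumptions the training state takes about 9.6 GB, leaving roughly 14.4 GB for activations and the other components, which is why batch size and resolution become the limiting factors on a card of this size.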

alfredplpl commented 2 months ago

I am continuing to train the model. The cat in the inference result now seems to have an eye.

image

Feynman1999 commented 2 months ago

Perhaps you could try fine-tuning the official model on your dataset. The early results looked good to me, so if obvious artifacts appear later in training, the cause is probably a bug in the training code, such as numeric overflow producing abnormal gradients or a data loading error.
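
To make the overflow point concrete, here is a minimal sketch of what happens when values leave the range of half precision. It assumes the training ran in fp16, which the thread does not confirm, and the gradient values are hypothetical, chosen only to show the two failure modes. It uses only the Python standard library to emulate an fp16 cast.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision.

    struct raises OverflowError for magnitudes too large for fp16; that
    case is mapped to +/-inf here to mimic a typical hardware cast.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


# Hypothetical gradient values, for illustration only.
grads = {"large": 1.0e5, "normal": 0.5, "tiny": 1.0e-8}
for name, g in grads.items():
    g16 = to_fp16(g)
    status = "ok" if math.isfinite(g16) and (g16 != 0.0 or g == 0.0) else "lost"
    print(f"{name}: {g!r} -> fp16 {g16!r} ({status})")
```

The large value overflows to infinity and the tiny one flushes to zero, while the ordinary value survives. Either loss can corrupt an update, which is one reason to check gradients for non-finite values or to train in a wider format.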

alfredplpl commented 2 months ago

I increased the numerical precision used in training. After that, I got high-quality images.

image

alfredplpl commented 1 month ago

image

The training succeeded. I will share the training details. Thank you.