Closed alfredplpl closed 1 month ago
Thank you for your great code!
Currently, I'm trying to train a 0.6B PixArt-Sigma model from scratch on 30M images. After several hundred thousand training steps, cats still don't hold their shape. What could be the cause? Possible reasons include insufficient data, insufficient parameters, or a text encoder that isn't capable enough. Which do you think is most likely?
Hello, may I ask which training script you are using? I tried training from scratch before, but I gave up because of limited resources.
@Feynman1999 I modified the training code to cope with limited resources. For example, I train on 32 L4 GPUs because I can't get A100s or H100s. The key point is:
The model can be loaded on an L4, which has 24 GB of VRAM.
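As a rough sanity check on the 24 GB figure, here is a back-of-the-envelope sketch. It assumes fp32 weights, fp32 gradients, and an Adam-style optimizer with two fp32 states per parameter; none of these settings are stated in this thread. The helper name `training_state_gib` is made up for illustration, and activations and the frozen text encoder and VAE are not counted.

```python
def training_state_gib(n_params, bytes_weights=4, bytes_grads=4, bytes_optim=8):
    """Estimate memory (GiB) for weights + gradients + optimizer state.

    Defaults assume fp32 weights and gradients plus two fp32 optimizer
    states per parameter (Adam-style). These are assumptions, not the
    settings used in this thread.
    """
    total_bytes = n_params * (bytes_weights + bytes_grads + bytes_optim)
    return total_bytes / 2**30


# 0.6B trainable parameters, as mentioned in the first post
dit_gib = training_state_gib(0.6e9)
print(f"approx. training state: {dit_gib:.1f} GiB")
```

Under these assumptions the trainable state stays well below 24 GB, so the remaining budget goes to activations and the frozen components, which depend on batch size and resolution.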
I'm continuing to train the model. The cat in the inference result now seems to have an eye.
Perhaps you could try fine-tuning the official model on your dataset. I think your initial results were good, so if the images clearly broke down later on, the cause is probably a bug in the training code (for example, numerical overflow at low precision producing abnormal gradients, or data loading errors).
I increased the precision, and then I got high-quality images.
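To illustrate the kind of precision problem discussed above, here is a minimal sketch using only the Python standard library. It is not the training code from this thread: the gradient values and parameter names are invented, and `to_fp16` and `find_nonfinite` are hypothetical helpers. It shows how a value that is fine in higher precision overflows to infinity in IEEE half precision, and how a simple finiteness check can flag the affected parameters.

```python
import math
import struct


def to_fp16(x):
    """Round-trip a float through IEEE half precision; overflow becomes +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values too large for float16 (max is 65504)
        return math.copysign(math.inf, x)


def find_nonfinite(grads):
    """Return the names of parameters whose gradients contain inf or NaN."""
    return [
        name
        for name, values in grads.items()
        if not all(math.isfinite(v) for v in values)
    ]


# Hypothetical gradients; 1.2e5 exceeds the float16 range
grads_fp32 = {"attn.qkv": [1.2e5, -3.0], "mlp.fc1": [0.5, 0.25]}
grads_fp16 = {k: [to_fp16(v) for v in g] for k, g in grads_fp32.items()}

bad_fp32 = find_nonfinite(grads_fp32)
bad_fp16 = find_nonfinite(grads_fp16)
print("non-finite in higher precision:", bad_fp32)
print("non-finite in float16:", bad_fp16)
```

A check like this, run on real gradients every few hundred steps, is one way to tell a numerical problem apart from a data or capacity problem.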
The training succeeded. I will share the training details. Thank you.