PixArt-alpha / PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
https://pixart-alpha.github.io/PixArt-sigma-project/
GNU Affero General Public License v3.0

support for webdataset format #119

Closed jsg921019 closed 6 days ago

jsg921019 commented 1 week ago

Thanks for the great work. I have a question about the dataset format. I noticed that in the training script, the data is loaded from a JSON file. Have you tried loading from the webdataset format (tar files)? How did you handle a ~10M-scale dataset?

lawrence-cj commented 6 days ago

We have an internally released version that supports webdataset-format data. What do you mean by "how to handle a 10M-scale dataset"?

jsg921019 commented 6 days ago

I meant: how do you train with a dataset as large as 10M? So I assume you have trained with webdataset using the internally released code? I have another question. Is there any specific reason why grad clip is set very low (0.01)?

lawrence-cj commented 6 days ago

> So I assume you have trained with webdataset using the internally released code?

We trained with both the dataset format used in the released code and the webdataset format, and got similar results.
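
For readers unfamiliar with the format: a webdataset shard is an ordinary tar file in which consecutive members that share a basename (for example `000001.jpg` and `000001.json`) form one training sample. The sketch below illustrates that layout using only the Python standard library. It is not the internal loader mentioned above, and the helper names `write_shard` and `iter_shard` as well as the `jpg`/`json` field names are hypothetical choices for this example.

```python
import io
import json
import tarfile


def write_shard(path, samples):
    """Write (key, {ext: bytes}) pairs to a tar shard as members named key.ext."""
    with tarfile.open(path, "w") as tar:
        for key, fields in samples:
            for ext, payload in fields.items():
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))


def iter_shard(path):
    """Yield one dict per sample by grouping consecutive members with the same key.

    Assumes flat member names without directories; the key is everything
    before the first dot, the field name is everything after it.
    """
    with tarfile.open(path, "r") as tar:
        current_key, current = None, {}
        for member in tar:
            if not member.isfile():
                continue
            key, _, ext = member.name.partition(".")
            if key != current_key and current:
                # A new key starts, so the previous sample is complete.
                yield current
                current = {}
            current_key = key
            current["__key__"] = key
            current[ext] = tar.extractfile(member).read()
        if current:
            yield current
```

Because samples are read sequentially from a set of shards, a ~10M-sample dataset can be split into many tar files and streamed, instead of being indexed through one large JSON file. In practice the `webdataset` Python package provides this kind of reader for PyTorch training.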

> I have another question. Is there any specific reason why grad clip is set very low? (0.01)

It is set low for people who get unstable results (NaN) during training. If you don't have that issue, feel free to increase it to 0.1 or 1. Both are OK.
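
To make the effect of the threshold concrete, here is a minimal pure-Python sketch of gradient clipping by global norm. It assumes the grad clip value acts as a global-norm threshold in the same way as `torch.nn.utils.clip_grad_norm_`; the thread does not confirm which clipping call the training script uses, and the function name `clip_by_global_norm` is hypothetical.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Rescale a flat list of gradient values so their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Gradients already within the threshold are left unchanged (scale == 1).
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm


grads = [3.0, 4.0]  # global norm is 5.0
tight, _ = clip_by_global_norm(grads, max_norm=0.01)  # shrunk to norm ~0.01
loose, _ = clip_by_global_norm(grads, max_norm=10.0)  # left unchanged
```

With a threshold of 0.01, almost every update is rescaled to a very small norm, which limits how far a single bad batch can move the weights. With 0.1 or 1, only unusually large gradients are rescaled.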