PixArt-alpha / PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
https://pixart-alpha.github.io/PixArt-sigma-project/
GNU Affero General Public License v3.0

[Script] Images folder convert script to data_info.json #57

Closed: frutiemax92 closed this 1 month ago

frutiemax92 commented 2 months ago

This script converts a folder of images and captions into the expected folder structure, including the data_info.json file. It copies each image into the InternImgs folder under an indexed file name that keeps the original extension. It also works recursively, i.e. you can put multiple dataset folders in the root folder.

There is also an optional argument, --caption_extension, which defaults to .txt; users can change it if they wish.

I thought this would be a useful script, as I am more used to the other folder structure (images with caption files next to them).
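For readers who want a concrete picture of the conversion described above, here is a minimal sketch. It is not the script from this PR. The `InternData` folder name, the `path` and `prompt` keys, and the set of image extensions are assumptions made for illustration; the real data_info.json may need further fields (for example image dimensions), which are left out here.

```python
import json
import shutil
from pathlib import Path

# Assumed set of image extensions; adjust to your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def convert(root, out, caption_extension=".txt"):
    """Copy captioned images found under `root` into `out`/InternImgs and
    write a data_info.json describing them. Returns the list of entries."""
    root, out = Path(root), Path(out)
    img_dir = out / "InternImgs"
    # Assumption: data_info.json lives in an InternData folder next to InternImgs.
    info_path = out / "InternData" / "data_info.json"
    img_dir.mkdir(parents=True, exist_ok=True)
    info_path.parent.mkdir(parents=True, exist_ok=True)

    entries = []
    # rglob walks nested dataset folders; sorting keeps the indices reproducible.
    for src in sorted(root.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # caption_extension must start with a dot, e.g. ".txt" or ".caption".
        caption_file = src.with_suffix(caption_extension)
        if not caption_file.is_file():
            continue  # skip images that have no caption file
        caption = caption_file.read_text(encoding="utf-8").strip()
        # Indexed file name that keeps the original extension.
        dst_name = f"{len(entries)}{src.suffix}"
        shutil.copy2(src, img_dir / dst_name)
        # Hypothetical entry layout: only the copied file name and its caption.
        entries.append({"path": dst_name, "prompt": caption})

    info_path.write_text(
        json.dumps(entries, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return entries
```

Calling `convert("my_datasets", "converted")` would process every captioned image under `my_datasets`, and passing `caption_extension=".caption"` mirrors what the --caption_extension option is described as doing.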

lawrence-cj commented 2 months ago

Pretty good and useful script. Thanks a lot. Could you add a how-to-use section to the README? @frutiemax92

Radtoo commented 1 month ago

The people who are most likely to train PixArt-Sigma tend to have SDXL-structured training data (image + .txt caption). A script like this should be officially included and documented. Otherwise, maybe train.py itself could support SDXL-structured training data?

However, I think the empty sharegpt4v values it generates currently trigger an assertion error.

lawrence-cj commented 1 month ago

Really nice work. Thank you so much for your PR. 🥰 @frutiemax92