Parallel Feature Extraction

PixArt-alpha / PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

https://pixart-alpha.github.io/PixArt-sigma-project/

GNU Affero General Public License v3.0


Parallel Feature Extraction #115

Closed: alfredplpl closed this issue 3 months ago

alfredplpl commented 3 months ago

This is an amazing project. Even I can train an image generation model from scratch with it.

However, the problem I'm currently facing is that feature extraction using the T5 encoder cannot be done in parallel. To extract features from 1.2M images, parallel extraction is essential. But with the current code, extraction can only be done with a single GPU.

How should I address this issue? Please advise.

lawrence-cj commented 3 months ago

We have start_index and end_index in the extraction script. They let you run several commands at the same time from a .sh file, each on a different slice of the data, which gives you the parallelism you are asking about.

https://github.com/PixArt-alpha/PixArt-sigma/blob/30c3297259da6544295168c14a67500f35397ab3/tools/extract_features.py#L307
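For readers who would rather drive the same idea from Python than from a .sh file, here is a minimal sketch of one way to split the data into one slice per GPU and start one extraction process per slice. It rests on several assumptions that are not confirmed in this thread: that start_index and end_index are exposed as the command-line flags --start_index and --end_index, that end_index is exclusive, and that selecting a GPU through CUDA_VISIBLE_DEVICES is enough for the script. The helper names shard_ranges, build_commands and launch are hypothetical, and any other arguments the script needs (data paths, model paths) are left out and would have to be appended.

```python
import os
import subprocess
import sys


def shard_ranges(num_items, num_shards):
    """Split [0, num_items) into num_shards contiguous half-open ranges."""
    base, extra = divmod(num_items, num_shards)
    ranges = []
    start = 0
    for shard in range(num_shards):
        # The first `extra` shards take one additional item each.
        end = start + base + (1 if shard < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def build_commands(num_items, num_gpus, script="tools/extract_features.py"):
    """Return one (env, argv) pair per GPU.

    Assumption: the flag spellings --start_index / --end_index match the
    script's argument parser; check them against the linked file first.
    """
    commands = []
    for gpu, (start, end) in enumerate(shard_ranges(num_items, num_gpus)):
        env = {"CUDA_VISIBLE_DEVICES": str(gpu)}  # pin this process to one GPU
        argv = [
            sys.executable, script,
            "--start_index", str(start),
            "--end_index", str(end),
        ]
        commands.append((env, argv))
    return commands


def launch(commands):
    """Start every command at once, wait for all, and return the exit codes."""
    procs = [
        subprocess.Popen(argv, env={**os.environ, **env})
        for env, argv in commands
    ]
    return [proc.wait() for proc in procs]


# Example (not executed here): 1.2M items over 8 GPUs.
# exit_codes = launch(build_commands(1_200_000, 8))
```

Contiguous slices keep each process independent, so a failed slice can be rerun alone with the same two indices.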

alfredplpl commented 3 months ago

Thank you! I will do that.