PixArt-alpha / PixArt-sigma

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
https://pixart-alpha.github.io/PixArt-sigma-project/
GNU Affero General Public License v3.0

Can this model be used directly to generate video, or do we need to train from scratch? #130

Open foreverpiano opened 1 week ago

foreverpiano commented 1 week ago

@lawrence-cj

lawrence-cj commented 1 week ago

You can train a video generation model starting from Sigma. Some existing works already do this.

foreverpiano commented 1 week ago

Like Open-Sora, by adding a time dimension? The current version is still a 1D version.

lawrence-cj commented 1 week ago

Yes. What else do you plan to do besides adding the 1D temporal dimension?

foreverpiano commented 1 week ago

I don't really understand how to go from 1D to 2D easily. So is the main thing to add a time embedding during fine-tuning?
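
For illustration only, here is a minimal sketch of what "adding a 1D temporal dimension" can mean in a spatial-temporal transformer. It is not taken from the PixArt-Σ or Open-Sora code. It assumes a video is flattened frame by frame into a token sequence, that the pretrained image layers keep attending within each frame, and that a newly added temporal layer attends across frames at the same patch location after a sinusoidal embedding of the frame index is added. All function names are hypothetical, and plain Python is used so that only the indexing is shown.

```python
import math


def temporal_sincos_embedding(num_frames, dim):
    """Sinusoidal embedding of the frame index; one row of length `dim` per frame."""
    assert dim % 2 == 0, "dim must be even (half sine, half cosine)"
    half = dim // 2
    # Geometric frequency schedule, as in standard sinusoidal position embeddings.
    freqs = [10000.0 ** (-i / half) for i in range(half)]
    table = []
    for t in range(num_frames):
        sines = [math.sin(t * f) for f in freqs]
        cosines = [math.cos(t * f) for f in freqs]
        table.append(sines + cosines)
    return table


def spatial_groups(num_frames, tokens_per_frame):
    """Token indices that attend to each other in a per-frame (spatial) layer."""
    return [
        [t * tokens_per_frame + s for s in range(tokens_per_frame)]
        for t in range(num_frames)
    ]


def temporal_groups(num_frames, tokens_per_frame):
    """Token indices that attend to each other in the added temporal layer:
    the same patch location followed across all frames."""
    return [
        [t * tokens_per_frame + s for t in range(num_frames)]
        for s in range(tokens_per_frame)
    ]


if __name__ == "__main__":
    # Toy case: 2 frames, 3 patch tokens per frame, 4-dim temporal embedding.
    print(spatial_groups(2, 3))
    print(temporal_groups(2, 3))
    print(temporal_sincos_embedding(2, 4)[0])
```

Under these assumptions, the spatial grouping matches what the image model already does for a single frame, so those weights can be reused. The temporal grouping and the frame embedding are the new parts that fine-tuning would have to learn.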