Picsart-AI-Research / Text2Video-Zero

[ICCV 2023 Oral] Text-to-Image Diffusion Models are Zero-Shot Video Generators
https://text2video-zero.github.io/

the first generated image of each chunk is a computational waste #57

Open joaomede opened 1 year ago

joaomede commented 1 year ago

https://github.com/Picsart-AI-Research/Text2Video-Zero/blob/fd76734e06aadb1dee83d2b7a368b14bdaa35565/model.py#L123

Currently, the first image generated while processing each chunk is wasted: it is computed and then discarded. Would it be possible to avoid processing it? When several chunks are processed, the first image of every chunk is always identical. Scene changes apply only from the second image onward. It would be worthwhile to avoid the computational cost of that first image, if possible.
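To make the cost concrete, here is a simplified sketch of the chunking pattern as I understand it from the linked line. It is not the repository's code: `chunk_frame_ids` and `count_discarded` are hypothetical helpers, and the assumption is that frame 0 is prepended to every chunk and its output is sliced off afterwards.

```python
def chunk_frame_ids(num_frames, chunk_size):
    """Yield the frame indices processed in each chunk.

    Assumption: every chunk holds frame 0 plus up to chunk_size - 1 new
    frames, mirroring the pattern described in this issue.
    """
    step = chunk_size - 1  # new frames contributed by each chunk
    for start in range(0, num_frames, step):
        end = min(start + step, num_frames)
        yield [0] + list(range(start, end))


def count_discarded(num_frames, chunk_size):
    """Count the prepended frame-0 copies whose outputs are thrown away."""
    return sum(1 for _ in chunk_frame_ids(num_frames, chunk_size))


if __name__ == "__main__":
    for ids in chunk_frame_ids(num_frames=8, chunk_size=4):
        # ids[0] is the repeated anchor frame; only ids[1:] are kept.
        print("processed:", ids, "kept:", ids[1:])
    print("discarded images:", count_discarded(8, 4))
```

With 8 frames and a chunk size of 4, this processes 11 images to keep 8, so roughly one image per chunk is discarded. If the repeated frame is there so the other frames in the chunk can attend to it (my guess, not confirmed), skipping it would probably require caching whatever the first frame contributes instead of recomputing it.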