Currently, the first image of each processed chunk is wasted: it is computed and then discarded. Would it be possible to avoid processing it? When several chunks are processed, that first image is always identical, because scene changes only apply from the second image onwards. It would be worthwhile to avoid the processing cost of that first image, if possible.
https://github.com/Picsart-AI-Research/Text2Video-Zero/blob/fd76734e06aadb1dee83d2b7a368b14bdaa35565/model.py#L123
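
To make the cost concrete, here is a minimal sketch of the chunking pattern described above. It assumes every chunk is built by prepending frame 0 to a slice of the video and that the output for that prepended frame is dropped afterwards. The function names `chunk_frame_ids` and `count_redundant_anchor_passes` are hypothetical and are not taken from the repository.

```python
def chunk_frame_ids(num_frames, chunk_size):
    """Yield the frame ids of each chunk, with frame 0 prepended (assumed layout)."""
    if chunk_size < 2:
        raise ValueError("chunk_size must leave room for the prepended frame")
    step = chunk_size - 1  # one slot per chunk is taken by the prepended frame 0
    starts = list(range(0, num_frames, step))
    for i, start in enumerate(starts):
        # The last chunk runs to the end of the video; others stop at the next start.
        end = num_frames if i == len(starts) - 1 else starts[i + 1]
        yield [0] + list(range(start, end))


def count_redundant_anchor_passes(num_frames, chunk_size):
    """Count how many processed frames are discarded copies of frame 0."""
    processed = sum(len(ids) for ids in chunk_frame_ids(num_frames, chunk_size))
    return processed - num_frames


if __name__ == "__main__":
    for ids in chunk_frame_ids(24, 8):
        print(ids)
    print("redundant passes:", count_redundant_anchor_passes(24, 8))
```

Under these assumptions, a 24-frame video with a chunk size of 8 is split into four chunks, so frame 0 is processed four extra times. If that prepended frame exists so that later frames can attend to it, skipping it would probably mean caching its intermediate features once and reusing them, rather than simply dropping it from the chunk.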