songweige / TATS

Official PyTorch implementation of TATS: A Long Video Generation Framework with Time-Agnostic VQGAN and Time-Sensitive Transformer (ECCV 2022)

Way to generate future frames (inference) conditioned on one or a few initial frames of a video sequence #18

Closed: horn-video closed this issue 1 year ago

horn-video commented 1 year ago

Hi, first of all, congratulations on this work! My question is simple: take, for example, the UCF101 dataset. If I randomly select a sequence from the CleanAndJerk category, can I generate future frames for this specific sequence using your model? If so, what changes should I make in the `sample_vqgan_transformer_short_videos.py` file? I appreciate your time and help :)

songweige commented 1 year ago

Hi,

Thank you for your question! That's actually possible and interesting, but I have never tried it! I think it would be more straightforward to do by modifying the `sample_vqgan_transformer_long_videos.py` file.

So what you may need to do is replace the first few generated latent frames with the latents of the real frames obtained from the encoder here: https://github.com/SongweiGe/TATS/blob/8ea1b587a74736d420b70cc2b52ac1683682ec6c/scripts/sample_vqgan_transformer_long_videos.py#L126
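
A minimal sketch of that splice in latent-code space, assuming the conditioning frames have already been encoded to discrete VQGAN code indices. The callable `sample_next_token` and the tensor shapes here are illustrative assumptions, not the exact TATS API; in the actual script the codes would come from the VQGAN encoder and the sampling step from the time-sensitive transformer:

```python
import torch

@torch.no_grad()
def sample_conditioned_on_real_frames(real_codes, sample_next_token,
                                      total_latent_frames):
    """Autoregressively extend a video in latent-code space.

    real_codes: LongTensor (B, T_real, H, W) of discrete code indices
        obtained by encoding the real conditioning frames (assumed shape).
    sample_next_token: hypothetical callable taking the flat prefix
        (B, N) of code indices and returning the next index (B,).
    total_latent_frames: number of latent frames in the final video.
    """
    b, t_real, h, w = real_codes.shape
    tokens_per_frame = h * w
    # Seed the sequence with the real frames' codes instead of sampling
    # them -- the "replace the first few generated latent frames" step
    # from the comment above.
    seq = real_codes.reshape(b, t_real * tokens_per_frame)
    total_tokens = total_latent_frames * tokens_per_frame
    while seq.shape[1] < total_tokens:
        next_tok = sample_next_token(seq)           # (B,)
        seq = torch.cat([seq, next_tok[:, None]], dim=1)
    # Back to (B, T, H, W) so the VQGAN decoder can reconstruct frames.
    return seq.reshape(b, total_latent_frames, h, w)

# Toy usage with a random stand-in sampler -- replace with the
# transformer's sampling step in the real script:
codes = torch.randint(0, 1024, (1, 4, 16, 16))    # 4 real latent frames
rand_next = lambda seq: torch.randint(0, 1024, (seq.shape[0],))
full = sample_conditioned_on_real_frames(codes, rand_next, 16)
```

In `sample_vqgan_transformer_long_videos.py`, the equivalent change would likely be overwriting the first entries of the sampled index tensor around the linked line before decoding; the exact tensor names depend on the script.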

horn-video commented 1 year ago

Thanks a lot! The suggestion worked.