Dear authors, thanks for your awesome work. Does the model support joint training on images and videos? This is very important for improving the quality of video generation. Also, does the model support compression along the temporal dimension?