Closed (jmliu206 closed this 1 year ago)
Actually, the current codebase can be used for videos of any aspect ratio. I crop the videos to 1:2 only because I want the spatial size of the embedding to be as small as 1x2. For example, given a 1080x1920 video, you can use the strides list '5 4 3 2' (total downsampling factor 120) to get an embedding of spatial size 9x16, or the strides '5 3 2 2' (factor 60) to get an embedding of 18x32. I found that for videos without significant dynamic scenes, a tiny embedding gives better representation efficiency (i.e., better reconstruction for a given number of parameters).
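To check a strides list before training, here is a rough sketch of the arithmetic behind the two examples above. It assumes the embedding's spatial size is simply the frame size divided by the product of the strides, which matches the numbers quoted here but is not taken from the repo's code. `embed_size` is a hypothetical helper, not a function in the codebase.

```python
from math import prod


def embed_size(height, width, strides):
    """Return the (height, width) of the embedding for a given frame size.

    Assumption: the total upsampling factor is the product of the strides,
    so the embedding is the frame size divided by that product.
    """
    factor = prod(strides)
    if height % factor or width % factor:
        raise ValueError(
            f"{height}x{width} is not divisible by the total stride {factor}"
        )
    return height // factor, width // factor


print(embed_size(1080, 1920, [5, 4, 3, 2]))  # (9, 16)
print(embed_size(1080, 1920, [5, 3, 2, 2]))  # (18, 32)
```

If the helper raises, the frame size is not a multiple of the total stride, so you would need a different strides list or a resize/crop of the input.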
Thanks for your great work. It seems that you crop the images before feeding them into the network, so the output images also have the cropped size. Is there a way to recover the size of the original images? The current code seems to be able to process only videos with a 1:2 aspect ratio.