mattiasxu / Video-VQVAE

VQVAE for video prediction
MIT License
26 stars 7 forks

Control the number of images output by the VQVAE model #1

Closed Scienceseb closed 2 years ago

Scienceseb commented 3 years ago

Hi, thanks a lot for this implementation!

I have a question: I want to control the number of images output by the VQVAE model, but your VQVAE model doesn't take the number of outputs as an argument. Is your code just VQ-VAE for the moment, or is it Video VQ-VAE?

Thanks a lot!

mattiasxu commented 3 years ago

Hi, thanks for checking it out! So let's see if I understood correctly...

It should be Video VQ-VAE, but I suppose it could also be used on all 3D image data. It takes video input in the form (channels, time, width, height).

Currently, only the video compression part of "Predicting Video with VQVAE" is implemented. Later I will implement the autoregressive model to predict/generate video in the latent space.

By output, do you mean the compressed representation it creates in latent space? Its size can be tuned by adjusting the kernel_sizes and padding of the 3D convolutions. If not, what is implemented just compresses and then decompresses the video, so the input and output shapes are the same.
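To make the point about kernel sizes and padding concrete, here is a rough sketch of how the latent shape follows from the convolution settings. It uses the standard output-size formula for a convolution (the one given in the PyTorch `Conv3d` docs), which also depends on stride. The helper names `conv_out_size` and `latent_shape` and the two-layer encoder below are made up for illustration and are not taken from this repo.

```python
def conv_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one axis of a convolution (standard formula)."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def latent_shape(input_shape, layers):
    """Apply each layer's settings to every axis of (time, width, height).

    Each layer is a dict of keyword arguments for conv_out_size; the same
    value is used on all three axes to keep the sketch short.
    """
    shape = tuple(input_shape)
    for layer in layers:
        shape = tuple(conv_out_size(s, **layer) for s in shape)
    return shape


# Hypothetical encoder: two stride-2 convolutions (not the repo's actual config).
layers = [
    {"kernel_size": 4, "stride": 2, "padding": 1},
    {"kernel_size": 4, "stride": 2, "padding": 1},
]

# A 16-frame 64x64 clip shrinks by a factor of 4 along every axis.
print(latent_shape((16, 64, 64), layers))  # -> (4, 16, 16)
```

With kernel_size=3, stride=1, padding=1 the formula leaves every axis unchanged, which is a quick way to check which layers actually downsample.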

Scienceseb commented 3 years ago

Thanks a lot for your lightning-fast reply! By output I mean the output of the autoregressive model that predicts/generates video, so I think I will have to wait for you to implement that part, haha! I also want to use your implementation to predict video sequences like in the original paper: input 8 frames and output 4 or 16 frames, for example.
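
For reference, the data side of that setup could look roughly like the sketch below: a sliding window that pairs 8 context frames with the 4 frames that follow. This only covers how clips would be split for training or evaluation, not the prediction model itself. The function name `split_context_target` and its parameters are hypothetical and not part of this repo, and integers stand in for frames.

```python
def split_context_target(frames, n_context=8, n_target=4):
    """Return (context, target) pairs from a frame sequence via a sliding window."""
    window = n_context + n_target
    pairs = []
    for start in range(len(frames) - window + 1):
        context = frames[start:start + n_context]
        target = frames[start + n_context:start + window]
        pairs.append((context, target))
    return pairs


# Integers stand in for frames; a 14-frame clip yields 3 overlapping pairs.
frames = list(range(14))
pairs = split_context_target(frames)
print(len(pairs))   # -> 3
print(pairs[0][1])  # -> [8, 9, 10, 11]
```

Changing `n_target` to 16 gives the longer-horizon variant, as long as each clip has at least 24 frames.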