sihyun-yu / PVDM

Official PyTorch implementation of Video Probabilistic Diffusion Models in Projected Latent Space (CVPR 2023).
https://sihyun.me/PVDM
MIT License

A general question regarding VideoGPT #13

Closed: dialuser closed this issue 1 year ago

dialuser commented 1 year ago

Hi, your paper shows that PVDM beats VideoGPT by a large margin. I wonder if you can offer more insight. VideoGPT also uses a two-step process: first training a VQ-VAE, and then end-to-end autoregression. Do you think the main difference lies in the diffusion part? Thanks.

sihyun-yu commented 1 year ago

There are two main differences: (1) we use triplane latents of the cubic video tensor, and (2) we use a strong diffusion model.
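
To make point (1) a bit more concrete, here is a rough sketch that only compares how many latent values a cubic latent grid and a triplane factorization would hold. It assumes a latent of shape (C, T, H, W) that is represented by three 2D planes (H x W, T x W, T x H). The function names and the example shape are illustrative assumptions, not taken from the PVDM code, and the sketch says nothing about how the planes are actually learned.

```python
# Illustrative size comparison only; names and shapes are hypothetical,
# not taken from the PVDM or VideoGPT codebases.

def cubic_latent_size(c, t, h, w):
    """Number of values in a cubic latent grid of shape (c, t, h, w)."""
    return c * t * h * w


def triplane_latent_size(c, t, h, w):
    """Number of values in three 2D planes: (h, w), (t, w), and (t, h)."""
    return c * (h * w + t * w + t * h)


if __name__ == "__main__":
    # Assumed example latent shape: 4 channels, 16 frames, 32 x 32 spatial.
    c, t, h, w = 4, 16, 32, 32
    cubic = cubic_latent_size(c, t, h, w)
    triplane = triplane_latent_size(c, t, h, w)
    print(f"cubic latent values:    {cubic}")
    print(f"triplane latent values: {triplane}")
    print(f"reduction factor:       {cubic / triplane:.1f}x")
```

Under these assumed shapes the triplane form holds far fewer values than the cubic grid, which gives one intuition for why a diffusion model over 2D planes may be cheaper to train than a model over a full 3D latent.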