Closed. Dorniwang closed this issue 6 months ago.
From Sec. 4.2 of the paper, it seems that latent partitioning, which improves quality by reducing the gap between training and inference, increases the number of denoising steps. So do we need multiple GPUs to speed things up, or do we just get a slower inference process?

Yes, since latent partitioning requires more computation to generate one frame than vanilla diagonal denoising, you can either use multiple GPUs or accept slower inference. However, latent partitioning with n=4 uses 64 inference steps (16×4), which is not slower than the original video diffusion models (they typically use 50 to 150 inference steps). In fact, it is much faster than VDMs when multiple GPUs are used.

Got it, thanks.
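As a minimal arithmetic sketch of the step-count comparison discussed in this thread (the numbers 16, n=4, and the 50-150 range come from the answer above; the variable names are purely illustrative):

```python
# Step-count comparison: latent partitioning vs. a vanilla video
# diffusion model (VDM). Numbers are taken from the discussion above.

steps_per_partition = 16   # denoising steps per partition
num_partitions = 4         # n = 4 in the paper's notation

# Latent partitioning with n = 4: 16 x 4 = 64 total inference steps.
latent_partitioning_steps = steps_per_partition * num_partitions
print(latent_partitioning_steps)  # 64

# Typical VDMs use roughly 50 to 150 inference steps, so 64 steps
# falls within (not above) that range.
vanilla_vdm_step_range = (50, 150)
assert vanilla_vdm_step_range[0] <= latent_partitioning_steps <= vanilla_vdm_step_range[1]
```

This is just the bookkeeping behind the claim: the per-frame cost goes up by a factor of n, but the total step count stays comparable to an ordinary VDM sampler, and the extra work parallelizes across GPUs.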