Little-Podi closed 9 months ago
Thanks for your question, and happy Mid-Autumn Festival! We follow the resolution settings used by ZeroScope so that comparisons are easy and unbiased. Additionally, the LDM's upsampling adapts well to minor changes in aspect ratio.
Moreover, fine-tuning the LDM for a slight change in aspect ratio or size is quite fast (about 4000-6000 steps).
I see. Thanks for your detailed answer.
Happy Mid-Autumn Festival, and congrats on your insightful work. I have a minor question about the resolution. The pixel-based VDM generates videos with a frame size of $256\times160$, while the LDM upsamples the frames to $576\times320$. Does this mean the aspect ratio changes? I am just curious why the LDM does not operate at $512\times320$.
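
For reference, a quick check of the arithmetic behind the question. This is only a minimal sketch: the helper name `aspect_ratio` is made up for illustration and is not taken from the repository.

```python
from fractions import Fraction


def aspect_ratio(width: int, height: int) -> Fraction:
    """Return width / height as an exact, reduced fraction (hypothetical helper)."""
    return Fraction(width, height)


vdm = aspect_ratio(256, 160)  # pixel-based VDM output
ldm = aspect_ratio(576, 320)  # LDM upsampled output
alt = aspect_ratio(512, 320)  # the size that would preserve the VDM ratio

print(vdm, ldm, alt)          # 8/5 9/5 8/5

# Per-axis scale factors from 256x160 to 576x320.
width_scale = Fraction(576, 256)   # 9/4, i.e. 2.25x
height_scale = Fraction(320, 160)  # 2, i.e. 2x
print(width_scale, height_scale)   # 9/4 2
```

So the upsampling goes from 8:5 to 9:5, stretching the width by 2.25x but the height by only 2x, whereas $512\times320$ would keep the 8:5 ratio with a uniform 2x scale.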