Doubiiu / DynamiCrafter

[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
https://doubiiu.github.io/projects/DynamiCrafter/
Apache License 2.0

Questions about interpolation #104

Closed (volcverse closed this issue 3 months ago)

volcverse commented 3 months ago

Hello, thanks for the great work!

I noticed that there are some inconsistencies between the training and inference phases of interpolation. During training, the first and last c_concat are set to the start and end frames respectively, while the rest are set to 0. However, during inference, the first half of c_concat and the second half of c_concat are set to the start and end frames respectively. Does this inconsistency affect the results? Why not unify the training and inference processes?
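To make the two layouts in the question concrete, here is a minimal sketch of both. It is an illustration only, not the repository's code: the function names are hypothetical, the `(C, T, H, W)` shape is an assumption, and NumPy arrays stand in for the latent tensors.

```python
import numpy as np


def cond_first_last(start, end, num_frames):
    """Layout described for training: frame 0 holds the start frame,
    frame T-1 holds the end frame, every frame in between is zero.

    start, end: arrays of shape (C, H, W) (assumed layout).
    Returns an array of shape (C, T, H, W).
    """
    c, h, w = start.shape
    cond = np.zeros((c, num_frames, h, w), dtype=start.dtype)
    cond[:, 0] = start
    cond[:, -1] = end
    return cond


def cond_half_half(start, end, num_frames):
    """Layout the question attributes to inference: the first half of the
    frames repeat the start frame, the second half repeat the end frame.
    """
    c, h, w = start.shape
    cond = np.empty((c, num_frames, h, w), dtype=start.dtype)
    half = num_frames // 2
    cond[:, :half] = start[:, None]  # broadcast over the time axis
    cond[:, half:] = end[:, None]
    return cond


if __name__ == "__main__":
    start = np.ones((4, 8, 8), dtype=np.float32)
    end = np.full((4, 8, 8), 2.0, dtype=np.float32)
    a = cond_first_last(start, end, num_frames=16)
    b = cond_half_half(start, end, num_frames=16)
    # Count how many frames carry a non-zero condition in each layout.
    print((a != 0).any(axis=(0, 2, 3)).sum(), (b != 0).any(axis=(0, 2, 3)).sum())
```

The two layouts agree only on the first and last frames, so a model trained on one would see a different conditioning signal on every intermediate frame if fed the other.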

Any response will be greatly appreciated!

Doubiiu commented 3 months ago

Hi, I think training and inference are consistent: in both, "the first and last c_concat are set to the start and end frames respectively, while the rest are set to 0." Can you point to the code where "the first half of c_concat and the second half of c_concat are set to the start and end frames respectively"? Thanks! I'm not sure whether there is a bug or typo.

volcverse commented 3 months ago

Hi there, thanks for your timely response!

I dug deeper into the inference process and found that indeed only the first and last frames are used.

Here is the code where "the first half of c_concat and the second half of c_concat are set to the start and end frames respectively". I checked the entire inference path. Yes, training and inference are indeed consistent :)

Again, thanks for sharing the great work!