Doubiiu / DynamiCrafter

[ECCV 2024] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Apache License 2.0

Inquiry about the Role of use_dynamic_rescale in Training vs. Inference #71

Open gracezhao1997 opened 2 months ago

gracezhao1997 commented 2 months ago

Thanks for the great work! I am currently delving into the functionality of the use_dynamic_rescale parameter in your project and have encountered a point of confusion that I hope you can clarify.

It appears that during training, use_dynamic_rescale is applied to the noisy input x_t (x = x * extract_into_tensor(self.scale_arr, t, x.shape)), whereas during inference the rescaling is applied to the predicted x0 (prev_scale_t = torch.full(size, self.ddim_scale_arr_prev[index], device=device)). This apparent mismatch, where the adjustment is made to the inputs during training but to the predictions at inference time, raises questions about how the training and inference processes stay aligned. Is there any reference for this strategy?
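
For concreteness, here is a minimal sketch of the two code paths as I currently understand them. The schedule values, the extract_into_tensor helper, and the inference-side pieces (ddim_scale_arr_prev, size, index) are placeholders I filled in for illustration, not the actual values or exact code from the repo.

```python
import torch

def extract_into_tensor(arr, t, x_shape):
    # Gather per-timestep scales and broadcast them over x's non-batch dims.
    out = arr.to(t.device).gather(0, t)
    return out.reshape(t.shape[0], *((1,) * (len(x_shape) - 1)))

T = 1000
# Placeholder schedule: the real scale_arr in the repo may differ.
scale_arr = torch.linspace(1.0, 0.7, T)

# Training path (as quoted above): the noisy input x_t is rescaled before
# being fed to the denoiser.
def rescale_training_input(x, t):
    return x * extract_into_tensor(scale_arr, t, x.shape)

# Inference path (as quoted above): during DDIM sampling, a per-step scale
# is broadcast and applied to the predicted x0 rather than to the input.
ddim_scale_arr_prev = scale_arr  # placeholder; presumably subsampled to the DDIM steps
def rescale_predicted_x0(pred_x0, index):
    # Singleton dims so the scale broadcasts over a (b, c, t, h, w) video latent.
    size = (pred_x0.shape[0], 1, 1, 1, 1)
    prev_scale_t = torch.full(size, ddim_scale_arr_prev[index].item(),
                              device=pred_x0.device)
    return pred_x0 * prev_scale_t
```

Walking through the DDIM update with this sketch, it is not obvious to me that scaling pred_x0 by the previous step's scale reproduces the distribution of scaled x_t that the model saw during training, which is the source of my confusion.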