leoShen917 closed this issue 10 months ago
Hello, thanks for your interest in our work! As elaborated in our report, the diffusion latent is optimized continually (e.g., for 40 or 60 gradient steps) to move the handle points towards the target points. The video is therefore generated by saving and denoising all the intermediate latents produced during the optimization process. In this way, we can visualize a "trajectory" of how the source image moves towards the target image.
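To make the "save every intermediate latent, then denoise each one" idea concrete, here is a minimal toy sketch. It is not the DragDiffusion code: `optimize_and_record`, `render_trajectory`, `toy_grad`, and `toy_decode` are hypothetical names, the latent is a plain list of floats, and the "denoising" step is a stand-in function. Only the control flow (snapshot after every gradient step, one frame per snapshot) mirrors the description above.

```python
from typing import Callable, List, Sequence, Tuple

Latent = List[float]


def optimize_and_record(
    latent: Sequence[float],
    grad_fn: Callable[[Latent], Latent],
    n_steps: int,
    lr: float,
) -> List[Latent]:
    """Run n_steps gradient steps and keep a copy of the latent after each one."""
    current = list(latent)
    snapshots = [list(current)]  # include the starting latent as frame 0
    for _ in range(n_steps):
        grad = grad_fn(current)
        current = [x - lr * g for x, g in zip(current, grad)]
        snapshots.append(list(current))
    return snapshots


def render_trajectory(
    snapshots: List[Latent],
    decode_fn: Callable[[Latent], Tuple[float, ...]],
) -> List[Tuple[float, ...]]:
    """Turn every saved latent into one output frame (stand-in for denoising)."""
    return [decode_fn(z) for z in snapshots]


# Toy problem (assumption): a quadratic loss pulling the latent towards `target`.
target = [1.0, -2.0]


def toy_grad(z: Latent) -> Latent:
    return [2.0 * (a - b) for a, b in zip(z, target)]


def toy_decode(z: Latent) -> Tuple[float, ...]:
    return tuple(round(v, 3) for v in z)


snapshots = optimize_and_record([0.0, 0.0], toy_grad, n_steps=40, lr=0.1)
frames = render_trajectory(snapshots, toy_decode)
print(len(frames), frames[0], frames[-1])
```

In a real pipeline the frames would be images written out as a video; here they are just rounded tuples so the sketch stays self-contained.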
Sometimes, when the target points are too far away from the handle points, you need to set the parameter "number of pixel steps" to a larger value (e.g., 80 to 100).
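A rough intuition for why a larger step count can help with distant targets, as a toy sketch only: it assumes the handle point advances by at most a fixed number of pixels per optimization step, which is a simplification and not a statement about the actual update rule. `remaining_distance` and `max_move` are hypothetical names.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def remaining_distance(
    handle: Point, target: Point, n_steps: int, max_move: float = 1.0
) -> float:
    """Move `handle` towards `target` by at most `max_move` pixels per step
    (toy assumption) and return the distance left after `n_steps` steps."""
    x, y = handle
    for _ in range(n_steps):
        dx, dy = target[0] - x, target[1] - y
        dist = math.hypot(dx, dy)
        if dist <= max_move:
            x, y = target  # close enough: land on the target and stay there
        else:
            x += max_move * dx / dist
            y += max_move * dy / dist
    return math.hypot(target[0] - x, target[1] - y)


# A target 60 pixels away: 40 steps fall short, 80 steps are enough.
short_run = remaining_distance((0.0, 0.0), (60.0, 0.0), n_steps=40)
long_run = remaining_distance((0.0, 0.0), (60.0, 0.0), n_steps=80)
print(short_run, long_run)
```

Under this assumption, a single run with too few steps simply stops partway along the trajectory, which matches the symptom of the handle not reaching the target.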
Congrats on your great work, and thank you for releasing the code. I am trying to replicate the results shown on the project page.
https://github.com/Yujun-Shi/DragDiffusion/assets/83259959/146004d8-82d1-4eb4-88c8-9fa69fe6d3cd
I'm curious whether this was generated in a single run or over multiple iterations, as I've found that a single run often fails to reach the target point.
Looking for more guidance on the technical details, thanks again!