Open tnarek opened 2 months ago
I was also wondering about the same thing. It seems that Eq. 3 uses handle positions in pixel coordinates (not in normalized image coordinates in $(0,1)^2$, as I expected), so each update tries to move the point by only a small amount. Also note that if a handle position is already close to its target, the corresponding loss term is skipped; see here https://github.com/Yujun-Shi/DragDiffusion/blob/ebe659a9c5b722f25d9690e74d813fca96531f97/utils/drag_utils.py#L133
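To make this concrete, here is a minimal sketch of how I read it. This is not the repo's code: the function name `unit_step_targets`, the threshold `min_dist=2.0`, and the use of a unit-length direction vector are my own assumptions for illustration, and the linked line remains the authoritative reference.

```python
import numpy as np

def unit_step_targets(handles, targets, min_dist=2.0):
    """Return {index: handle shifted one pixel toward its target}.

    Hypothetical helper: handles that are already within `min_dist`
    pixels of their target are skipped, mirroring the idea that their
    loss term is dropped. The threshold value is an assumption.
    """
    stepped = {}
    for i, (p, t) in enumerate(zip(handles, targets)):
        p = np.asarray(p, dtype=float)
        t = np.asarray(t, dtype=float)
        dist = np.linalg.norm(t - p)
        if dist < min_dist:
            # Close enough: this handle contributes no loss term.
            continue
        d = (t - p) / dist      # unit direction, in pixel coordinates
        stepped[i] = p + d      # supervised location is about one pixel away
    return stepped

handles = [(10.0, 10.0), (50.0, 50.0)]
targets = [(13.0, 14.0), (50.5, 50.0)]
result = unit_step_targets(handles, targets)
print(result)
```

In this toy case the first handle is 5 pixels from its target, so it is supervised toward a point one pixel along the direction (0.6, 0.8); the second handle is only 0.5 pixels away and is skipped. With normalized coordinates, a unit-length step would span the whole image, which is why pixel coordinates make sense to me here.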
As for question 2, my guess is that we want to update the features gradually. Some points in the $\Omega$ region may change completely (think of the statue example: if the handle point is on the nose, then under extreme rotations we want background pixels there).
I would love to hear some confirmation from the authors.
Hello,
Thanks for sharing your great work!
I have 2 questions regarding motion supervision: