Yujun-Shi / DragDiffusion

[CVPR2024, Highlight] Official code for DragDiffusion
https://yujun-shi.github.io/projects/dragdiffusion.html
Apache License 2.0

Question about Motion Supervision and Point Tracking #70


tnarek commented 2 months ago

Hello,

Thanks for sharing your great work!

I have two questions regarding motion supervision:

  1. Why do you normalize the direction vector $d_i$ in the loss (eq. 3)? Couldn't we directly optimize for matching the target position $g_i$ itself? Is the normalization there to make the latent optimization more gradual, and if so, how important is it? (See the sketch after these questions for how I read eq. 3.)
  2. Maybe related to the previous question: why is the point tracking step necessary? I see that in eq. 3 you take the target feature $sg(F_q(\hat{z}^k_t))$ from the optimized latent, which of course requires $q$ to be updated in the next steps by point tracking. But why can't the target feature be taken from the original latent instead, so that $q$ never needs updating?
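
To make sure I'm reading eq. 3 correctly, here is a minimal PyTorch sketch of my understanding (the function names, tensor layout, and skip threshold are my own assumptions, and I've dropped the mask-regularization term of eq. 3):

```python
import torch
import torch.nn.functional as Fn

def sample_feat(feat, pts):
    """Bilinearly sample feat (1, C, H, W) at float pixel coords pts (N, 2), given as (x, y)."""
    _, _, H, W = feat.shape
    grid = pts.clone()
    grid[:, 0] = 2.0 * pts[:, 0] / (W - 1) - 1.0   # x -> [-1, 1]
    grid[:, 1] = 2.0 * pts[:, 1] / (H - 1) - 1.0   # y -> [-1, 1]
    out = Fn.grid_sample(feat, grid.view(1, 1, -1, 2), align_corners=True)
    return out.squeeze(2).squeeze(0).t()           # (N, C)

def motion_supervision_loss(feat, handle_points, target_points, r1=1):
    """Sketch of the first term of eq. 3 (mask-regularization term omitted)."""
    # offsets covering the square patch Omega(h_i, r1) around each handle point
    offs = torch.stack(torch.meshgrid(
        torch.arange(-r1, r1 + 1), torch.arange(-r1, r1 + 1), indexing="ij"
    ), dim=-1).reshape(-1, 2).float()
    loss = feat.new_zeros(())
    for hp, tp in zip(handle_points, target_points):
        d = tp - hp
        if d.norm() < 2.0:                 # handle already near target: skip (threshold illustrative)
            continue
        d = d / d.norm()                   # normalized direction d_i: one unit step per iteration
        q = hp.unsqueeze(0) + offs         # points q in the patch
        f_cur = sample_feat(feat, q).detach()   # sg(F_q): stop-gradient supervision target
        f_mov = sample_feat(feat, q + d)        # F_{q + d_i}: stays differentiable
        loss = loss + (f_mov - f_cur).abs().sum()
    return loss
```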
ivanpuhachov commented 1 month ago

I was wondering about the same thing. It seems that eq. 3 uses handle positions in pixel coordinates (not in normalized image coordinates $(0,1)^2$, as I expected), so each update moves the point only slightly. Also note that if the handle position is already close to the target, the corresponding loss term is skipped; see https://github.com/Yujun-Shi/DragDiffusion/blob/ebe659a9c5b722f25d9690e74d813fca96531f97/utils/drag_utils.py#L133
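
In other words (my own paraphrase with an illustrative name and tolerance, not the repo's exact code), the per-point check behaves roughly like this:

```python
import torch

def reached_target(handle_point: torch.Tensor, target_point: torch.Tensor,
                   tol_px: float = 2.0) -> bool:
    # When the handle is already within tol_px PIXELS of its target, the
    # corresponding motion-supervision term is dropped. Note the threshold
    # lives in pixel space, not in normalized (0,1)^2 coordinates.
    return (target_point - handle_point).norm().item() < tol_px
```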

As for question 2, my guess is that we want to update the features gradually. Some points in the $\Omega$ region may change completely. Think of the statue example: if the handle point is on the nose, then under extreme rotations we want background pixels to appear there.
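
To make the tracking step concrete, here is a minimal sketch of the nearest-neighbor search as I understand eq. 4 from the paper: the handle is re-localized to the pixel in its $r_2$-neighborhood whose current feature best matches the original handle feature, which is exactly why $q$ must be updated even though the matching target is fixed. The function name and tensor layout are my own assumptions:

```python
import torch

def point_tracking(feat, feat0, handle_point, handle_point0, r2=12):
    """Relocate the handle (x, y) to the argmin of the L1 feature distance
    to the ORIGINAL handle feature, searched in an r2 window (cf. eq. 4).

    feat:  (1, C, H, W) features of the optimized latent
    feat0: (1, C, H, W) features of the original latent
    """
    _, C, H, W = feat.shape
    x0, y0 = handle_point0.long().tolist()
    f0 = feat0[0, :, y0, x0]                        # original handle feature f_i
    x, y = handle_point.long().tolist()
    x_lo, x_hi = max(x - r2, 0), min(x + r2 + 1, W) # clamp window to map bounds
    y_lo, y_hi = max(y - r2, 0), min(y + r2 + 1, H)
    patch = feat[0, :, y_lo:y_hi, x_lo:x_hi]        # (C, h, w) candidate features
    dist = (patch - f0.view(C, 1, 1)).abs().sum(0)  # L1 distance per pixel
    row, col = divmod(dist.argmin().item(), dist.shape[1])
    return torch.tensor([x_lo + col, y_lo + row], dtype=torch.float)
```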

I would love to hear confirmation from the authors.