Closed Wulin-Tan closed 3 months ago
you are correct that epsilon is a threshold on how far the same keypoint moves between 2 consecutive frames. as long as the distance traveled by the keypoint between those frames is less than epsilon, there is no penalty. this loss does suffer from the issues you mentioned: sometimes the animal is moving fast and epsilon should be large, and sometimes it moves slowly and epsilon could be small. also, some keypoints might move a lot while others don't move at all. we played around with this loss a decent amount and found that it is most effective when epsilon is rather large, say, the longest distance any keypoint could travel throughout the entire video. in this case the temporal loss is less about ensuring smooth trajectories and more about making sure there aren't large jumps from one frame to the next.
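to make the thresholding concrete, here is a rough numpy sketch of the idea. this is not the actual lightning pose code: the function name `temporal_loss`, the `(T, K, 2)` array layout, and the toy trajectory are all assumptions made for illustration. it just penalizes the part of each frame-to-frame step that exceeds epsilon.

```python
import numpy as np

def temporal_loss(keypoints, epsilon):
    """Hypothetical sketch of an epsilon-thresholded temporal loss.

    keypoints: array of shape (T, K, 2), predicted (x, y) per frame and keypoint.
    epsilon:   scalar, or array of shape (K,) for a per-keypoint threshold (pixels).
    """
    keypoints = np.asarray(keypoints, dtype=float)
    # distance each keypoint travels between consecutive frames, shape (T-1, K)
    step = np.linalg.norm(np.diff(keypoints, axis=0), axis=-1)
    # steps shorter than epsilon cost nothing; longer steps are penalized linearly
    penalty = np.maximum(step - epsilon, 0.0)
    return penalty.mean()

# toy example: one keypoint, two 1-pixel steps followed by a 50-pixel jump
traj = np.array([[[0.0, 0.0]], [[1.0, 0.0]], [[2.0, 0.0]], [[52.0, 0.0]]])
small_eps_loss = temporal_loss(traj, epsilon=0.5)   # penalizes every step
large_eps_loss = temporal_loss(traj, epsilon=20.0)  # penalizes only the jump
```

with the large epsilon only the 50-pixel jump contributes, which matches the "no large jumps" behavior described above. with the small epsilon even the ordinary 1-pixel steps get penalized.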
you can see from the ablation studies we performed (extended data figure 3) that the pca losses are much stronger than the temporal loss in the datasets we looked at.
there is a parameter in losses called epsilon. LP tutorial mentioned that:
so epsilon here is a threshold on how far the same keypoint moves between each pair of consecutive frames? and how should epsilon be set if keypoint movement varies over time? for example, an animal might move quickly in some frames, which needs a big epsilon, but stay still in others, which needs a small epsilon?