danbider / lightning-pose

Accelerated pose estimation and tracking using semi-supervised convolutional networks.
MIT License

how to understand and set epsilon? #171

Closed Wulin-Tan closed 3 months ago

Wulin-Tan commented 3 months ago

there is a parameter in losses called epsilon. the LP tutorial says:

epsilon: in pixels; temporal differences below this threshold are not penalized, which keeps natural movements from being penalized. The value of epsilon will depend on the size of the video frames, framerate (how much does the animal move from one frame to the next), the size of the animal in the frame, etc.

so epsilon here is a threshold on how far the same keypoint moves between each pair of consecutive frames? and how should epsilon be set if keypoint movement varies over time? for example, an animal might move quickly in some frames, which calls for a large epsilon, but stay still in others, which calls for a small epsilon.

themattinthehatt commented 3 months ago

you are correct that epsilon is a threshold for the same keypoint across 2 consecutive frames: as long as the distance traveled by the keypoint between those frames is less than epsilon, there is no penalty.

this loss does suffer from the issues you mentioned. sometimes the animal moves fast and epsilon should be large; sometimes it moves slowly and epsilon could be small. also, some keypoints might move a lot while others don't move at all.

we experimented with this loss quite a bit and found that it is most effective when epsilon is rather large, say, the longest distance any keypoint could travel throughout the entire video. in this case the temporal loss is less about ensuring smooth trajectories and more about making sure there aren't large jumps from one frame to the next.

you can see from the ablation studies we performed (extended data figure 3) that the PCA losses are much stronger than the temporal loss in the datasets we looked at.
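
to make the epsilon thresholding discussed in this thread concrete, here is a minimal sketch in plain Python. it is not the lightning-pose implementation: the function name `temporal_penalty`, the choice to average over frame pairs, and the example track are all assumptions made for illustration. the only thing taken from the discussion is the rule that frame-to-frame displacements below epsilon (in pixels) are not penalized.

```python
import math


def temporal_penalty(track, epsilon):
    """Mean epsilon-thresholded penalty for one keypoint (hypothetical helper).

    track:   list of (x, y) positions in pixels, one per frame
    epsilon: displacements (in pixels) at or below this value cost nothing
    """
    if len(track) < 2:
        return 0.0
    penalties = []
    # compare each frame with the next one
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        dist = math.hypot(x1 - x0, y1 - y0)  # distance traveled between frames
        penalties.append(max(0.0, dist - epsilon))  # only the excess is penalized
    # averaging over frame pairs is an assumption of this sketch
    return sum(penalties) / len(penalties)


# made-up track: slow drift of 1 px per frame, with one 50 px jump
track = [(10.0, 10.0), (11.0, 10.0), (12.0, 10.0), (62.0, 10.0), (63.0, 10.0)]

small_eps = temporal_penalty(track, epsilon=0.5)   # slow drift is penalized too
large_eps = temporal_penalty(track, epsilon=20.0)  # only the jump is penalized
print(small_eps, large_eps)
```

with the small epsilon, every frame pair contributes to the penalty, including the ordinary 1 px movements. with the large epsilon, only the 50 px jump contributes, which matches the point above that a large epsilon makes the loss about avoiding big jumps rather than enforcing smooth trajectories.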