Sample occluded points during training

facebookresearch / co-tracker

CoTracker is a model for tracking any point (pixel) on a video.

https://co-tracker.github.io/


Sample occluded points during training #79

Open Anderstask1 opened 5 months ago

Anderstask1 commented 5 months ago

Hi,

Do I need to exclude points that are not visible at any time in the window during training? My data includes points that are occluded for more frames than the window length. My impression is that excluding these points leads to worse performance, since it reduces the information available for joint tracking, but including them reduces the model's ability to predict occlusion/visibility. Do you have any recommendations here?

Thanks!

nikitakaraevv commented 5 months ago

Hi @Anderstask1, I think excluding completely invisible points is a good idea, because they don't contribute to training anyway. If a point is visible in at least one frame, keep it: that frame is where you query the point.
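A minimal sketch of that filtering step, for illustration only and not taken from the CoTracker code base. The function name `drop_never_visible` and the tensor layouts (`trajs` as `(T, N, 2)`, `vis` as `(T, N)`) are assumptions; adapt them to whatever shapes your data loader produces.

```python
import torch


def drop_never_visible(trajs, vis):
    """Keep only the tracks that are visible in at least one frame.

    trajs: (T, N, 2) float tensor of point coordinates (assumed layout).
    vis:   (T, N) bool tensor, True where the point is visible (assumed layout).
    Returns the filtered trajs and vis, plus the (N,) bool keep mask.
    """
    keep = vis.any(dim=0)  # True for tracks with at least one visible frame
    return trajs[:, keep], vis[:, keep], keep


# Toy example: 8 frames, 4 tracks, track 2 is never visible.
T, N = 8, 4
trajs = torch.rand(T, N, 2)
vis = torch.zeros(T, N, dtype=torch.bool)
vis[2:, 0] = True   # track 0 becomes visible from frame 2 onward
vis[:, 1] = True    # track 1 is always visible
vis[5, 3] = True    # track 3 is visible in a single frame

trajs_f, vis_f, keep = drop_never_visible(trajs, vis)
```

Tracks that are occluded for long stretches but visible somewhere (such as track 3 here) are kept; only track 2 is dropped.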

The fact that points are occluded for more than one window is ok, we also have such points in the training set. The model can propagate predictions through multiple windows and keep tracking points even when they stay invisible for more than one window. By the way, what data are you training on?

Anderstask1 commented 4 months ago

Hi again, and thanks for the quick reply. When I try to train CoTracker with points that are occluded for the entire window, I end up with the following error. Do I need to explicitly remove these points to avoid a crash?

  File "/CoTracker/code/train.py", line 483, in run
    output = forward_batch(
  File "/CoTracker/code/train.py", line 75, in forward_batch
    [
  File "/CoTracker/code/train.py", line 76, in <listcomp>
    nonzero_row[torch.randint(len(nonzero_row), size=(1,))]
RuntimeError: random expects 'from' to be less than 'to', but got from=0 >= to=0

nikitakaraevv commented 4 months ago

Hi @Anderstask1, yes, you either need to remove these points or modify the sampling logic for queried points so that it completely ignores the invisible tracks.
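The second option could look roughly like the sketch below. It is a hypothetical helper, not the actual `forward_batch` code: the name `sample_query_frames`, the `(T, N)` visibility layout, and the `-1` placeholder are all assumptions. The point it illustrates is that the crash comes from calling `torch.randint` with an upper bound of 0 when a track has no visible frame, so the sampler has to check for that case first.

```python
import torch


def sample_query_frames(vis, generator=None):
    """Pick one random visible frame per track, skipping never-visible tracks.

    vis: (T, N) bool tensor, True where the point is visible (assumed layout).
    Returns:
      query_frames: (N,) long tensor, -1 for tracks with no visible frame.
      valid:        (N,) bool tensor, False for tracks with no visible frame.
    """
    T, N = vis.shape
    query_frames = torch.full((N,), -1, dtype=torch.long)
    valid = vis.any(dim=0)
    for i in range(N):
        visible_frames = torch.nonzero(vis[:, i]).flatten()
        if len(visible_frames) == 0:
            # Sampling from an empty range is what raised the RuntimeError
            # in the traceback, so leave this track marked as invalid.
            continue
        pick = torch.randint(
            len(visible_frames), size=(1,), generator=generator
        ).item()
        query_frames[i] = visible_frames[pick]
    return query_frames, valid


# Toy example: track 2 is never visible and is skipped instead of crashing.
vis = torch.zeros(8, 4, dtype=torch.bool)
vis[2:, 0] = True
vis[:, 1] = True
vis[5, 3] = True

query_frames, valid = sample_query_frames(vis)
```

Downstream code can then index with `valid` to drop the skipped tracks from the batch, or mask them out of the loss.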