16lemoing / dot

Dense Optical Tracking: Connecting the Dots
https://16lemoing.github.io/dot
MIT License

question about the batch size #3

Closed XiaoyuShi97 closed 5 months ago

XiaoyuShi97 commented 6 months ago

Hi, thanks for sharing this great project. I noticed that you assert that the batch size is 1. What is the reasoning behind this choice, and which parts of the code would need to change if I wanted to increase the batch size? Thanks!

16lemoing commented 6 months ago

Hi! Indeed the functions "get_tracks_for_queries" and "get_tracks_from_first_to_every_other_frame" used in inference / evaluation scripts only support batch sizes of 1.

The main reason is that there is a similar assertion in CoTracker:

https://github.com/16lemoing/dot/blob/0b6c932552c0f49daaef49bdfbdabb847bcabd09/dot/models/shelf/cotracker_utils/predictor.py#L106
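For readers who need several videos processed while such an assertion is in place, one generic workaround is to loop over the batch and call the single-sample function once per element. The sketch below only illustrates that pattern: `track_single` and `track_batched` are hypothetical names, not functions from dot or CoTracker, and the "tracking" is a placeholder computation.

```python
def track_single(video_batch):
    # Hypothetical stand-in for a function that only supports batch size 1.
    assert len(video_batch) == 1, "only batch size 1 is supported"
    video = video_batch[0]
    # Placeholder computation; a real tracker would return point tracks.
    return [[frame * 2 for frame in video]]


def track_batched(video_batch):
    # Call the single-sample function once per batch element and
    # concatenate the results along the batch dimension.
    outputs = []
    for video in video_batch:
        outputs.extend(track_single([video]))
    return outputs


result = track_batched([[1, 2], [3, 4], [5, 6]])
print(result)
```

This keeps the assertion intact at the cost of processing the elements sequentially, so it does not bring the speed-up of true batching.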

16lemoing commented 6 months ago

See also this related issue.

https://github.com/facebookresearch/co-tracker/issues/36

A new version of CoTracker was released yesterday. It may support batch sizes greater than one.

XiaoyuShi97 commented 6 months ago

Thanks for your patience and reply. May I ask another question? What is the purpose of this iteration, given that _ is not used? https://github.com/16lemoing/dot/blob/main/dot/models/point_tracking.py#L59

16lemoing commented 6 months ago

for _ in tqdm(range(N // S), desc="Track batch of points", leave=False):

This part of the code computes the initial tracks. N is the total number of initial tracks and S is the number of tracks processed simultaneously, so N // S is the number of batches of tracks. The loop variable _ is unused because the order of the batches does not matter. Tracks are initialized as follows: half of them at motion boundaries and half of them randomly.
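The batching described above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in point_tracking.py: `sample_initial_points`, `init_tracks_in_batches`, and the toy `boundary_weight` map are hypothetical, and it assumes each batch of S points is itself split half and half.

```python
import random


def sample_initial_points(boundary_weight, num_points, rng):
    """Sample (row, col) points: half weighted by a boundary map, half uniform.

    boundary_weight is a hypothetical 2D list of non-negative weights,
    where a larger value means the pixel is closer to a motion boundary.
    """
    height, width = len(boundary_weight), len(boundary_weight[0])
    pixels = [(r, c) for r in range(height) for c in range(width)]
    weights = [boundary_weight[r][c] for r, c in pixels]
    num_boundary = num_points // 2
    # First half: drawn in proportion to the boundary weights.
    boundary_points = rng.choices(pixels, weights=weights, k=num_boundary)
    # Second half: drawn uniformly over the image.
    random_points = rng.choices(pixels, k=num_points - num_boundary)
    return boundary_points + random_points


def init_tracks_in_batches(boundary_weight, N, S, seed=0):
    rng = random.Random(seed)
    batches = []
    # N // S batches of S points each; the index is unused, hence "_".
    for _ in range(N // S):
        batches.append(sample_initial_points(boundary_weight, S, rng))
    return batches


# Toy 3x4 map with two "boundary" pixels at (1, 1) and (1, 2).
boundary_weight = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0]]
batches = init_tracks_in_batches(boundary_weight, N=8, S=4)
print(len(batches), [len(batch) for batch in batches])
```

Because every batch is sampled independently, the batches can be processed in any order, which is why the loop index is never read.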

XiaoyuShi97 commented 6 months ago

Thanks for your prompt reply. I am still a bit confused. Which logic distinguishes these two types of initialization (at motion boundaries and randomly)?

XiaoyuShi97 commented 6 months ago

https://github.com/16lemoing/dot/blob/main/dot/utils/torch.py#L40 The half and half logic seems to be in this function. If I understand correctly, this line is to compute optical flow for motion boundaries and it only needs to be executed once? https://github.com/16lemoing/dot/blob/main/dot/utils/torch.py#L40 And to iterate B//S times is to reduce the number of simultaneous tracks in co-tracker, for memory consideration?

16lemoing commented 6 months ago

If I understand correctly, this line is to compute optical flow for motion boundaries and it only needs to be executed once?

You are right. We do not need to compute the optical flow for every batch but only once. I have updated the code as follows:

https://github.com/16lemoing/dot/blob/52b3e63afca4e044530818708b7b507982fe3687/dot/models/point_tracking.py#L65-L70
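The change amounts to moving a loop-invariant computation out of the loop. The sketch below shows that pattern in isolation; it is not the updated code from the repository, and `estimate_flow`, `track_batch`, and `compute_initial_tracks` are hypothetical stand-ins.

```python
calls = {"flow": 0}


def estimate_flow(video):
    # Hypothetical stand-in for an expensive optical flow estimate:
    # here just the difference between consecutive "frames".
    calls["flow"] += 1
    return [b - a for a, b in zip(video, video[1:])]


def track_batch(video, flow, batch_index):
    # Hypothetical stand-in for tracking one batch of S points.
    return {"batch": batch_index, "num_flow_frames": len(flow)}


def compute_initial_tracks(video, N, S):
    # The flow does not depend on the batch, so compute it once
    # before the loop and reuse it for all N // S batches.
    flow = estimate_flow(video)
    return [track_batch(video, flow, i) for i in range(N // S)]


video = [0.0, 1.0, 3.0, 6.0]
tracks = compute_initial_tracks(video, N=8, S=2)
print(len(tracks), calls["flow"])
```

With the flow estimated inside the loop, the same result would be recomputed N // S times.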

And to iterate B//S times is to reduce the number of simultaneous tracks in co-tracker, for memory consideration?

Yes. The number of simultaneous tracks in co-tracker has an effect on memory and tracking accuracy. You can use the flag --sim_tracks to try different values.

XiaoyuShi97 commented 6 months ago

Thank you so much for your patient and detailed answer!

16lemoing commented 6 months ago

Batch sizes greater than one are now supported in all training/inference modes (with CoTracker2). I hope this is useful.

XiaoyuShi97 commented 6 months ago

Thanks for your support!