SysCV / qdtrack

Quasi-Dense Similarity Learning for Multiple Object Tracking, CVPR 2021 (Oral)
Apache License 2.0

Can you explain the concept of backdrop in your paper #71

Closed ghost closed 3 years ago

ghost commented 3 years ago

Hi. The paper is very interesting, but I cannot understand backdrops. It seems to me that you don't drop unmatched tracks at all; instead you keep them for the future in case an object ends up matching one of them, and you then match those detections with the previously unmatched tracks.

Are these backdrops kept in memory for the entire inference period, or do you eliminate the unmatched track completely after a few frames? I am asking because, if you do keep unmatched tracks, then realistically, when I look at an object, turn away for a while, and look at that same object again, QDTrack should be able to match this new detection to the previous track for that object.

But that never happens in my experiments and the same object is assigned an entirely new track.

OceanPang commented 3 years ago

Hi, the backdrops are kept so that the matching process follows the one-to-one matching principle, and they all come from the last frame only (one frame).

An example: there is an unmatched object in frame t with a classification score of 0.4, which is lower than our threshold, so it would be discarded in the normal case. In our experiments, however, we keep this object in memory for one frame as a backdrop. When matching moves on to frame t+1, the detected objects in that frame are then more likely to find their matching targets, which are among the backdrops.
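
A minimal sketch of that idea, not the released QDTrack code: the names, the score threshold, and the plain cosine-similarity rule below are illustrative assumptions (the actual tracker uses a bi-directional softmax over embedding similarities). It only shows how last frame's low-score, unmatched detections can be appended to the matching candidates so a duplicate detection matches its backdrop instead of stealing an existing track:

```python
import torch
import torch.nn.functional as F

score_thr = 0.5                         # assumed detection-score threshold
track_embeds = torch.randn(3, 256)      # embeddings of live tracks at frame t
backdrop_embeds = torch.randn(2, 256)   # low-score unmatched detections kept from frame t

det_embeds = torch.randn(5, 256)        # detection embeddings at frame t+1
det_scores = torch.rand(5)              # classification scores at frame t+1

# Matching candidates = live tracks followed by last frame's backdrops.
cands = torch.cat([track_embeds, backdrop_embeds], dim=0)
sims = F.cosine_similarity(det_embeds[:, None, :], cands[None, :, :], dim=2)
best_sim, best_idx = sims.max(dim=1)

num_tracks = track_embeds.size(0)
for i in range(det_embeds.size(0)):
    if best_sim[i] > 0.5 and best_idx[i] < num_tracks:
        print(f"det {i}: assigned to existing track {int(best_idx[i])}")
    elif best_sim[i] > 0.5:
        print(f"det {i}: matched a backdrop -> suppressed, no new ID")
    elif det_scores[i] >= score_thr:
        print(f"det {i}: confident and unmatched -> starts a new track")
    else:
        print(f"det {i}: low score and unmatched -> kept as a backdrop for one frame")
```

Because the backdrop set is rebuilt from scratch every frame, a backdrop only survives for a single frame, which matches the "one frame" lifetime described above.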

For your case, if a track disappears for several frames, its feature embeddings are kept so that we can re-identify the same track when the cosine similarity is high. The accuracy of this operation depends on the representation ability of the embedding head, so it may not work in some cases.
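
A hypothetical sketch of that re-identification step, with assumed names and an assumed similarity threshold: the embedding of a track that disappeared a few frames ago is kept, and a new detection is re-assigned to that track only if its cosine similarity exceeds the threshold, otherwise a new ID is created (which is what you observe when the embedding is not discriminative enough):

```python
import torch
import torch.nn.functional as F

lost_tracks = {7: torch.randn(256)}   # track_id -> last kept embedding (assumed memory)
new_det = torch.randn(256)            # embedding of a detection in the current frame
match_thr = 0.7                       # assumed re-identification threshold

best_id, best_sim = None, match_thr
for tid, emb in lost_tracks.items():
    sim = F.cosine_similarity(new_det, emb, dim=0).item()
    if sim > best_sim:
        best_id, best_sim = tid, sim

if best_id is not None:
    print(f"re-identified as track {best_id} (cosine similarity {best_sim:.2f})")
else:
    print("no lost track is similar enough -> a new track ID is assigned")
```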