MCG-NJU / MeMOTR

[ICCV 2023] MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking
https://arxiv.org/abs/2307.15700
MIT License

How to get bbox for occluded frames using motion params? #12

Closed danial880 closed 7 months ago

danial880 commented 8 months ago

Hi, my tracker is performing well: it keeps track of the object even when it is occluded for 10 to 15 frames. How can I get the bounding box for these occluded frames? I have seen motion parameters in the runtime tracker. How can I make use of those parameters? [Screenshot from 2024-02-24 00-49-18]

HELLORPG commented 8 months ago

MeMOTR cannot locate occluded targets. During training, most datasets (MOT17 being the exception) do not annotate occluded targets, so we expect the classification confidence of these targets to be 0 and do not supervise their bounding boxes.

The motion module you mentioned is deprecated. In MeMOTR, the reference point of an occluded target is its last seen position. The motion-based processing instead roughly estimates a linear trajectory for the disappeared object to predict where it may reappear, and uses that position as the reference point for subsequent frames. However, this did not bring a significant improvement: on DanceTrack it improves HOTA by no more than 0.5. In line with "less is more", I did not use this module in our final version.
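A minimal sketch of the linear trajectory estimate described above, not MeMOTR's actual code. It assumes positions are normalized (x, y) centers observed on consecutive frames; the names `estimate_velocity` and `extrapolate` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def estimate_velocity(history: List[Point]) -> Point:
    """Average per-frame displacement over the observed positions."""
    if len(history) < 2:
        # No motion information: treat the target as static.
        return (0.0, 0.0)
    (x0, y0), (x1, y1) = history[0], history[-1]
    steps = len(history) - 1
    return ((x1 - x0) / steps, (y1 - y0) / steps)


def extrapolate(history: List[Point], frames_missing: int) -> Point:
    """Project the last seen position forward by `frames_missing` frames.

    `history` must contain at least one position.
    """
    vx, vy = estimate_velocity(history)
    x, y = history[-1]
    return (x + vx * frames_missing, y + vy * frames_missing)
```

With a single observed position the estimate falls back to the last seen position, which matches the default behaviour described in the comment.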

HELLORPG commented 8 months ago

I'm not sure whether this motion-based module is ready in this open-source version. Here are the docs for this module:

One more thing: motion_lambda decides how much we trust this simple linear trajectory estimation.
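One way to read "how much we trust" the estimate is as a weight between the last seen position and the linearly predicted one. The sketch below assumes a convex combination with motion_lambda in [0, 1]; the thread does not confirm that MeMOTR uses exactly this formula, and `blend_reference_point` is a hypothetical name.

```python
from typing import Tuple

Point = Tuple[float, float]


def blend_reference_point(last_seen: Point, predicted: Point,
                          motion_lambda: float) -> Point:
    """Weight the predicted position against the last seen position.

    motion_lambda = 0 keeps the last seen position (no trust in the motion
    estimate); motion_lambda = 1 uses the predicted position unchanged.
    """
    if not 0.0 <= motion_lambda <= 1.0:
        raise ValueError("motion_lambda is assumed to lie in [0, 1]")
    return (
        (1.0 - motion_lambda) * last_seen[0] + motion_lambda * predicted[0],
        (1.0 - motion_lambda) * last_seen[1] + motion_lambda * predicted[1],
    )
```

Under this assumption, a small motion_lambda keeps the reference point close to where the target was last seen, which limits the damage when the linear estimate is wrong.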

danial880 commented 8 months ago

Thanks for the info. In the demo notebook we use the RuntimeTracker class, and motion_lambda is not implemented there.

HELLORPG commented 8 months ago

RuntimeTracker directly receives the previous tracks for updating, and the reference point is included in the parameter tracks: List[TrackInstances]. Therefore, you should use motion_lambda to compute the reference point outside the tracker and then pass it in through RuntimeTracker.update, as we did here.
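The pattern of adjusting reference points outside the tracker could look roughly like the sketch below. The `Track` dataclass and all of its field names are hypothetical stand-ins, not the real TrackInstances fields, and the scaled-velocity formula is an assumption about how motion_lambda is applied.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Track:
    """Hypothetical stand-in for one tracked target (not TrackInstances)."""
    last_seen: Tuple[float, float]   # center at the last visible frame
    velocity: Tuple[float, float]    # estimated displacement per frame
    frames_missing: int              # 0 while the target is visible
    ref_point: Tuple[float, float]   # reference point handed to the tracker


def refresh_reference_points(tracks: List[Track], motion_lambda: float) -> None:
    """Update ref_point in place for occluded tracks; visible tracks are untouched."""
    for t in tracks:
        if t.frames_missing == 0:
            continue
        # Assumed rule: move from the last seen position along the estimated
        # velocity, scaled by how much the linear estimate is trusted.
        t.ref_point = (
            t.last_seen[0] + motion_lambda * t.velocity[0] * t.frames_missing,
            t.last_seen[1] + motion_lambda * t.velocity[1] * t.frames_missing,
        )
```

In this sketch the refresh would run once per frame, before the tracks are handed to the tracker's update step, so the tracker itself stays unchanged.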