Official devkit counts ID switches differently

jvlmdr commented 3 years ago

Creating a new issue for this specific problem that arose in #126

The official devkit uses a different technique to count identity switches (for MOTA). The difference arises in how identities are preserved from past frames. For context, when establishing per-frame correspondence between tracks to compute MOTA, priority is given to existing associations to avoid an identity switch when there are multiple candidates. The official devkit gives priority to associations in the previous frame whereas py-motmetrics gives priority to the most recent association of each ground-truth (in any past frame).

The official devkit starts with an empty map M[t] in each frame t and adds associations made in that frame: https://github.com/dendorferpatrick/MOTChallengeEvalKit/blob/3c7a1d66884043c1cb50253673898248c3c35b17/matlab_devkit/utils/clearMOTMex.cpp#L289-L292 https://github.com/dendorferpatrick/MOTChallengeEvalKit/blob/3c7a1d66884043c1cb50253673898248c3c35b17/matlab_devkit/utils/clearMOTMex.cpp#L373-L380

py-motmetrics instead maintains a single self.m dict whose state is modified over the course of the sequence: https://github.com/cheind/py-motmetrics/blob/6597e8a4ed398b9f14880fa76de26bc43d230836/motmetrics/mot.py#L241 https://github.com/cheind/py-motmetrics/blob/6597e8a4ed398b9f14880fa76de26bc43d230836/motmetrics/mot.py#L294

Note that the official devkit still examines past frames to count the number of identity switches; only the method of preserving identities differs: https://github.com/dendorferpatrick/MOTChallengeEvalKit/blob/3c7a1d66884043c1cb50253673898248c3c35b17/matlab_devkit/utils/clearMOTMex.cpp#L404-L408
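To make the difference concrete, here is a minimal toy sketch (not code from either project). It assumes each frame is given as a dict from ground-truth id to candidate prediction ids, best first, and it uses a greedy step as a stand-in for the real optimal assignment. The function name `count_id_switches` and the `persistent` flag are hypothetical.

```python
def count_id_switches(frames, persistent):
    """Count ID switches in a toy sequence.

    frames: list of dicts mapping gt id -> candidate pred ids (best first).
    persistent=True  keeps the most recent association of each gt (py-motmetrics style).
    persistent=False keeps only associations made in the previous frame (devkit style).
    """
    memory = {}      # associations that get priority when matching
    last_match = {}  # most recent match of each gt, used only for counting
    switches = 0
    for candidates in frames:
        current, used = {}, set()
        # Step 1: keep remembered associations that are still valid candidates.
        for gt, preds in candidates.items():
            prev = memory.get(gt)
            if prev is not None and prev in preds and prev not in used:
                current[gt] = prev
                used.add(prev)
        # Step 2: greedily match the remaining gts (stand-in for optimal assignment).
        for gt, preds in candidates.items():
            if gt in current:
                continue
            for pred in preds:
                if pred not in used:
                    current[gt] = pred
                    used.add(pred)
                    break
        # Counting looks at the last known match in both variants.
        for gt, pred in current.items():
            if gt in last_match and last_match[gt] != pred:
                switches += 1
            last_match[gt] = pred
        if persistent:
            memory.update(current)
        else:
            memory = current
    return switches


# gt "a" is matched to pred 1, disappears for one frame, then reappears
# with pred 2 slightly closer than pred 1 (both are valid candidates).
frames = [{"a": [1]}, {}, {"a": [2, 1]}]
persistent_switches = count_id_switches(frames, persistent=True)
previous_frame_switches = count_id_switches(frames, persistent=False)
```

In this toy sequence the persistent variant keeps pred 1 across the gap, while the previous-frame variant has nothing to preserve in the last frame and takes pred 2, which is then counted as a switch.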

I am not sure which approach is correct. The CLEAR MOT metrics paper states:

If a correspondence (o_i, h_k) is made that contradicts a mapping (o_i, h_j) in M_{t−1}, replace (o_i, h_j) with (o_i, h_k) in M_t.

The word "replace" seems to suggest the technique used by py-motmetrics.

It seems that the approach used by py-motmetrics would also result in a smaller number of identity switches, since identities are preserved through frames where the ground-truth or predicted track is not present (occlusions and false negatives).

@JonathonLuiten pointed out to me that the py-motmetrics approach is ambiguous. In particular, if two different ground-truth tracks were both previously associated to the same predicted track (in different frames) and then the code encounters a frame in which both ground-truth tracks overlap with the predicted track, it is unclear which pair of tracks should be associated.

In this situation, py-motmetrics currently takes the match with the lowest index, which is deterministic but arbitrary: https://github.com/cheind/py-motmetrics/blob/6597e8a4ed398b9f14880fa76de26bc43d230836/motmetrics/mot.py#L230-L234
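The ambiguity can be reproduced with a small toy sketch. This is not the library's code: the helper `preserve_by_lowest_index` is hypothetical, and iterating in sorted ground-truth id order is an assumption standing in for "lowest index".

```python
def preserve_by_lowest_index(memory, candidates):
    """Pick which remembered associations to preserve in the current frame.

    memory: dict gt id -> last matched pred id (from any past frame).
    candidates: set of (gt, pred) pairs that overlap in the current frame.
    """
    preserved = {}
    used_preds = set()
    for gt in sorted(memory):  # assumption: the lowest gt id wins a conflict
        pred = memory[gt]
        if (gt, pred) in candidates and pred not in used_preds:
            preserved[gt] = pred
            used_preds.add(pred)
    return preserved


# gt 1 and gt 2 were each matched to pred 7 in different past frames,
# and both overlap with pred 7 now.
memory = {1: 7, 2: 7}
candidates = {(1, 7), (2, 7)}
result = preserve_by_lowest_index(memory, candidates)
```

Here gt 1 keeps pred 7 only because of the iteration order; nothing in the data favours it over gt 2.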

One possible alternative would be to ensure that the previous correspondence between ground-truth and predicted tracks is bidirectionally exclusive.
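One possible reading of "bidirectionally exclusive" is that each predicted track is remembered by at most one ground-truth track. A minimal sketch under that assumption follows; the helper `update_exclusive` is hypothetical and not taken from py-motmetrics.

```python
def update_exclusive(memory, matches):
    """Update the gt -> pred memory so that no two gts remember the same pred.

    memory: dict gt id -> last matched pred id, modified in place.
    matches: dict gt id -> pred id for the current frame (one-to-one).
    """
    for gt, pred in matches.items():
        # Forget any older association of another gt with this pred.
        stale = [g for g, p in memory.items() if p == pred and g != gt]
        for g in stale:
            del memory[g]
        memory[gt] = pred
    return memory


memory = {}
update_exclusive(memory, {1: 7})  # frame 0: gt 1 matched to pred 7
update_exclusive(memory, {2: 7})  # frame 1: gt 2 matched to pred 7, so gt 1 forgets it
```

With this update rule, a later frame in which both ground-truth tracks overlap pred 7 has only one remembered association to preserve, so no tie-break is needed.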

jvlmdr commented 3 years ago

I did a comparison of 88 trackers on the MOT17 training set. All comparisons are relative to the default py-motmetrics approach.

Preserving identities from only the previous frame (as in the official devkit) results in +1.5 extra ID switches per sequence on average (std 4.2), or +1.9% relative (std 5.9), and -0.013 points MOTA (std 0.05).

Enforcing the previous association to be bidirectionally exclusive results in +0.2 extra ID switches per sequence (std 1.3), or +0.15% relative (std 1.0), and -0.0043 points MOTA (std 0.0412).

shensheng27 commented 3 years ago

Regarding https://github.com/cheind/py-motmetrics/pull/127: this will also result in fewer ID switches than actually occur. I think the MOT devkit's matching approach is better, because it avoids the "deterministic but arbitrary" match.

Official devkit counts ID switches differently #132