Cannot match `trackeval` HOTA results using `motmetrics` on custom dataset

mikel-brostrom commented 2 weeks ago

I generated MOTChallenge-compliant results for 2 sequences. These are the first lines of my results, in (frame, id, l, t, w, h, conf, class) format:

2,3,1436,409,159,371,0.508485,0
2,2,583,442,93,264,0.689593,0
2,1,1338,425,147,356,0.692870,0
3,3,1462,417,136,360,0.542826,0
3,2,580,441,102,272,0.721693,0
3,1,1340,414,160,374,0.765668,0
4,3,1469,419,130,357,0.566036,0
4,2,583,444,93,264,0.683354,0
4,1,1344,413,155,375,0.739091,0
...

When running trackeval I get:

HOTA: -pedestrian                  HOTA      DetA      AssA      DetRe     DetPr     AssRe     AssPr     LocA      OWTA      HOTA(0)   LocA(0)   HOTALocA(0)
MOT17-02-FRCNN                     23.093    8.3317    64.504    8.3732    81.871    65.874    87.832    82.856    23.164    27.696    80.364    22.257    
MOT17-04-FRCNN                     21.202    5.7172    79.963    5.7487    83.982    80.793    94.95     85.394    21.273    23.701    81.903    19.412    
COMBINED                           21.684    6.2585    76.638    6.2934    83.388    77.651    94.067    84.643    21.76     24.584    81.47     20.029

Using motmetrics with the code snippet suggested by @Justin900429 here, I get:

MOT17-04-FRCNN: HOTA: 13.836 | AssA: 79.963 | DetA: 2.433
MOT17-02-FRCNN: HOTA: 17.367 | AssA: 64.504 | DetA: 4.710

AssA seems to match, but not DetA nor HOTA. What am I missing?
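
One sanity check on these numbers: in the HOTA definition, the score at each localization threshold is the geometric mean of DetA and AssA at that threshold, and the reported HOTA is the average over thresholds. So sqrt(DetA * AssA) computed on the averaged values should land close to, but not exactly on, the reported HOTA. A minimal sketch, with the values copied from the two outputs above (`approx_hota` is just a made-up helper name, not part of either library):

```python
import math

# (DetA, AssA, reported HOTA) in percent, copied from the outputs above.
results = {
    "trackeval": {
        "MOT17-02-FRCNN": (8.3317, 64.504, 23.093),
        "MOT17-04-FRCNN": (5.7172, 79.963, 21.202),
    },
    "motmetrics": {
        "MOT17-02-FRCNN": (4.710, 64.504, 17.367),
        "MOT17-04-FRCNN": (2.433, 79.963, 13.836),
    },
}


def approx_hota(det_a, ass_a):
    """Geometric mean of the threshold-averaged DetA and AssA.

    This is only an approximation: HOTA is averaged over thresholds
    after taking the square root, not before.
    """
    return math.sqrt(det_a * ass_a)


for tool, seqs in results.items():
    for seq, (det_a, ass_a, hota) in seqs.items():
        print(
            f"{tool:10s} {seq}: sqrt(DetA*AssA) = {approx_hota(det_a, ass_a):.3f}, "
            f"reported HOTA = {hota:.3f}"
        )
```

If the approximation holds for both tools, the HOTA gap follows from the DetA gap alone, since AssA agrees.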

Justin900429 commented 1 week ago

Looks like a critical issue 😓 I'll take a look. Thanks a lot.

mikel-brostrom commented 1 week ago

Let me know if you want me to provide the files used for this small experiment 😊

Justin900429 commented 4 days ago

After some investigation, I found that the main issue is the difference in how TrackEval and py-motmetrics load the data. For instance, using the data.zip from here for MOT20-01, the two libraries ended up with different numbers of detections and ground truths. Also, IDs for py-motmetrics should start from 1, which is not a requirement for TrackEval. Matching the loading logic might solve the problem, but it will take some time to bridge the gap.
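
To see where the counts diverge, it can help to count what is in the raw files first and compare that with what each library reports after loading. A rough sketch using only the standard library; `summarize_mot` is a hypothetical helper, and the sample rows are the first lines posted at the top of this issue:

```python
import csv
import io
from collections import Counter


def summarize_mot(lines):
    """Hypothetical helper: count rows, frames and the ID range in MOT-style CSV lines."""
    rows_per_frame = Counter()
    ids = set()
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        frame, track_id = int(row[0]), int(float(row[1]))
        rows_per_frame[frame] += 1
        ids.add(track_id)
    return {
        "rows": sum(rows_per_frame.values()),
        "frames": len(rows_per_frame),
        "min_id": min(ids) if ids else None,
        "max_id": max(ids) if ids else None,
    }


# First lines of the results file posted above (frame, id, l, t, w, h, conf, class).
sample = """\
2,3,1436,409,159,371,0.508485,0
2,2,583,442,93,264,0.689593,0
2,1,1338,425,147,356,0.692870,0
3,3,1462,417,136,360,0.542826,0
3,2,580,441,102,272,0.721693,0
3,1,1340,414,160,374,0.765668,0
4,3,1469,419,130,357,0.566036,0
4,2,583,444,93,264,0.683354,0
4,1,1344,413,155,375,0.739091,0
"""

summary = summarize_mot(io.StringIO(sample))
print(summary)
```

Running the same function on the ground-truth and prediction files gives baseline row counts and the smallest ID in each. Any difference between those and the counts a library reports after loading points at its filtering or parsing step.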

mikel-brostrom commented 3 days ago

I remember getting identical MOTA and IDF1 results in a comparison I did about a year ago. And yes, the IDs start from 1 in both the GT and my predicted MOT results.

cheind / py-motmetrics

Cannot match `trackeval` HOTA results using `motmetrics` on custom dataset #196