different results from motchallenge devkit

fguney commented 3 years ago

Hi,

The results I get from py-motmetrics differ substantially from those of the MOT devkit. Can you please help me pinpoint the differences? I use the following distance function to compute the distances:

dist = mm.distances.iou_matrix(gt_det[gt_mask], est_det[est_mask], max_iou=0.5)
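To make the role of max_iou in the call above concrete, here is a minimal pure-Python sketch of an IoU distance with gating. It is not the library's code: iou_distance and gated_iou_matrix are hypothetical helpers, and the assumptions are that boxes are (x, y, width, height), that the distance is 1 - IoU, and that pairs whose distance exceeds max_iou are marked NaN so they cannot be matched.

```python
import math


def iou_distance(a, b):
    """Return 1 - IoU for two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return 1.0 - inter / union if union > 0 else 1.0


def gated_iou_matrix(objs, hyps, max_iou=0.5):
    """Distance matrix; pairs farther apart than max_iou become NaN (do not pair)."""
    matrix = []
    for obj in objs:
        row = []
        for hyp in hyps:
            d = iou_distance(obj, hyp)
            row.append(d if d <= max_iou else math.nan)
        matrix.append(row)
    return matrix


gt_boxes = [(0, 0, 10, 10)]
est_boxes = [(0, 0, 10, 10), (5, 0, 10, 10)]  # second box overlaps GT with IoU = 1/3

strict = gated_iou_matrix(gt_boxes, est_boxes, max_iou=0.5)
loose = gated_iou_matrix(gt_boxes, est_boxes, max_iou=1.0)
print("max_iou=0.5:", strict)
print("max_iou=1.0:", loose)
```

With the 0.5 gate the second hypothesis cannot be paired with the ground-truth box, whereas with a gate of 1.0 it can. When comparing two evaluation tools it is therefore worth confirming that both use the same threshold. I believe iou_matrix falls back to max_iou=1.0 when the argument is omitted, but please check this against the installed version.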

For example:

for DPM

py-motmetrics:

              IDF1   IDP   IDR  Rcll  Prcn  GT MT PT ML  FP    FN IDs  FM  MOTA  MOTP IDt IDa IDm
    MOT17-09 37.4% 65.8% 26.1% 38.0% 95.5%  64  8 16 40 186  6459  40  42 35.8% 0.247   9  35   4
    MOT17-10 36.5% 71.4% 24.5% 33.0% 96.2%  73 10 19 44 226 11695  42  78 31.4% 0.236  11  34   3
    OVERALL  36.8% 69.1% 25.1% 34.8% 95.9% 137 18 35 84 412 18154  82 120 33.1% 0.241  20  69   7

mot devkit:

             IDF1  IDP  IDR| Rcll Prcn  FAR| GT MT PT ML|  FP   FN IDs FM| MOTA MOTP MOTAL
    MOT17-09 34.5 67.7 23.2| 63.5 94.8 0.35| 26  8 14  4| 186 1942  30 30| 59.5 75.4  60.0
    MOT17-10 36.5 71.4 24.5| 44.8 96.2 0.35| 57  9 20 28| 226 7082  42 58| 42.7 76.4  43.1
    OVERALL  28.6 56.1 19.2| 50.3 95.7 0.35| 83 17 34 32| 412 9024  72 88| 47.6 76.0  48.0
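One way to narrow down the gap is to back out how many ground-truth boxes each tool scored against. The sketch below is only arithmetic on the two OVERALL rows above, using the standard definitions Rcll = 1 - FN / N and MOTA = 1 - (FP + FN + IDs) / N, where N is the number of ground-truth boxes. The helper names are hypothetical, and because the reported recall is rounded, the estimates are approximate.

```python
def implied_gt_boxes(fn, recall_pct):
    """Estimate the number of ground-truth boxes N from FN and recall (percent)."""
    return fn / (1.0 - recall_pct / 100.0)


def mota_pct(fp, fn, ids, num_gt):
    """MOTA in percent from error counts and the number of ground-truth boxes."""
    return 100.0 * (1.0 - (fp + fn + ids) / num_gt)


# OVERALL rows copied from the two tables above.
overall = {
    "py-motmetrics": {"fp": 412, "fn": 18154, "ids": 82, "rcll": 34.8},
    "devkit": {"fp": 412, "fn": 9024, "ids": 72, "rcll": 50.3},
}

estimates = {}
for tool, row in overall.items():
    n = implied_gt_boxes(row["fn"], row["rcll"])
    estimates[tool] = (n, mota_pct(row["fp"], row["fn"], row["ids"], n))
    print(f"{tool}: implied GT boxes ~ {n:.0f}, recomputed MOTA ~ {estimates[tool][1]:.1f}%")
```

The recomputed MOTA values land close to the reported 33.1% and 47.6%, so each tool looks internally consistent. The implied box counts, however, are roughly 27,800 for py-motmetrics versus roughly 18,200 for the devkit, while FP is identical (412) and FN differs by 9,130. That pattern would be consistent with the two runs scoring against different sets of ground-truth boxes, for instance if one of them filters ground-truth entries before matching. Nothing posted so far confirms this, so treat it as a hypothesis to check.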

Thanks!

abhineet123 commented 3 years ago

I can confirm that there is a significant difference. It seems that py-motmetrics penalizes false positives significantly less than the devkit does. I have tested a large number of models on 4 different datasets. Attached are plots comparing the two libraries over 4 metrics: MOTA, IDs, MT and ML. Solid and dotted lines show devkit and py-motmetrics results, respectively.

I've been using frame-by-frame accumulation of results and have tried both iou_matrix and norm2squared_matrix. Following is the actual code:

        import motmetrics as mm

        acc = mm.MOTAccumulator(auto_id=True)
        dist_func = mm.distances.iou_matrix

        # Accumulate matches one frame at a time.
        for frame_id in range(gt.n_frames):
            idx1 = gt.idx[frame_id]
            idx2 = track_res.idx[frame_id]

            # Ground-truth boxes (columns 2:6) and object IDs (column 1).
            if idx1 is not None:
                bbs_1 = gt.data[idx1, 2:6]
                ids_1 = gt.data[idx1, 1]
            else:
                bbs_1 = []
                ids_1 = []

            # Tracker boxes and hypothesis IDs for the same frame.
            if idx2 is not None:
                bbs_2 = track_res.data[idx2, 2:6]
                ids_2 = track_res.data[idx2, 1]
            else:
                bbs_2 = []
                ids_2 = []

            # Pairwise distances between ground-truth and tracker boxes.
            dist = dist_func(bbs_1, bbs_2)
            acc.update(ids_1, ids_2, dist)

        # Compute the MOTChallenge metrics and render them as a text table.
        mh = mm.metrics.create()
        summary = mh.compute(acc, metrics=mm.metrics.motchallenge_metrics, name=seq_name)
        summary = summary.rename(columns=mm.io.motchallenge_metric_names)
        strsummary = mm.io.render_summary(
            summary,
            formatters=mh.formatters
        )

Here, data is a 2D NumPy array (num_boxes x 10) holding the raw GT / tracking data loaded from an MOT-compatible CSV file, and idx is a list of arrays holding the row indices of the boxes in each frame (None means that the frame has no boxes).
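For readers who want to reproduce this layout, the following is a rough pure-Python sketch of how such a per-frame index could be built. It is not the code from my project: load_mot_rows and build_frame_index are hypothetical helpers, the sample rows are made up, and plain lists stand in for the NumPy arrays.

```python
import csv
import io


def load_mot_rows(text):
    """Parse MOT-style CSV text (frame, id, x, y, w, h, ...) into rows of floats."""
    return [[float(v) for v in row] for row in csv.reader(io.StringIO(text)) if row]


def build_frame_index(rows, n_frames):
    """Return idx where idx[f] lists the row indices of frame f + 1, or None if empty."""
    idx = [None] * n_frames
    for i, row in enumerate(rows):
        # MOTChallenge files number frames from 1; mapping frame f to list
        # position f - 1 is an assumption about how the loop above is indexed.
        f = int(row[0]) - 1
        if 0 <= f < n_frames:
            if idx[f] is None:
                idx[f] = []
            idx[f].append(i)
    return idx


# Made-up sample: two boxes in frame 1, none in frame 2, one in frame 3.
sample = (
    "1,1,10,20,30,40,1,-1,-1,-1\n"
    "1,2,50,60,20,20,1,-1,-1,-1\n"
    "3,1,12,22,30,40,1,-1,-1,-1\n"
)
rows = load_mot_rows(sample)
idx = build_frame_index(rows, n_frames=3)

for frame_id, frame_rows in enumerate(idx):
    boxes = [] if frame_rows is None else [rows[i][2:6] for i in frame_rows]
    print(frame_id, boxes)
```

An off-by-one between the two files in this frame mapping would shift every match, so it is worth checking that the GT and tracker indices are built the same way.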

motmetrics_vs_devkit.pdf

cheind commented 3 years ago

Hey @fguney, @abhineet123!

Could you provide an MWE so that we can pinpoint the differences?

abhineet123 commented 3 years ago

Here it is: https://github.com/abhineet123/dmdp_mwe

It can run both py-motmetrics and devkit. Instructions for setting up devkit (if needed) and running are in the readme.

cheind commented 3 years ago

@abhineet123 thanks for the MWE. Unfortunately, I don't have access to MATLAB right now, so, as far as I understand your readme, I cannot run the MWE. Could you provide an MWE in the following form:

a minimal ground-truth / test file (least number of frames with least number of objects that shows a difference between matlab/motmetrics)
a result file that contains the output of matlab?

abhineet123 commented 3 years ago

The MWE can still be run without MATLAB, just not the devkit portion.

Devkit and motmetrics results are respectively in log/mot_metrics_accumulative_devkit.log and log/mot_metrics_accumulative.log.

I've added results over 5% of the frames in a single sequence for both MOT15 and MOT17.

abhineet123 commented 3 years ago

If anyone is still looking for a pure Python version of the MOT metrics, the HOTA metrics code includes all the MOT metrics as well. These do seem to match the official MATLAB version, and they are also much faster to compute: https://github.com/JonathonLuiten/HOTA-metrics

cheind / py-motmetrics

different results from motchallenge devkit #115