JonathonLuiten / TrackEval

HOTA (and other) evaluation metrics for Multi-Object Tracking (MOT).
MIT License

SORT Paper Scores #35

Closed · ghost closed this issue 3 years ago

ghost commented 3 years ago

Hello,

I have been having difficulty reproducing the MOTChallenge scores of the SORT tracker on the MOT15 dataset. I used the det.txt files provided in the paper's git repo (FRCNN detections). I have attached my results in .csv and .txt format. Any help would be appreciated. Thanks.

pedestrian_detailed.csv pedestrian_summary.txt

https://arxiv.org/pdf/1602.00763.pdf
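
For reference, I ran the evaluation through TrackEval's Python API roughly as sketched below (the folder layout and the tracker name 'SORT' are from my own setup, not something TrackEval prescribes, and the exact default config keys may differ):

```python
# Minimal sketch of the evaluation run with TrackEval.
# Assumes gt/tracker files are laid out as in TrackEval's data/ folder,
# and a tracker named 'SORT' (my name, used here for illustration).
import trackeval

eval_config = trackeval.Evaluator.get_default_eval_config()
dataset_config = trackeval.datasets.MotChallenge2DBox.get_default_dataset_config()
dataset_config['BENCHMARK'] = 'MOT15'
dataset_config['SPLIT_TO_EVAL'] = 'train'  # only the train split has public GT
dataset_config['TRACKERS_TO_EVAL'] = ['SORT']

evaluator = trackeval.Evaluator(eval_config)
dataset_list = [trackeval.datasets.MotChallenge2DBox(dataset_config)]
metrics_list = [trackeval.metrics.HOTA(),
                trackeval.metrics.CLEAR(),
                trackeval.metrics.Identity()]
evaluator.evaluate(dataset_list, metrics_list)
```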

Fatih

JonathonLuiten commented 3 years ago

What exactly can I help you with here?

Is there any issue with the evaluation code? Or just that your reimplementation of SORT doesn't achieve identical results to the original paper?

It is common for reimplementations not to achieve exactly the same results, because of the large number of hyper-parameters that need to be set.

Furthermore, it seems as though the results you posted (MOTA = 37.682) are actually BETTER than those from the original paper (MOTA = 34.0), so that is good, isn't it? (Isn't that what you want?)

One reason your results are likely better is that you are probably using a different detector (there are many different versions of FRCNN detections, and not all are the same). I don't know which one the original SORT used, but it is very likely not the same as the one in the provided det.txt file (although I am not sure of this).

Overall though, I don't see a problem that I can help you with here.

Let me know if there is something else that I have missed.

For now, I will close this, but feel free to re-open it if I haven't answered well enough.

ghost commented 3 years ago

Hello again,

Let me clarify my situation: I did not re-implement SORT, I used it as-is. I followed its git repo (https://github.com/abewley/sort) and the paper (https://arxiv.org/pdf/1602.00763.pdf). I did not run an FRCNN detector myself at all; I just used the detections provided in the SORT git repo, as in the sketch below. I have not yet repeated the same evaluation with the deprecated MOT metrics kit (https://github.com/dendorferpatrick/MOTChallengeEvalKit), so I cannot compare those results with your evaluator right now. I hope I am clearer than before. Thank you.
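
Concretely, the tracking loop was essentially the demo loop from the SORT repo itself, roughly like this (default hyper-parameters as in the current sort.py; older versions may differ, and the sequence path is just an example):

```python
# Sketch of the SORT loop from https://github.com/abewley/sort,
# run as-is on the det.txt detections shipped with that repo.
import numpy as np
from sort import Sort

mot_tracker = Sort(max_age=1, min_hits=3, iou_threshold=0.3)  # repo defaults

seq_dets = np.loadtxt('data/train/ADL-Rundle-6/det/det.txt', delimiter=',')
for frame in range(1, int(seq_dets[:, 0].max()) + 1):
    dets = seq_dets[seq_dets[:, 0] == frame, 2:7]  # [x, y, w, h, score]
    dets[:, 2:4] += dets[:, 0:2]                   # to [x1, y1, x2, y2, score]
    tracks = mot_tracker.update(dets)              # rows: [x1, y1, x2, y2, id]
```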

Fatih.

JonathonLuiten commented 3 years ago

Okay. Makes sense, but without more info I can't know why there is a difference to the paper.

It would be interesting if you ran the deprecated version like you said, to see whether it gives the same score as this repo.

If it's the same (or very similar; there are some known, very minor differences), then there is no problem with this repo, and maybe this is an issue for the SORT repo instead??

Jono

ghost commented 3 years ago

I evaluated with the py-motmetrics repo (https://github.com/cheind/py-motmetrics), and the results are pretty close to TrackEval's, except for the MOTP score. I am going to investigate the inconsistency between the paper and the TrackEval results by opening an issue on the SORT repo, as you suggested. Thank you for your guidance, and for the great work of providing this repo to the MOT community.
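
In case it is useful, the cross-check was done roughly like this (a sketch; the file paths are from my own setup):

```python
# Sketch of the py-motmetrics cross-check
# (https://github.com/cheind/py-motmetrics).
import motmetrics as mm

gt = mm.io.loadtxt('MOT15/train/ADL-Rundle-6/gt/gt.txt', fmt='mot15-2D')
ts = mm.io.loadtxt('output/ADL-Rundle-6.txt', fmt='mot15-2D')

# Match tracker output to ground truth with an IoU distance threshold of 0.5.
acc = mm.utils.compare_to_groundtruth(gt, ts, 'iou', distth=0.5)

mh = mm.metrics.create()
summary = mh.compute(acc, metrics=['mota', 'motp', 'num_switches'],
                     name='ADL-Rundle-6')
print(mm.io.render_summary(summary, formatters=mh.formatters,
                           namemap=mm.io.motchallenge_metric_names))
```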

[Screenshot: py-motmetrics summary output, 2021-05-05]

Fatih.

JonathonLuiten commented 3 years ago

Yeah, the MOTP here is defined as 1 - MOTP (for historical reasons).
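
So to line the two numbers up (a one-line sketch; `motp_py_motmetrics` is whatever the other tool printed):

```python
# py-motmetrics reports MOTP as the mean IoU *distance* over matches
# (lower is better); TrackEval reports the mean *similarity* (higher is better).
motp_trackeval = 1.0 - motp_py_motmetrics  # e.g. 1.0 - 0.28 -> 0.72
```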

Cool, I guess this is closed then

ghost commented 3 years ago

Now everything is clear! The results presented in the paper are for the TEST sequences of MOT15, evaluated on the MOT benchmark test server. My results are for the TRAINING sequences. I was comparing apples and oranges and wondering why they were not the same. Sorry for my mistake!

Fatih.

JonathonLuiten commented 3 years ago

Glad to hear it's all cleared up. :)