Hello,

I have been having difficulty reproducing the MOTChallenge scores of the SORT tracker on the MOT15 dataset. I used the det.txt files provided in the git repo of the paper (FRCNN detections). I have attached my results in .csv and .txt format. Any help would be appreciated. Thanks.

pedestrian_detailed.csv pedestrian_summary.txt

https://arxiv.org/pdf/1602.00763.pdf

Fatih
What exactly is it that I can help you with here?
Is there any issue with the evaluation code? Or just that your reimplementation of SORT doesn't achieve identical results to the original paper?
It is common for reimplementations not to achieve exactly the same results, due to the large number of hyper-parameters that need to be set.
Furthermore, it seems as though the results you posted (MOTA = 37.682) are actually BETTER than those from the original paper (MOTA = 34.0), so that is good, isn't it? (Isn't that what you want?)
One reason your results are likely better is that you are probably using a different detector (there are many different versions of FRCNN detections, and not all are the same). I don't know which one the original SORT used, but it is very likely not the same as in the det.txt file provided (although I am not sure of this).
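(For reference, the provided det.txt files follow the standard MOTChallenge detection layout; a minimal sketch for inspecting one with numpy, with the file path as a placeholder:)

```python
import numpy as np

# MOTChallenge det.txt rows:
# frame, id (-1 for detections), bb_left, bb_top, bb_width, bb_height, conf, x, y, z
dets = np.loadtxt('det.txt', delimiter=',')  # placeholder path to one sequence's file

frame_1 = dets[dets[:, 0] == 1]   # all detections in frame 1
print(frame_1[:, 2:7])            # bb_left, bb_top, bb_width, bb_height, conf
```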
Overall though, I don't see a problem that I can help you with here.
Let me know if there is something else that I have missed.
For now, I will close this, but feel free to re-open it if I haven't answered well enough.
Hello again,
Let me clarify my situation: I did not re-implement SORT, I used it as is, following the official repo (https://github.com/abewley/sort) and paper (https://arxiv.org/pdf/1602.00763.pdf). I did not run an FRCNN detector myself; I just used the detections provided in the SORT repo. I have not yet repeated the same evaluation with the deprecated MOT metrics calculator (https://github.com/dendorferpatrick/MOTChallengeEvalKit), so I cannot compare its results with your evaluator's at the moment. I hope this is clearer than before. Thank you.
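(For completeness, this is roughly how I drive the tracker, following the usage shown in the SORT README; the detection values here are made up for illustration:)

```python
import numpy as np
from sort import Sort  # sort.py from https://github.com/abewley/sort

mot_tracker = Sort()  # default parameters from the repo

# Per frame: detections as rows of [x1, y1, x2, y2, score],
# parsed from the det.txt files shipped with the repo.
detections = np.array([[100.0, 80.0, 160.0, 200.0, 0.9]])
tracks = mot_tracker.update(detections)  # rows: [x1, y1, x2, y2, track_id]
```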
Fatih.
Okay, that makes sense, but without more info I can't know why there is a difference from the paper.
It would be interesting if you ran the deprecated version like you said, to see whether it gives the same score as this repo.
If it's the same (or very similar, there are some known very minor differences), then there is no problem with this repo, and maybe this is an issue for the SORT repo instead??
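(In case it helps with the comparison: a rough sketch of how the evaluation in this repo can be driven from Python, assuming the default MOT15 data layout; the tracker folder name is a placeholder:)

```python
import trackeval  # https://github.com/JonathonLuiten/TrackEval

eval_config = trackeval.Evaluator.get_default_eval_config()
dataset_config = trackeval.datasets.MotChallenge2DBox.get_default_dataset_config()
dataset_config['BENCHMARK'] = 'MOT15'
dataset_config['SPLIT_TO_EVAL'] = 'train'
dataset_config['TRACKERS_TO_EVAL'] = ['SORT']  # placeholder folder name

evaluator = trackeval.Evaluator(eval_config)
dataset_list = [trackeval.datasets.MotChallenge2DBox(dataset_config)]
metrics_list = [trackeval.metrics.HOTA(), trackeval.metrics.CLEAR(), trackeval.metrics.Identity()]
evaluator.evaluate(dataset_list, metrics_list)
```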
Jono
I evaluated with the py-motmetrics repo (https://github.com/cheind/py-motmetrics) and the results are pretty close to those of HOTA TrackEval, except for the MOTP score. I am going to investigate the inconsistency between the paper and the TrackEval results by opening an issue on the SORT repo, as you suggested. Thank you for your guidance, and for the great work of providing this repo to the MOT community.
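(For reference, this is roughly how I computed the scores; a minimal per-frame sketch with toy boxes standing in for the real data:)

```python
import motmetrics as mm
import numpy as np

acc = mm.MOTAccumulator(auto_id=True)

# One toy frame: ground-truth and tracker boxes as [x, y, w, h] rows.
gt_ids, gt_boxes = [1, 2], np.array([[10, 10, 20, 40], [60, 10, 20, 40]])
trk_ids, trk_boxes = [1, 2], np.array([[12, 11, 20, 40], [61, 12, 20, 40]])

# IoU distance matrix (1 - IoU); pairs with IoU below 0.5 count as no-match.
dists = mm.distances.iou_matrix(gt_boxes, trk_boxes, max_iou=0.5)
acc.update(gt_ids, trk_ids, dists)  # call once per frame in a real run

mh = mm.metrics.create()
summary = mh.compute(acc, metrics=['mota', 'motp', 'num_switches'], name='SORT')
print(mm.io.render_summary(summary))
```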
Fatih.
Yeah the MOTP is defined here as 1-MOTP (for historical reasons).
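(In other words, py-motmetrics reports MOTP as an average matching distance, while this repo reports it as an average similarity; a rough conversion sketch, assuming IoU-based matching in both tools:)

```python
def to_trackeval_motp(motp_pymotmetrics: float) -> float:
    """Convert py-motmetrics MOTP (mean IoU distance over matches, lower is
    better) to the TrackEval convention (mean IoU similarity, higher is better)."""
    return 1.0 - motp_pymotmetrics

print(to_trackeval_motp(0.28))  # distance 0.28 -> similarity 0.72
```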
Cool, I guess this is closed then.
Now everything is clear! The results presented in the paper are for the TEST sequences of MOT15, evaluated on the MOT benchmark test server. My results are for the TRAINING sequences. I was comparing apples and oranges and wondering why they were not the same. Sorry for my mistake!
Fatih.
Glad to hear it's all cleared up. :)