Cannot reproduce the tracking results in the paper

YuliangXiu commented 7 years ago

I tried to reproduce the tracking results from the paper on the PoseTrack dataset (including the 30 test videos), and ended up with these results:

my results: 51.4 61.4 653 757 493 6080 18.6 55.3
paper results: 63.0 64.8 775 502 431 5629 28.2 55.7
metrics: Rcll+ Prcn+ MT+ ML- IDs- FM- MOTA+ MOTP+

My setting is (head, neck, shoulder, frame=3).

I also tested the pose estimation results, and the ones I reproduced are slightly better than yours. The gap in the tracking test, however, is quite obvious, which is confusing and strange.

Were any other tricks used?

umariqb commented 7 years ago

The numbers are too low to be correct. Could you please share your result files? Do you have any logs?
There are no extra tricks. If you are using the provided functions to evaluate the results, that is all you need.
Do the poses look good? When you say the pose estimation results are better, you mean the poses obtained from our code, right?

umariqb commented 7 years ago

Hi, did you manage to get it running?

YuliangXiu commented 7 years ago

Sorry for the late reply. I reran the demo_posetrack.m script but still got the bad results. I found what may be an error in the original dataset: annolist(26).annopoints{11, 22} and annolist(26).annopoints{11, 23} both have no point field, so I added if(isfield(ann, 'point')) to the pt_convert_mpii_format.m file. I think this is only a small bug, though, and it is not responsible for the large gap between my results and yours.

I am still running the pt-solver. When I get the final mat file, I will send it to you so you can test it with your scripts.

YuliangXiu commented 7 years ago

I also made this modification

YuliangXiu commented 7 years ago

1-5: 57.4 & 58.8 & 277 & 82 & 147 & 48 & 165 & 1305 & 16.7 & 57.8 &
6-10: 49.1 & 63.8 & 364 & 79 & 146 & 139 & 39 & 720 & 20.9 & 51.7 &
11-15: 68.1 & 69.4 & 343 & 139 & 160 & 44 & 77 & 748 & 37.4 & 56.8 &
16-20: 44.0 & 58.8 & 457 & 103 & 177 & 177 & 21 & 765 & 13.0 & 53.9 &
21-25: 47.4 & 59.4 & 530 & 118 & 268 & 144 & 81 & 1377 & 14.6 & 52.6 &
26-30: 44.6 & 62.6 & 545 & 132 & 208 & 205 & 110 & 1165 & 17.4 & 55.8 &

I tested the 30 videos separately; the above are the pose_tracking results.
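As a quick sanity check on per-group rows like these, the count columns can be summed and compared against a full run. The sketch below assumes the ten columns are Rcll, Prcn, GT, MT, PT, ML, IDs, FM, MOTA, MOTP. The GT and PT positions are an inference from the fact that MT + PT + ML equals the third column in every row; they are not stated in the thread. `parse_row` and `sum_counts` are illustrative helper names, not functions from the repository.

```python
# Per-group rows copied from the comment above (videos 1-5, 6-10, ...).
ROWS = {
    "1-5": "57.4 & 58.8 & 277 & 82 & 147 & 48 & 165 & 1305 & 16.7 & 57.8 &",
    "6-10": "49.1 & 63.8 & 364 & 79 & 146 & 139 & 39 & 720 & 20.9 & 51.7 &",
    "11-15": "68.1 & 69.4 & 343 & 139 & 160 & 44 & 77 & 748 & 37.4 & 56.8 &",
    "16-20": "44.0 & 58.8 & 457 & 103 & 177 & 177 & 21 & 765 & 13.0 & 53.9 &",
    "21-25": "47.4 & 59.4 & 530 & 118 & 268 & 144 & 81 & 1377 & 14.6 & 52.6 &",
    "26-30": "44.6 & 62.6 & 545 & 132 & 208 & 205 & 110 & 1165 & 17.4 & 55.8 &",
}

# Assumed column order (the GT and PT positions are inferred, not documented).
COLUMNS = ["Rcll", "Prcn", "GT", "MT", "PT", "ML", "IDs", "FM", "MOTA", "MOTP"]
COUNT_COLUMNS = ["GT", "MT", "PT", "ML", "IDs", "FM"]


def parse_row(row):
    """Split an '&'-separated row into a {column: float} dict."""
    values = [float(v) for v in row.split("&") if v.strip()]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    return dict(zip(COLUMNS, values))


def sum_counts(rows):
    """Sum the count columns over all groups (percentage columns are skipped)."""
    totals = {name: 0 for name in COUNT_COLUMNS}
    for row in rows.values():
        parsed = parse_row(row)
        # Every track is mostly tracked, partially tracked, or mostly lost.
        assert parsed["MT"] + parsed["PT"] + parsed["ML"] == parsed["GT"]
        for name in COUNT_COLUMNS:
            totals[name] += int(parsed[name])
    return totals


if __name__ == "__main__":
    print(sum_counts(ROWS))
    # {'GT': 2516, 'MT': 653, 'PT': 1106, 'ML': 757, 'IDs': 493, 'FM': 6080}
```

The summed MT, ML, IDs, and FM values (653, 757, 493, 6080) match the overall row reported at the top of the thread, which suggests the per-group runs and the full run agree with each other.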

umariqb commented 7 years ago

Have you checked that the solver calls are always completed correctly? If you don't have enough memory, the solver can crash sometimes.

umariqb commented 7 years ago

OK, I am also trying to reproduce the results from scratch. Will keep you posted!

YuliangXiu commented 7 years ago

The solver works well. It takes quite a long time but did not crash, and I think my computer has enough memory. I also evaluated the final results with RMPE and OpenPose (each replacing the DeeperCut module). The results are as follows (the tracking implementation is naive; I am still learning the weights of the box score and the DeepMatching ratio):

rmpe: 50.0 53.2 609 1142 765 1154 7588 5.0 45.4
posetrack: 51.4 61.4 653 1106 757 493 6080 18.6 55.3
openpose: 48.8 59.1 615 1083 818 5275 6413 10.2 50.8

The results seem close in precision and recall, so I guess the gap may be caused by the dataset. Maybe you can check it later.
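To see how recall and precision feed into MOTA, a small worked relation may help. The sketch assumes the standard CLEAR MOT definitions (Rcll = TP/GT, Prcn = TP/(TP + FP), MOTA = 1 - (FN + FP + IDSW)/GT, with GT counting ground-truth detections). Whether the repository's evaluation code aggregates in exactly this way (for instance per joint) is not confirmed here, and `mota_without_id_switches` is a hypothetical helper name. Substituting FN = (1 - Rcll)·GT and FP = Rcll·GT·(1/Prcn - 1) gives MOTA = Rcll·(2 - 1/Prcn) - IDSW/GT, so the first term is the MOTA that would result with zero identity switches.

```python
def mota_without_id_switches(recall_pct, precision_pct):
    """MOTA (in percent) implied by recall and precision alone.

    Derived from MOTA = 1 - (FN + FP + IDSW) / GT with IDSW set to 0:
    FN / GT = 1 - R and FP / GT = R * (1 / P - 1), hence R * (2 - 1 / P).
    """
    r = recall_pct / 100.0
    p = precision_pct / 100.0
    if not (0.0 <= r <= 1.0 and 0.0 < p <= 1.0):
        raise ValueError("recall must be in [0, 100] and precision in (0, 100]")
    return 100.0 * r * (2.0 - 1.0 / p)


if __name__ == "__main__":
    # Rcll / Prcn pairs quoted in this thread.
    for name, rcll, prcn in [
        ("posetrack (reproduced)", 51.4, 61.4),
        ("paper", 63.0, 64.8),
    ]:
        print(f"{name}: {mota_without_id_switches(rcll, prcn):.1f}")
```

Under these assumptions the two pairs give roughly 19.1 and 28.8, close to the reported MOTA values of 18.6 and 28.2. That would suggest most of the MOTA gap follows from the recall and precision gap rather than from identity switches.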

YuliangXiu commented 7 years ago

Here is my version of the repo: https://github.com/YuliangXiu/PoseTrack-CVPR2017 (see the commit log for my modifications).

umariqb commented 7 years ago

Can you also visualize the results and share them? Here is the code: save_qualitative_results.m.tar.gz

umariqb commented 7 years ago

The solver and the pose estimation part seem to work fine. I will now look into the evaluation code.

YuliangXiu commented 7 years ago

Here are the visualization results: posetrack visualization results. I reran the whole project and still got the same results as before.

umariqb commented 7 years ago

Thanks a lot for the qualitative results. The results look good. I will try to look into the evaluation code soon.

umariqb commented 7 years ago

Hi, you were right. There was some issue with the annolist. I have updated the provided annotations. Please check if it works for you.

YuliangXiu commented 7 years ago

I replaced the annotation files with the new ones, and here are the new pose tracking results:
50.9 & 60.8 & 2516 & 640 & 1106 & 770 & 489 & 6109 & 17.6 & 55.1 &
This is slightly worse than before, which was:
51.4 61.4 653 1106 757 493 6080 18.6 55.3
The new pose estimation results are very close to yours:
54.9 & 52.0 & 42.3 & 32.2 & 23.9 & 31.5 & 31.8 & 38.4

Could different computer hardware lead to different LP results from Gurobi?

antonmil commented 7 years ago

That may indeed be a possible cause. I remember I had a similar issue running gradient descent and getting different results on different hardware. Super annoying.

umariqb commented 7 years ago

But the difference is huge. I think a small difference can be attributed to different computers, but not this much. Let me run everything on a different computer though.

umariqb commented 7 years ago

@YuliangXiu you are right. There were some parameters that were not exactly the same as mentioned in the paper. I have made some minor changes, and using a completely different computer I could now reproduce the following numbers:

pt_eval_pose_tracking()
61.5 & 65.2 & 2293 & 731 & 1035 & 527 & 382 & 5742 & 28.2 & 55.7 &

Please check out the latest version and rerun the code. You don't need to extract deepmatching and scoremaps again, but please delete the existing detections in the experiment folder.

YuliangXiu commented 7 years ago

Here are my final pose tracking results:
66.1 & 62.4 & 2293 & 870 & 975 & 448 & 412 & 5369 & 25.7 & 55.2 &
Here are my final mAP results:
54.9 & 52.0 & 42.3 & 32.2 & 23.9 & 31.5 & 31.8 & 38.4

I think the implementation is correct; the slight difference may be caused by different parameter settings and different hardware.
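To make the remaining difference concrete, the two final rows quoted in this thread can be compared column by column. This is a minimal sketch: the column names GT and PT are inferred (MT + PT + ML equals the third column) rather than stated in the thread, and `column_deltas` is an illustrative helper, not part of the repository.

```python
# Assumed column order; the GT and PT positions are inferred, not documented.
COLUMNS = ["Rcll", "Prcn", "GT", "MT", "PT", "ML", "IDs", "FM", "MOTA", "MOTP"]

# Final rows quoted in this thread.
AUTHOR_ROW = "61.5 & 65.2 & 2293 & 731 & 1035 & 527 & 382 & 5742 & 28.2 & 55.7 &"
REPRODUCED_ROW = "66.1 & 62.4 & 2293 & 870 & 975 & 448 & 412 & 5369 & 25.7 & 55.2 &"


def parse_values(row):
    """Turn an '&'-separated row into a list of floats, ignoring empty cells."""
    return [float(v) for v in row.split("&") if v.strip()]


def column_deltas(reference, other):
    """Return {column: other - reference}, rounded to one decimal place."""
    ref, oth = parse_values(reference), parse_values(other)
    if len(ref) != len(COLUMNS) or len(oth) != len(COLUMNS):
        raise ValueError(f"both rows must have {len(COLUMNS)} values")
    return {name: round(o - r, 1) for name, r, o in zip(COLUMNS, ref, oth)}


if __name__ == "__main__":
    for name, delta in column_deltas(AUTHOR_ROW, REPRODUCED_ROW).items():
        print(f"{name:>5}: {delta:+.1f}")
```

With these two rows the reproduced run has higher recall (+4.6) but lower precision (-2.8) and lower MOTA (-2.5), while the GT column is identical, so both runs appear to be evaluated against the same set of ground-truth tracks.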

Thank you so much for your patience. I will do some related work building on yours.

umariqb commented 7 years ago

It seems like you are using different params in pt_eval_tracking. But yes, overall the results look much better than before. I would still not attribute the difference to different computers. Have you checked out the latest version?

umariqb commented 7 years ago

Closing since the problem is resolved.

umariqb / PoseTrack-CVPR2017

Cannot reproduce the tracking results in the paper #7