j96w / DenseFusion

"DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion" code repository
https://sites.google.com/view/densefusion
MIT License

Potential leaky information was used in eval on the Linemod dataset #45

Closed flowtcw closed 5 years ago

flowtcw commented 5 years ago

In eval_linemod.py, the code still uses `rmin, rmax, cmin, cmax = get_bbox(meta['obj_bb'])` to get the crop bounds, and that crop is an important step in the pipeline. As I understand it, gt.yaml is the ground truth for the objects, and obj_bb is the ground-truth 2D bounding box. I'm not sure whether the code is right here. Maybe I'm wrong. Thank you =。=
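For reference, a minimal reconstruction of the crop being questioned; `get_bbox` here is a simplified stand-in for the repo's helper (which also pads and clamps the box), and the `[x, y, width, height]` layout of obj_bb is an assumption based on the Linemod format:

```python
import numpy as np

def get_bbox(obj_bb):
    # Simplified stand-in for the repo's get_bbox; assumes gt.yaml stores
    # obj_bb as [x, y, width, height]. The real helper also pads the box
    # and clamps it to the image border.
    x, y, w, h = obj_bb
    return y, y + h, x, x + w  # rmin, rmax, cmin, cmax

img = np.zeros((480, 640, 3), dtype=np.uint8)   # dummy Linemod-sized frame
rmin, rmax, cmin, cmax = get_bbox([100, 120, 80, 60])
img_masked = img[rmin:rmax, cmin:cmax]          # crop guided by ground truth
print(img_masked.shape)                         # (60, 80, 3)
```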

j96w commented 5 years ago

We can't use the ground-truth info during the evaluation. The segmentation masks are the output of the trained SegNet, so we have to generate the bbox from the segment. Even when a poor segment leads to a poor bbox that fails the pose estimation, we still need to count that failure in the overall score, as we did on both datasets in this work.
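A minimal sketch of that idea, assuming the SegNet output is a binary mask the same size as the frame (the helper name is illustrative, not the repo's):

```python
import numpy as np

def bbox_from_mask(mask):
    """Derive (rmin, rmax, cmin, cmax) from a binary segmentation mask,
    instead of reading the ground-truth obj_bb from gt.yaml."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        # Empty segment: the detection failed; as noted above, the pose
        # built on it should still be scored as a failure.
        return None
    return rows.min(), rows.max() + 1, cols.min(), cols.max() + 1

# Toy example: a 6x6 mask with a blob covering rows 2-3, cols 1-4
mask = np.zeros((6, 6), dtype=bool)
mask[2:4, 1:5] = True
print(bbox_from_mask(mask))  # (2, 4, 1, 5)
```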

hygxy commented 5 years ago

@j96w So you mean the bbox in gt.yaml is actually generated from segnet_results? If that's the case, why do you use the same bbox for training? I'm especially confused about line 122 in dataset.py: the RGB image is cropped into image_masked using the bbox from gt.yaml, both for training and evaluation. IMO they should be treated differently, am I right?
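For what it's worth, a hedged sketch of the separation being asked about; the mode flag and function are illustrative, not how dataset.py is actually organized (`get_bbox` and `bbox_from_mask` as in the sketches above):

```python
def select_bbox(mode, meta, segnet_mask):
    # Training: the ground-truth box from gt.yaml is fine, since ground
    # truth is available at train time anyway.
    if mode == 'train':
        return get_bbox(meta['obj_bb'])
    # Evaluation: the box must come from the predicted segmentation, or
    # ground-truth information leaks into the test pipeline.
    return bbox_from_mask(segnet_mask)
```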

flowtcw commented 5 years ago

> @j96w So you mean the bbox in gt.yaml is actually generated from segnet_results? If that's the case, why do you use the same bbox for training? […]

I agree with you; the key problem is the bbox, not the mask. I have seen a lot of papers on 6D pose estimation, and none of them uses the ground-truth bbox as input. If you test on the Occlusion Linemod dataset, you may find that in some images detecting the objects is very hard. That's why I think potential leaky information was used in eval.

j96w commented 5 years ago

Hi, I finally got time to look into this issue. Yes, you are right: we should separate the bbox selection between training and evaluation. I have just pushed an update to the repo. After fixing the bug, the result (iterative) I get is:

Object 1 success rate: 0.9084842707340324
Object 2 success rate: 0.9233753637245393
Object 4 success rate: 0.946078431372549
Object 5 success rate: 0.9232283464566929
Object 6 success rate: 0.9580838323353293
Object 8 success rate: 0.8424182358771061
Object 9 success rate: 0.9117370892018779
Object 10 success rate: 0.9981185324553151
Object 11 success rate: 0.9980694980694981
Object 12 success rate: 0.8934348239771646
Object 13 success rate: 0.975485188968335
Object 14 success rate: 0.9577735124760077
Object 15 success rate: 0.9250720461095101
ALL success rate: 0.9354670247687258

We will update this result in the next arXiv version. Thanks~
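For readers unfamiliar with the metric: on Linemod a pose is commonly counted as a success when the ADD error is below 10% of the object diameter (with ADD-S, using closest-point distances, for the symmetric objects); I'm assuming that is the criterion behind these per-object rates. A minimal sketch of the ADD variant:

```python
import numpy as np

def add_success(pred_R, pred_t, gt_R, gt_t, model_points, diameter):
    """ADD criterion: success if the mean distance between model points
    under the predicted and ground-truth poses is < 10% of the diameter."""
    pred = model_points @ pred_R.T + pred_t
    gt = model_points @ gt_R.T + gt_t
    add = np.mean(np.linalg.norm(pred - gt, axis=1))
    return add < 0.1 * diameter

# Toy check: a 5 mm translation error on a 100 mm-diameter object passes
pts = np.random.rand(500, 3) * 0.1
ok = add_success(np.eye(3), np.array([0.005, 0.0, 0.0]),
                 np.eye(3), np.zeros(3), pts, diameter=0.1)
print(ok)  # True (0.005 < 0.01)
```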

flowtcw commented 5 years ago

> Hi, I finally got time to look into this issue. Yes, you are right: we should separate the bbox selection between training and evaluation. I have just pushed an update to the repo. […]

Thank you very much for doing this on your busy schedule. I'd like to know which dataset you tested on, so I can gauge the impact of these changes.

j96w commented 5 years ago

The changes are for Linemod. The evaluation of YCB was originally correct.

j96w commented 5 years ago

Hi, after adding two more iterations of refinement, the current testing result on Linemod after fixing the bbox bug is:

Object 1 success rate: 0.9285033365109628
Object 2 success rate: 0.944713870029098
Object 4 success rate: 0.9725490196078431
Object 5 success rate: 0.9409448818897638
Object 6 success rate: 0.9650698602794411
Object 8 success rate: 0.8741328047571854
Object 9 success rate: 0.9305164319248826
Object 10 success rate: 0.9971777986829727
Object 11 success rate: 0.9980694980694981
Object 12 success rate: 0.9248334919124643
Object 13 success rate: 0.9816138917262512
Object 14 success rate: 0.9692898272552783
Object 15 success rate: 0.9606147934678194
ALL success rate: 0.9529245001492092

This result is even higher than the 94.3% (overall success rate) reported in the paper, so nothing in the paper needs to be modified. Thank you all.
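On the refinement step: my understanding is that the refiner predicts a small corrective pose each pass, and the corrections are composed onto the running estimate. A toy sketch under that assumption (the predict_delta callable stands in for the trained refiner network; its signature is hypothetical):

```python
import numpy as np

def refine_pose(R, t, predict_delta, iterations=4):
    """Compose per-iteration corrective poses (dR, dt) onto the estimate."""
    for _ in range(iterations):
        dR, dt = predict_delta(R, t)  # stand-in for the refiner network
        R = dR @ R
        t = dR @ t + dt
    return R, t

# Toy usage: a "refiner" that halves the remaining translation error
target_t = np.array([0.0, 0.0, 1.0])
dummy = lambda R, t: (np.eye(3), 0.5 * (target_t - t))
R, t = refine_pose(np.eye(3), np.zeros(3), dummy, iterations=6)
print(t)  # close to target_t after a few iterations
```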

flowtcw commented 5 years ago

> Hi, after adding two more iterations of refinement, the current testing result on Linemod after fixing the bbox bug is: […] This result is even higher than the 94.3% (overall success rate) reported in the paper, so nothing in the paper needs to be modified.

I think that's a pleasant surprise. :)