liruilong940607 / Pose2Seg

Code for the paper "Pose2Seg: Detection Free Human Instance Segmentation" @ CVPR2019.
http://www.liruilong.cn/projects/pose2seg/index.html
MIT License

Got wrong test results about AP(area=medium) #21

Closed IreneLu12 closed 5 years ago

IreneLu12 commented 5 years ago

Hi, I just cloned the code and installed cocoAPI from GitHub, but I got wrong test results for AP (area=medium) when I ran test.py on the OCHuman dataset following your instructions. The model is 'pose2seg_release.pkl' and the test results are as follows.

Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.573
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.945
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.637
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.073
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.580
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.422
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.682
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.682
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.550
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.682
[POSE2SEG] AP|.5|.75| S| M| L| AR|.5|.75| S| M| L| [segm_score]
OCHumanVal 0.573 0.945 0.637 -1.000 0.073 0.580 0.422 0.682 0.682 -1.000 0.550 0.682

Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.547
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.937
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.582
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.064
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.549
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.379
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.649
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.649
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.300
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.650
[POSE2SEG] AP|.5|.75| S| M| L| AR|.5|.75| S| M| L| [segm_score]
OCHumanTest 0.547 0.937 0.582 -1.000 0.064 0.549 0.379 0.649 0.649 -1.000 0.300 0.650

Why is the AP (area=medium) extremely low?
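For context, pycocotools' COCOeval splits results by ground-truth mask area: roughly, "small" is below 32² pixels, "medium" is 32² to 96², and "large" is above 96² (a value of -1.000 means no ground-truth instances fell in that range, which is why area=small reports -1.000 here). A minimal sketch of that bucketing, with a helper name of my own (not a pycocotools function):

```python
# Approximate area ranges used by COCOeval's small/medium/large breakdown:
# small: [0, 32**2), medium: [32**2, 96**2), large: [96**2, inf)
SMALL_MAX = 32 ** 2   # 1024 px
MEDIUM_MAX = 96 ** 2  # 9216 px

def area_bucket(mask_area: float) -> str:
    """Classify a ground-truth mask area into the COCOeval-style
    size bucket (helper name is hypothetical, boundaries approximate)."""
    if mask_area < SMALL_MAX:
        return "small"
    if mask_area < MEDIUM_MAX:
        return "medium"
    return "large"

print(area_bucket(500))     # a tiny mask -> small
print(area_bucket(5000))    # -> medium
print(area_bucket(20000))   # -> large
```

Since OCHuman contains mostly large, heavily occluded people, only a handful of instances land in the medium bucket, so AP there is computed over very few samples and can look extreme.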

I also have another question about the OCHuman dataset. I updated the dataset to the latest version, but the validation set contains 4291 instances and the test set contains 3819 instances, which is still inconsistent with the numbers reported in the paper. Is there a problem with my dataset?
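For reference, this is how I counted the instances. OCHuman ships COCO-style JSON, so the instance count is simply the length of the "annotations" list (the file paths in the comments are assumptions based on my local copy and may differ in your release):

```python
import json

def count_instances(ann_file: str) -> tuple:
    """Return (num_images, num_annotations) for a COCO-format JSON file."""
    with open(ann_file) as f:
        data = json.load(f)
    return len(data["images"]), len(data["annotations"])

# Example (paths are assumptions; adjust to your download):
# print(count_instances("OCHuman/ochuman_coco_format_val_range_0.00_1.00.json"))
# print(count_instances("OCHuman/ochuman_coco_format_test_range_0.00_1.00.json"))
```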