FrancescoPiemontese opened this issue 5 years ago
I think the author used the detection information from the dataset (the 'center' and 'scale' fields in the MPII dataset json file).
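For reference, a minimal sketch of how a person box can be recovered from those annotation fields, assuming the usual MPII convention that `scale` is the person height divided by 200 px (the conversion and the sample values below are illustrative, not code from this repo):

```python
# Illustrative only: derive an approximate person bounding box from an
# MPII-style 'center'/'scale' annotation, assuming scale = person_height / 200.
def center_scale_to_box(center, scale, pixel_std=200.0):
    h = scale * pixel_std           # approximate person height in pixels
    w = h                           # use a square box around the person center
    x1 = center[0] - w / 2.0
    y1 = center[1] - h / 2.0
    return [x1, y1, w, h]           # [x, y, width, height]

# Example usage with a hypothetical annotation entry
ann = {"center": [594.0, 257.0], "scale": 3.021}
print(center_scale_to_box(ann["center"], ann["scale"]))
```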
@FrancescoPiemontese maybe you can refer to my hrnet fork; I integrated YOLO human detection.
Thank you! I will try
@lxy5513, would you consider making a PR to this repo?
@leoxiaobin Yes, soon. I will add several human detectors, like R-FCN and RetinaNet, then open a PR and describe their speed.
@leoxiaobin I am preparing to implement this tracking based on your Simple Baselines model.
Paper description:
> For the frame being processed in a video, the boxes from a human detector and the boxes generated by propagating joints from previous frames using optical flow are unified using a bounding box Non-Maximum Suppression (NMS) operation.
I have two groups of boxes, but I don't know how to do NMS on them, because the boxes generated by FlowNet2-S have no confidence scores. Can I simply reuse the scores of the corresponding boxes from the previous frame? Could you advise on this problem? Thank you in advance.
We actually use the OKS score for NMS.
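For anyone following along, here is a minimal sketch of what OKS-based NMS typically looks like (an illustration of the general technique, not the exact code used in this repo; the keypoint layout and the threshold value are assumptions):

```python
import numpy as np

# Standard COCO per-keypoint falloff constants (as used in pycocotools)
SIGMAS = np.array([.26, .25, .25, .35, .35, .79, .79, .72, .72,
                   .62, .62, 1.07, 1.07, .87, .87, .89, .89]) / 10.0

def compute_oks(g, d, area):
    """Object Keypoint Similarity between two poses of shape (17, 3): x, y, visibility."""
    vars_ = (SIGMAS * 2) ** 2
    dx, dy = g[:, 0] - d[:, 0], g[:, 1] - d[:, 1]
    e = (dx ** 2 + dy ** 2) / vars_ / (area + np.spacing(1)) / 2.0
    vis = g[:, 2] > 0
    return float(np.mean(np.exp(-e[vis]))) if vis.any() else 0.0

def oks_nms(poses, scores, areas, thresh=0.9):
    """Greedy NMS: drop poses whose OKS with a higher-scoring kept pose exceeds thresh."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        oks = np.array([compute_oks(poses[i], poses[j], areas[i]) for j in order[1:]])
        order = order[1:][oks <= thresh]
    return keep
```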
Thanks
@leoxiaobin Hi, I made a PR for yolov3-HRNet, but something is weird. I evaluated in two ways.
ONE: I get dt_boxes from YOLO, then run `python tools/test.py TEST.USE_GT_BBOX False TEST.FLIP_TEST False` without OKS NMS, and get the following result:
Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
---|---|---|---|---|---|---|---|---|---|---|
pose_hrnet | 0.702 | 0.859 | 0.770 | 0.653 | 0.779 | 0.736 | 0.878 | 0.794 | 0.683 | 0.813 |
TWO: I run the two models end-to-end (the same models as in ONE), get the keypoints, save them into a json file, and finally get the result with the official cocoEval.evaluate(), as follows:
Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
---|---|---|---|---|---|---|---|---|---|---|
hrnet | 0.594 | 0.811 | 0.656 | 0.564 | 0.651 | 0.647 | 0.834 | 0.704 | 0.601 | 0.713 |
Could you please tell me why the two results are so different? Thank you in advance.
This is my script that generates the keypoints json file: https://github.com/lxy5513/hrnet/blob/master/tools/eval.py. By the way, my YOLOv3 threshold is 0.1.
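For reference, a minimal sketch of the official COCO keypoint evaluation path mentioned above, using pycocotools (the file names are placeholders):

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Placeholder paths: the COCO keypoint ground truth and the saved predictions
coco_gt = COCO('annotations/person_keypoints_val2017.json')
coco_dt = coco_gt.loadRes('keypoint_results.json')  # list of {image_id, category_id, keypoints, score}

coco_eval = COCOeval(coco_gt, coco_dt, 'keypoints')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints AP / AR at the thresholds shown in the tables above
```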
I had a very quick look through your code. I have two questions.
It seems that you do not convert the image channels to RGB. OpenCV reads images in BGR order, but our models are trained with RGB images. So you need to first convert your image data to RGB, like line 131 at https://github.com/leoxiaobin/deep-high-resolution-net.pytorch/blob/master/lib/dataset/JointsDataset.py#L131.
Are the thresholds for both methods the same?
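For reference, the conversion from the first point is a one-liner with OpenCV (a minimal sketch; the variable and file names here are illustrative):

```python
import cv2

# OpenCV loads images in BGR order; the pose model expects RGB input
data_numpy = cv2.imread('example.jpg', cv2.IMREAD_COLOR)
data_numpy = cv2.cvtColor(data_numpy, cv2.COLOR_BGR2RGB)
```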
I greatly appreciate your attention. This is my channel conversion code: https://github.com/lxy5513/hrnet/blob/master/tools/eval.py#L142.
This is the relevant threshold code; it is the same for both methods: https://github.com/lxy5513/hrnet/blob/master/tools/eval.py#L159
By the way, I used YOLOv3 + the Simple Baselines pose model to test the PR, and it seems normal, as follows:
Arch | AP | AP .5 | AP .75 | AP (M) | AP (L) | AR | AR .5 | AR .75 | AR (M) | AR (L) |
---|---|---|---|---|---|---|---|---|---|---|
Simple-baseline | 0.648 | 0.856 | 0.708 | 0.617 | 0.706 | 0.697 | 0.880 | 0.750 | 0.652 | 0.763 |
I would say this issue can be closed with #161 being merged
I also get 0.702 (as in the table above), but the reported result for w32_256x192 is 0.744. Why? I just ran the evaluation code with the trained model pose_hrnet_w32_256x192.pth. Can you help me?
First of all, thank you for your excellent work. I have a question regarding person detection. In your paper it is mentioned that you use a person detector before feeding its output to HRNet. Am I supposed to download this separately and then feed its output to HRNet? If so, what do the dataloaders in train.py and test.py do? Would it be possible for you to tell me which person detector was used?
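Not an answer from the authors, but for context: when ground-truth boxes are not used, the test configuration points at a person-detection results file in the standard COCO detection format, so any detector whose output is dumped that way can be plugged in. A minimal sketch of what such a file looks like (the entries below are made-up values):

```python
import json

# Made-up example entries in the standard COCO detection-results format:
# one dict per detected person, bbox given as [x, y, width, height].
person_detections = [
    {"image_id": 397133, "category_id": 1,
     "bbox": [388.7, 69.9, 109.4, 277.6], "score": 0.99},
    {"image_id": 397133, "category_id": 1,
     "bbox": [0.0, 13.5, 434.2, 388.8], "score": 0.35},
]

with open('person_detections.json', 'w') as f:
    json.dump(person_detections, f)
```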