Res2Net / Res2Net-Pose-Estimation

Res2Net for Pose Estimation using Simple Baselines as the baseline
https://mmcheng.net/res2net/

The results reported in the paper are different from the ones in here. #2

Open ArchNew opened 4 years ago

ArchNew commented 4 years ago

In the paper, your pose estimation model is trained and validated on top of Simple Baselines and uses the same human detector results that Simple Baselines provides. However, the results you report here, which you compare against Simple Baselines, are evaluated on ground-truth bounding boxes, not on the boxes from the human detector. The Simple Baselines results you compare with in the paper, on the other hand, are evaluated with the human detector, not with ground-truth bounding boxes.

Since the results you reported in the paper are on par with HRNet's, I thought that high-resolution feature maps were not that important after all. Unfortunately, when I tried to build on Simple Baselines modified with the Res2Net module your paper proposes, I found that your model is nowhere near HRNet.
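The two evaluation settings contrasted above differ only in where the person boxes come from, so the resulting numbers are not directly comparable. A minimal sketch of that switch follows; the function name `select_person_boxes`, the dict layout, and the `score_threshold` parameter are hypothetical and only illustrate the idea, not the actual evaluation code of either repository.

```python
def select_person_boxes(use_gt_bbox, gt_boxes, detected_boxes, score_threshold=0.0):
    """Return the person boxes that the pose network is evaluated on.

    gt_boxes:       list of dicts with a 'bbox' key (annotated boxes).
    detected_boxes: list of dicts with 'bbox' and 'score' keys
                    (output of a separate human detector).
    """
    if use_gt_bbox:
        # Ground-truth setting: every annotated person is used as-is.
        return [box["bbox"] for box in gt_boxes]
    # Detector setting: only detections above the score threshold are used,
    # so missed or badly placed boxes can affect the keypoint results.
    return [det["bbox"] for det in detected_boxes if det["score"] >= score_threshold]


if __name__ == "__main__":
    gt = [{"bbox": [10, 20, 50, 120]}]
    dets = [
        {"bbox": [12, 18, 49, 118], "score": 0.92},
        {"bbox": [200, 40, 30, 60], "score": 0.05},
    ]
    print(select_person_boxes(True, gt, dets))
    print(select_person_boxes(False, gt, dets, score_threshold=0.1))
```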

gasvn commented 4 years ago

Sorry for the misleading results. We have found this problem and have provided the correct results in this repo. We will update the online version of our paper. Sorry again. Feel free to let me know if you have any other problems. HRNet uses a different training config from our model in this repo, which may also be a reason for the gap between Res2Net and HRNet.

gasvn commented 4 years ago

Res2Net can be used to replace the basicblock in HRNet. Someone told us that it would further improve the performance of HRNet.

ArchNew commented 4 years ago

> Res2Net can be used to replace the basicblock in HRNet. Someone told us that it would further improve the performance of HRNet.

No, it can't. Well, it depends on how you replace the basicblock. In HRNet, every branch has the same number of input and output channels (32 for the top branch in HRNet-W32). If we keep this premise, the channel expansion inherent to the Res2Net module's structure won't help. Simply expanding the channels inside the basic block (between the two convs) to the corresponding number performs better than replacing the block with a Res2Net module.
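To make the two options concrete, here is a rough sketch of the channel layouts being compared, with the block's input and output fixed at 32 channels. It only lists convolutions and counts weights; it says nothing about accuracy. The Res2Net-style layout is an assumption (a 1x1 conv to `width * scale` channels, `scale - 1` 3x3 convs on the splits, and a 1x1 conv back), and the values `width=16`, `scale=4`, and `mid=64` are hypothetical, not taken from either codebase.

```python
def conv_weights(c_in, c_out, kernel):
    """Number of weights in a bias-free kernel x kernel convolution."""
    return c_in * c_out * kernel * kernel


def basic_block_layout(channels, mid=None):
    """Two 3x3 convs: channels -> mid -> channels (mid defaults to channels)."""
    mid = channels if mid is None else mid
    return [(channels, mid, 3), (mid, channels, 3)]


def res2net_style_layout(channels, width, scale):
    """Assumed Res2Net-style block with fixed in/out channels (scale >= 2):
    a 1x1 conv to width * scale channels, (scale - 1) 3x3 convs that each
    act on one `width`-channel split, and a 1x1 conv back to `channels`."""
    inner = width * scale
    split_convs = [(width, width, 3)] * (scale - 1)
    return [(channels, inner, 1)] + split_convs + [(inner, channels, 1)]


def total_weights(layout):
    """Sum the convolution weights over a list of (c_in, c_out, kernel)."""
    return sum(conv_weights(*conv) for conv in layout)


if __name__ == "__main__":
    layouts = {
        "basic block, 32 -> 32 -> 32": basic_block_layout(32),
        "widened basic block, 32 -> 64 -> 32": basic_block_layout(32, mid=64),
        "Res2Net-style, width=16, scale=4": res2net_style_layout(32, 16, 4),
    }
    for name, layout in layouts.items():
        print(name, layout, total_weights(layout))
```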

ArchNew commented 4 years ago

> HRNet has different training config with our model in this repo, maybe it can also be the reason for the gap between res2net and HRNet.

I did run an experiment on that. I'm sorry to report that it won't close the gap of over 2% between them. HRNet's "half-body augmentation" helps, but not by much (an increase of about 0.6 to 0.8). As for more epochs, Simple Baselines saturates sooner than HRNet, so more epochs have little impact. Five more degrees of random rotation (from 40 to 45) won't help either.
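For context on the rotation setting, here is a minimal sketch of how a rotation factor is often turned into a per-sample angle in this kind of training pipeline. It assumes the angle is drawn from a zero-mean Gaussian whose standard deviation is the rotation factor, clipped to twice that factor, and applied to only a fraction of samples. The function name `sample_rotation` and the 0.6 probability are illustrative assumptions, not values confirmed in this thread.

```python
import random


def sample_rotation(rot_factor, apply_prob=0.6, rng=random):
    """Draw a rotation angle in degrees for one training sample.

    With probability `apply_prob` the angle is Gaussian with standard
    deviation `rot_factor`, clipped to [-2 * rot_factor, 2 * rot_factor];
    otherwise no rotation is applied.
    """
    if rng.random() > apply_prob:
        return 0.0
    angle = rng.gauss(0.0, rot_factor)
    limit = 2 * rot_factor
    return max(-limit, min(limit, angle))


if __name__ == "__main__":
    rng = random.Random(0)
    for rot_factor in (40, 45):
        angles = [sample_rotation(rot_factor, rng=rng) for _ in range(10000)]
        mean_abs = sum(abs(a) for a in angles) / len(angles)
        print(rot_factor, round(mean_abs, 2))
```

Under these assumptions, moving the factor from 40 to 45 only widens an already broad angle distribution slightly, which fits the observation that the change has little effect.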