chenyilun95 / tf-cpn

Cascaded Pyramid Network for Multi-Person Pose Estimation (CVPR 2018)
MIT License
792 stars 197 forks

How about training from scratch? #24

Open kaleidoscopical opened 6 years ago

kaleidoscopical commented 6 years ago

Hi! Thanks for sharing such wonderful work. Have you tried a ResNet backbone without ImageNet pre-training? Is it possible that the pre-trained model is one of the keys to the performance improvement?

chenyilun95 commented 6 years ago

I haven't tried it, but I think it would hurt the performance. Google also used a ResNet with dilation in the COCO 2016 competition, and their performance is compared in the table in our paper. Our improvement is measured against their result.

kaleidoscopical commented 6 years ago

Thanks for your kind reply. Sorry for my curiosity, but I still have questions.

I think the result from Google may not be strictly comparable here. Their detector is weak and does not even use an FPN. Although better detection does not lead to better keypoint estimation, their result is even lower than the FPN baseline. Besides, several other factors make the comparison unfair, e.g. the number of training iterations, the input size, and the training strategy.

Would it be fairer to compare a CPN (ResNet-50) without pre-training against a 2-stage hourglass model? In that case, they have the same FLOPs and no other prior information.

chenyilun95 commented 6 years ago
  1. Whether or not a pre-trained ResNet helps, the improvement on top of it is our contribution.
  2. In our paper, we compare our model with ResNet and hourglass networks under the same training configuration.
  3. Someone else has tried training from scratch and said the result was almost the same.

That's all my comments. Thank you.