hellojialee / Improved-Body-Parts

Simple Pose: Rethinking and Improving a Bottom-up Approach for Multi-Person Pose Estimation
https://arxiv.org/abs/1911.10529
258 stars 42 forks

Focal L2 loss is different from paper #16

Closed VenAlone closed 4 years ago

VenAlone commented 4 years ago

Glad to see your project; I successfully ran the demo. But I found that the focal L2 loss in this project (models/loss_model_parallel.py) sets alpha=0, beta=0, and factor = torch.abs(1. - st), which differs from what your paper shows: alpha=0.1, beta=0.02, gamma=2, and factor = (1. - st) ** gamma. I'm really confused about that and would appreciate your help. Thank you very much.

hellojialee commented 4 years ago

Hi VenAlone, thank you for your interest.

In short, you can train the model with the focal L2 loss in (models/loss_model.py) from scratch, without pretraining with the normal L2 loss. My fault for including so much experimental clutter, but I did all the paper work alone. Our older project uses gamma=2 only (continuing training from an L2-loss checkpoint model); please have a look here: https://github.com/jialee93/Multi-Person-Pose-using-Body-Parts/blob/69b648cdd526849020c28afe6dbc4154459ffdab/training/train_common.py#L132

As the paper says, we continued training the model using the focal L2 loss. But I found that we can get the same results with gamma=1 (i.e., factor = torch.abs(1. - st)) trained from scratch. gamma=2 is the value used in the original focal loss paper. I didn't run ablation experiments comparing gamma=1 with gamma=2, so I didn't mention this in the paper. I cannot go back to school due to COVID-19; I will correct myself once I have compared them.

Also, alpha=0.1 and beta=0.02 contribute only a little (about 0.3% AP), so you can set them to 0 at the beginning. If gamma=1, alpha=1, and beta=1, the focal L2 loss behaves the same as the normal L2 loss.
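Putting the thread's description together, here is a minimal NumPy sketch of the loss (the exact definition of st and the threshold thre are my assumptions; the repository uses the equivalent torch expressions such as torch.abs(1. - st)):

```python
import numpy as np

def focal_l2_loss(pred, gt, alpha=0.1, beta=0.02, gamma=2, thre=0.01):
    """Sketch of the focal L2 loss as described in this thread.

    pred, gt: heatmaps with values in [0, 1].
    For pixels the ground truth marks as foreground (gt >= thre), the
    per-pixel score st is the prediction shifted down by alpha; for
    background pixels it is 1 - pred shifted down by beta.  A
    well-predicted pixel thus has st close to 1 and receives a small
    weighting factor, so easy pixels contribute less to the loss.
    NOTE: thre and this st definition are assumptions for illustration,
    not taken verbatim from the repository.
    """
    st = np.where(gt >= thre, pred - alpha, 1.0 - pred - beta)
    # gamma=1 mirrors factor = torch.abs(1. - st) in the code;
    # gamma=2 mirrors factor = (1. - st) ** gamma in the paper.
    factor = np.abs(1.0 - st) if gamma == 1 else (1.0 - st) ** gamma
    return np.sum(factor * (pred - gt) ** 2)
```

With alpha=beta=0 and gamma=1 this reduces to the variant shipped in models/loss_model_parallel.py, while gamma=2 down-weights well-predicted pixels more aggressively, matching the paper's formulation.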

Feel free to discuss.

hellojialee commented 4 years ago

Hi, focal L2 loss with gamma=2 can further bring about a 0.4% AP increase on the COCO test-dev dataset compared with gamma=1, but more false positives appear. For practical applications, I still suggest setting gamma=1 for now (it is more robust).

VenAlone commented 4 years ago

Thanks a lot for your time and advice. In my experiment I got a 0.02 mAP increase on the COCO val2017 dataset with alpha=0.1, beta=0.02, gamma=1, compared with the MSE loss.

hellojialee commented 4 years ago

Hi, do you mean 0.02 mAP or 0.02% mAP? alpha and beta are not as important as the Gaussian sigma, gamma, and thre.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.