Closed JyunYuLai closed 4 years ago
Hi, @JyunYuLai, thank you for your interest.
Answer1: I continued training the network with the focal l2 loss just to save time, because I wanted to compare the focal l2 loss with the normal l2 loss. In my situation, it is OK to train the network with the focal l2 loss from scratch as long as the training converges, so I use a warm-up learning rate at first (Gaussian weight initialization makes the initial heatmap values equal to 0, i.e. the whole image is regarded as background, and thus the focal term squashes the background loss significantly). Recently, I retrained my system with a pretrained HRNet backbone, and the focal l2 loss still brings about a 3% AP increase (with multi-scale testing) compared with the l2 loss. Keep in mind that the focal l2 loss is applied simultaneously to the body part and keypoint heatmaps in our work. As mentioned in our paper, we recommend applying data mining to both keypoints and keypoint connections.
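For readers trying to reproduce this, here is a minimal NumPy sketch of how a focal l2 loss of this kind could be written. It is an illustration, not necessarily the exact code in the repository: the function name `focal_l2_loss` and the values of `alpha`, `beta` and `gamma` are assumptions, and only `thre=0.01` comes from this thread. It shows why an all-zero initial prediction gets a very small weight on background pixels.

```python
import numpy as np


def focal_l2_loss(pred, target, thre=0.01, alpha=0.1, beta=0.02, gamma=1.0):
    """Sketch of a focal l2 loss on heatmaps (hyper-parameters are assumed).

    pred, target: arrays of the same shape with values roughly in [0, 1].
    thre: ground-truth threshold separating foreground from background.
    alpha, beta, gamma: hypothetical focal hyper-parameters for illustration.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)

    # "Confidence" of each pixel: high when the prediction already agrees
    # with the ground truth (near 1 on foreground, near 0 on background).
    st = np.where(target >= thre, pred - alpha, 1.0 - pred - beta)

    # Focal weight: small for easy pixels, large for hard ones.
    weight = np.abs(1.0 - st) ** gamma

    return float(np.mean(weight * (pred - target) ** 2))


# With an all-zero prediction, the foreground pixel keeps weight 1.1,
# while the background pixel is down-weighted to 0.02.
loss_at_init = focal_l2_loss([0.0, 0.0], [1.0, 0.0])

# A false positive of 0.5 on background is weighted by 0.52.
loss_false_positive = focal_l2_loss([0.0, 0.5], [0.0, 0.0])
```

Under these assumed values, a background pixel predicted as 0 is weighted by only `beta`, which matches the remark above that the focal term squashes the background loss right after initialization.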
Answer2: It could be sensitive to the hyper-parameter thre (0.01 in our case). Please ensure that the sigma and the area of the Gaussian peak are appropriate w.r.t. the size of the heatmap, so that the loss stays balanced between foreground and background.
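One rough way to check this balance is to measure what fraction of heatmap pixels counts as foreground for a given sigma and thre. The sketch below is only an illustration under assumptions: the helper name `foreground_fraction`, the single centered Gaussian, and the example heatmap size and sigma values are all hypothetical and not taken from the repository.

```python
import numpy as np


def foreground_fraction(size, sigma, thre=0.01):
    """Fraction of pixels with ground-truth value >= thre for a single
    Gaussian peak centered in a size x size heatmap (hypothetical helper)."""
    ys, xs = np.mgrid[0:size, 0:size]
    center = (size - 1) / 2.0
    dist_sq = (xs - center) ** 2 + (ys - center) ** 2
    heat = np.exp(-dist_sq / (2.0 * sigma ** 2))
    return float(np.mean(heat >= thre))


# Example sizes and sigmas are assumptions chosen for illustration only.
small_peak = foreground_fraction(64, 2.0)
large_peak = foreground_fraction(64, 4.0)
```

With thre = 0.01, a pixel is foreground when it lies within about 3 sigma of the peak (since sqrt(2 ln 100) ≈ 3.03), so doubling sigma roughly quadruples the foreground area. If that area is too small or too large relative to the heatmap, the foreground and background terms of the loss can become badly unbalanced.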
Hi,
Thank you for sharing this great work. I have implemented focal l2 loss but unfortunately didn't get better results compared to normal l2 loss. Here are some questions about focal l2 loss.