Hi @mensaochun, I assume you are referring to the WIDER dataset. If that is the case, the first thing I would suggest is training a ResNet-101 and making sure you can obtain the ~83% mAP. To do that, make sure you have a learning rate scheduler, proper data augmentation, and the right optimizer (SGD w/ momentum). The supplementary material here can be useful. Once you achieve that, start adding the different bits and pieces proposed in this work and track the improvements in performance. The mAP should get up to ~86% on the test set, but feel free to follow up should you face any obstacles.
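For concreteness, a minimal PyTorch sketch of such a baseline setup could look like the following. The attribute count (14 for WIDER) comes from the dataset, but the hyperparameters (learning rate, scheduler step, weight decay) are illustrative defaults, not the exact values from the paper:

```python
import torch
import torch.nn as nn
import torchvision
from torch.utils.data import DataLoader, TensorDataset

NUM_ATTRIBUTES = 14  # WIDER Attribute has 14 binary attributes

# ResNet-101 backbone with a multi-label head, starting from ImageNet weights.
model = torchvision.models.resnet101(pretrained=True)  # weights="IMAGENET1K_V1" on newer torchvision
model.fc = nn.Linear(model.fc.in_features, NUM_ATTRIBUTES)

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=5e-4)
# Step-decay learning rate scheduler, as suggested above; values are illustrative.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

# Synthetic stand-in for a WIDER DataLoader, just so the sketch runs end to end.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, 2, (8, NUM_ATTRIBUTES)).float()
train_loader = DataLoader(TensorDataset(images, labels), batch_size=4)

for epoch in range(2):  # use ~30+ epochs in practice
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()
```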
I set every configuration as described in the instruction file you mentioned, but I only get an mAP of ~80.5 on the first step of freezing the other parts and training the ResNet-101 only.
Hi @zx3Leonoardo
Thank you for your interest in our work. In another issue I wrote down some steps that I think can get you to very good performance given a ResNet (check steps 1-3 here).
The person in that thread actually managed to get better results than what we report in the paper, which is what I would also expect using a ResNet-101 on WIDER. If I had to guess, the model is not learning well (either because of the learning rate or because of the input data). Please take a look at the input size and data augmentation, play a little more with the training, and then get back to me. If it's still not getting anywhere close to 83% mAP, please give me some more details of what you've tried and I'll get back to you.
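For example, a reasonable input pipeline could look like the sketch below; the input size and jitter values are plausible defaults on my part, not the paper's exact settings:

```python
import torchvision.transforms as T

# ImageNet normalization statistics, standard for a pretrained ResNet.
normalize = T.Normalize(mean=[0.485, 0.456, 0.406],
                        std=[0.229, 0.224, 0.225])

train_transform = T.Compose([
    T.Resize((256, 256)),
    T.RandomCrop(224),          # random crops + flips for augmentation
    T.RandomHorizontalFlip(),
    T.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    T.ToTensor(),
    normalize,
])

val_transform = T.Compose([     # deterministic pipeline for evaluation
    T.Resize((224, 224)),
    T.ToTensor(),
    normalize,
])
```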
Thanks a lot for the reply @nsarafianos. I got to ~83 when first training the ResNet-101 backbone. Then I froze the backbone and fine-tuned the attention modules, which reached ~84.7. Finally, I unfroze the backbone and trained the two parts simultaneously, but got a lower mAP of ~82. So I want to make sure that my training order is right. Did you also get an mAP of ~84.7 before training the two parts together?
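For reference, here is a minimal sketch of the staged schedule I followed; the `backbone`/`attention` module names and sizes are placeholders of mine, not the repo's actual names:

```python
import torch
import torch.nn as nn

# Hypothetical two-part model mirroring the stages above.
class TwoPartModel(nn.Module):
    def __init__(self, num_attributes=14):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU())
        self.attention = nn.Sequential(nn.Conv2d(64, 64, 1), nn.Sigmoid())
        self.head = nn.Linear(64, num_attributes)

    def forward(self, x):
        f = self.backbone(x)
        f = f * self.attention(f)             # attention re-weights features
        return self.head(f.mean(dim=(2, 3)))  # global average pool + classify

def set_requires_grad(module, flag):
    for p in module.parameters():
        p.requires_grad = flag

model = TwoPartModel()

# Stage 2: freeze the backbone, fine-tune only the attention modules.
set_requires_grad(model.backbone, False)
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-3, momentum=0.9)

# Stage 3: unfreeze everything and train jointly with a much smaller
# learning rate, so the joint step does not wipe out the staged gains.
set_requires_grad(model.backbone, True)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
```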
Hi @zx3Leonoardo
I hope this helps.
Thank you very much, your suggestions help a lot! However, I am still a little confused about the training process. I trained the ResNet-101 with the configuration you mentioned in that file and achieved an mAP of 83.3. Then, using that ResNet-101 as the initial model, I changed the BCE loss to the weighted focal loss and got an mAP of ~83.38; lowering the learning rate for this fine-tuning did not help either. I also tried training the ResNet-101 with the weighted focal loss from the beginning and got an mAP of ~83.00. Neither way gives an improvement, so I really want to know the right steps to train the network. Should I train the different parts one after another, i.e.:

1. train the ResNet-101 with BCE;
2. train the ResNet-101 with the weighted focal loss, initialized from the model of (1);
3. train the network with the attention modules, initialized from the model of (2);

or should I treat all the added elements as one block (call it B), train the ResNet-101 first, and then add B to the network and fine-tune? I hope this describes my problem precisely; I really appreciate how patient you have been with the project and my questions. A sketch of the loss I swapped in is below.
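This is roughly the weighted focal loss I used in place of BCE, as a minimal sketch; the `pos_weight` values and `gamma` are my own choices, not necessarily the paper's exact formulation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFocalLoss(nn.Module):
    """Sketch of a per-attribute weighted focal BCE for multi-label
    classification; pos_weight and gamma are assumptions of mine."""
    def __init__(self, pos_weight, gamma=2.0):
        super().__init__()
        self.register_buffer("pos_weight", pos_weight)
        self.gamma = gamma

    def forward(self, logits, targets):
        # Unreduced BCE so each element can be re-weighted individually.
        bce = F.binary_cross_entropy_with_logits(logits, targets,
                                                 reduction="none")
        p = torch.sigmoid(logits)
        p_t = targets * p + (1 - targets) * (1 - p)  # prob. of the true class
        focal = (1.0 - p_t) ** self.gamma * bce      # down-weight easy samples
        weight = targets * self.pos_weight + (1 - targets)
        return (weight * focal).mean()

# Tiny usage example with random data (14 WIDER attributes assumed).
pos_weight = torch.full((14,), 2.0)  # e.g. weight rare positives more heavily
criterion = WeightedFocalLoss(pos_weight)
loss = criterion(torch.randn(4, 14), torch.randint(0, 2, (4, 14)).float())
```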
Hi @zx3Leonoardo, can you share the code with which you reproduced this paper? Thank you very much!
I have run the training code according to your instructions. After training for about 30 epochs the mAP is 0.78, and after 70 epochs it is still about 0.78, which does not come close to the mAP of 0.86 reported in the original paper. May I ask how to reach the performance reported in the paper?