huacong opened this issue 3 years ago (status: Open)
@huacong,
You can see that your results after 75 epochs are much better than those of the model trained for 25 epochs.
I also uploaded the checkpoint file to this repo; it shows that I trained the model for 55000 iterations, corresponding to 550 epochs. You can reproduce our results by training the model much longer.
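To put the two training lengths side by side, here is a small arithmetic sketch using only the numbers quoted in this thread. It assumes the run reported in the issue has the same number of iterations per epoch as the checkpointed run, which the thread does not confirm; the variable names are only illustrative.

```python
# Figures quoted in this thread.
TOTAL_ITERATIONS = 55_000  # iterations recorded in the uploaded checkpoint
TOTAL_EPOCHS = 550         # epochs those iterations correspond to
REPORTED_EPOCHS = 75       # epoch at which the run in this issue stopped

# Iterations per epoch implied by the checkpoint.
iters_per_epoch = TOTAL_ITERATIONS // TOTAL_EPOCHS

# ASSUMPTION: the reported run uses the same iterations-per-epoch value.
reported_iterations = REPORTED_EPOCHS * iters_per_epoch
fraction_done = REPORTED_EPOCHS / TOTAL_EPOCHS

print(iters_per_epoch)          # 100
print(reported_iterations)      # 7500
print(f"{fraction_done:.1%}")   # 13.6%
```

Under that assumption, stopping at 75 epochs covers roughly one seventh of the schedule behind the checkpoint.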
I tried to train the model following your code. Unfortunately, I could not get good performance even after training for 75 epochs. I test after every training epoch, and I never reach the results reported in the paper.

My settings: batch_size = 96, ce_loss weight = 1, center_loss weight = 1, mse_loss weight = 0.1. For both training and testing, the image is fixed to 2. It's so strange.

The best results I get at test time are below.

image2image mAP: 74.33
point2image mAP: 71.61
mesh2image mAP: 78.96
point2point mAP: 77.38
image2point mAP: 70.8
mesh2point mAP: 80.06
mesh2mesh mAP: 85.91
image2mesh mAP: 73.3
point2mesh mAP: 77.99

Why did I stop training at 75 epochs? At epoch 76 the loss curve stops falling and the loss becomes NaN, possibly because of gradient explosion, so I think stopping there is a reasonable choice.

It's a pity that I didn't get good results. Could you please share some of your training experience? Thanks very much.
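For the NaN loss described above, one common mitigation is to skip updates whose loss is not finite and to clip gradients by their global norm. The sketch below is framework-free and is not taken from the repository's code: the function names and the `max_norm` value are hypothetical, only the three loss weights come from the post, and gradient explosion is the poster's guess rather than a confirmed cause. If the training code uses PyTorch, `torch.nn.utils.clip_grad_norm_` performs the same kind of clipping on real parameters.

```python
import math


def combine_losses(ce, center, mse, w_ce=1.0, w_center=1.0, w_mse=0.1):
    """Weighted sum of the three loss terms, using the weights from the post."""
    return w_ce * ce + w_center * center + w_mse * mse


def should_skip_step(loss):
    """True when the loss is NaN or infinite, so the update should be skipped."""
    return not math.isfinite(loss)


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so its L2 norm is at most max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]


# Toy values for illustration only; they are not measurements from any run.
loss = combine_losses(ce=2.0, center=1.0, mse=10.0)
if not should_skip_step(loss):
    # max_norm=1.0 is a hypothetical setting, not one from the repository.
    clipped = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
```

Skipping a non-finite step keeps one bad batch from corrupting the weights, while clipping bounds the size of every update. Neither replaces checking the learning rate and the input data if the NaN keeps appearing at the same point.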