mks0601 / I2L-MeshNet_RELEASE

Official PyTorch implementation of "I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image", ECCV 2020
MIT License

Doubt regarding different initialization during training and testing #45

Closed Shubhendu-Jena closed 3 years ago

Shubhendu-Jena commented 3 years ago

Hi. Thank you for the great work! I have one doubt. During training, the following block in model.py initializes the weights:

```python
if mode == 'train':
    pose_backbone.init_weights()
    pose_net.apply(init_weights)
    pose2feat.apply(init_weights)
    mesh_backbone.init_weights()
    mesh_net.apply(init_weights)
    param_regressor.apply(init_weights)
```

However, this isn't done during testing. Maybe this is a silly question, but could you please explain why that is? It makes quite a bit of difference in performance during testing.
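For reference, `init_weights` here is the usual per-module initializer applied recursively through `nn.Module.apply`; a minimal sketch of such a function, assuming standard PyTorch initializers (the exact choices in the repository's model.py may differ), is:

```python
import torch.nn as nn

def init_weights(m):
    # Called on every submodule via nn.Module.apply(init_weights).
    if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)):
        nn.init.normal_(m.weight, std=0.001)
        if m.bias is not None:
            nn.init.constant_(m.bias, 0)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.constant_(m.weight, 1)
        nn.init.constant_(m.bias, 0)
    elif isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, std=0.01)
        nn.init.constant_(m.bias, 0)
```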

Thanks in advance

mks0601 commented 3 years ago

In the testing stage, you load the pre-trained model instead of randomly initializing the weights.
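As a standalone illustration (a toy model, not the repository's actual code): the test path builds the model and then immediately overwrites whatever initial weights it has with the saved snapshot, so random initialization there would be pointless.

```python
import torch
import torch.nn as nn

# Toy stand-in for the I2L-MeshNet model, just to illustrate the point.
net_train = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
# ... training happens here ...
torch.save({'network': net_train.state_dict()}, 'snapshot_demo.pth.tar')

# Test stage: a fresh model object starts with random weights, but they are
# replaced by the snapshot right away, so init_weights() is not needed.
net_test = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())
net_test.load_state_dict(torch.load('snapshot_demo.pth.tar')['network'])
net_test.eval()
```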

Shubhendu-Jena commented 3 years ago

Oh yes, of course. Makes sense. Thanks! :) Closing the issue.

Shubhendu-Jena commented 3 years ago

Actually, sorry, but I have one more doubt. I build the model in training mode (i.e. mode = 'train') and use it to check testing performance after each epoch. What I am observing is that the testing performance at the last epoch, measured with this training-mode model, is better than what I get when testing with the test.py script, which uses mode = 'test'. Again, apologies if the question is elementary, but I would be grateful if you could suggest a possible reason for this.

mks0601 commented 3 years ago

I don't understand your question. What do you mean by testing when mode == 'train'?

Shubhendu-Jena commented 3 years ago

I mean that I use the model built in training mode to check the joint error metric at regular intervals (every 50 steps or so). The joint error values I get when training is almost over (i.e. during the last few hundred steps) are quite a bit lower than those I get during testing (using the test.py script, which loads the checkpoint after training is over). Do you have any idea why that might be?

mks0601 commented 3 years ago

Did you test on the same chunk of the dataset?

Shubhendu-Jena commented 3 years ago

Yeah, for both testing during training and testing after training, I use the testing data given by the testset loader.

mks0601 commented 3 years ago

I think the only difference is this: eval mode fixes the statistics of the batch normalization layers, while they keep changing in the training stage. Does the model in the training stage really give a much better result than the model in the testing stage? That is weird, because using eval mode is a very common thing.
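A small self-contained example of that difference (illustration only, not repository code):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(8)
x = torch.randn(4, 8, 16, 16)

bn.train()
out_train = bn(x)  # normalizes with this batch's mean/var and updates the running statistics

bn.eval()
out_eval = bn(x)   # normalizes with the accumulated running mean/var instead

print(torch.allclose(out_train, out_eval))  # False: train and eval mode give different outputs
```

So an in-training evaluation should call model.eval() before running the test loader (and switch back to model.train() afterwards) to match what test.py measures.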

Shubhendu-Jena commented 3 years ago

Hi, thanks for the quick response. Indeed, that solved my problem. Admittedly, I had tried out some modifications to the model, such as using group norm instead of batch norm. Maybe that was the problem? Regardless, I will try to figure this out. Thanks again for the help. Closing this issue for now.