Test During Training without self.net.eval()

Heyula08 commented 3 years ago

Hi, first of all, amazing work, congrats! However, while checking out the repo, I noticed that eval() is not called at the end of a training epoch, either just before or inside test_epoch(). After training the model with the default parameters, I ran trainval.py in the "test" phase (which, by the way, requires calling the load_model() function that you wrote but did not use in either test() or test_epoch()), and the ADE and FDE differ from the values logged during training. I think a single self.net.eval() line inside test_epoch() would be enough.
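A minimal sketch of the effect described above, assuming a toy network with a dropout layer rather than the actual STAR model. `TinyNet` and `evaluate` are hypothetical names used only for illustration; they are not part of the repo.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for self.net; only the dropout layer matters here."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)
        self.drop = nn.Dropout(p=0.5)

    def forward(self, x):
        return self.drop(self.fc(x))


def evaluate(net, x):
    """Run one forward pass in eval mode, then restore the previous mode."""
    was_training = net.training
    net.eval()                  # dropout becomes a no-op
    with torch.no_grad():       # no gradients needed for evaluation
        out = net(x)
    net.train(was_training)     # hand the model back in its original mode
    return out


torch.manual_seed(0)
net = TinyNet()
x = torch.ones(8, 4)

net.train()
train_a, train_b = net(x), net(x)  # dropout active: the two passes generally differ
eval_a, eval_b = evaluate(net, x), evaluate(net, x)  # eval mode: identical outputs
```

Restoring the previous mode at the end keeps the next training epoch unaffected, which matters when evaluation runs between epochs.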

Additionally, I am not sure why, but in test(), self.net.eval() is called before the model is loaded. This might not do what you expect. Usually, I see it called after loading the saved model.
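The usual order can be sketched as follows, assuming plain PyTorch and an in-memory buffer in place of a checkpoint file. The two-layer model is a hypothetical placeholder, and whether the repo's load_model() changes the training mode is not checked here. In plain PyTorch, `load_state_dict` copies parameters and buffers without touching the training flag, so calling eval() last is mainly a safeguard.

```python
import io

import torch
import torch.nn as nn

# Hypothetical placeholder model; stands in for the trained network.
net = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))

# Save the weights to an in-memory buffer instead of a checkpoint file.
buffer = io.BytesIO()
torch.save(net.state_dict(), buffer)
buffer.seek(0)

# Build a fresh model, load the saved weights, and only then switch to eval mode.
restored = nn.Sequential(nn.Linear(4, 4), nn.Dropout(p=0.5))
restored.load_state_dict(torch.load(buffer))
restored.eval()
```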

cunjunyu commented 3 years ago

Thank you for pointing them out.

We did not actually provide a separate 'test' version. That being said, the 'test phase' does not work at the moment, which is why 'load_model()' is not called in either of these two functions.

Because this repo borrows heavily from SR-LSTM, we kept the trainer's 'test phase'. We will fix it ASAP.

Have a nice day!

Yusufma03 commented 3 years ago

Hi, thanks a lot for spotting this issue.

We found that the problem was caused by a misuse of dropout introduced when we cleaned up the code for release. It has now been fixed and tested.

Note that after training, when the model is loaded back and evaluated again, the result might be slightly different. This is because different random seeds are used during the training phase and the test phase.
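If identical numbers across runs are needed, one option is to fix the seed before evaluation. This is a minimal sketch under that assumption; `set_seed` is a hypothetical helper, not a function from the repo.

```python
import random

import torch


def set_seed(seed):
    """Hypothetical helper: seed Python's and PyTorch's random number generators."""
    random.seed(seed)
    torch.manual_seed(seed)


set_seed(42)
a = torch.rand(3)

set_seed(42)
b = torch.rand(3)   # same seed: same draws as `a`

set_seed(7)
c = torch.rand(3)   # different seed: different draws
```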

Do let us know if you have more questions.

cunjunyu / STAR, issue #4