anushmanukyan closed this issue 3 years ago
@anushmanukyan Could you tell me which environment and algorithm you are training with?
@TianhongDai I am using PPO.
@anushmanukyan I guess you are loading only the weights of the model. If you check this line: https://github.com/TianhongDai/reinforcement-learning-algorithms/blob/master/07-proximal-policy-optimization/ppo_agent.py#L113 you will see that when I test the network, I also load the running mean filter object. During training I use the running mean filter to normalize the input, so if you want to retrain your model, you should also load the "trained" running mean filter. Otherwise you will get different results.
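To illustrate the point, here is a minimal sketch, not the repository's actual code. It assumes a hypothetical scalar `RunningMeanFilter` class and uses a plain list of floats as a stand-in for the model weights. The idea is only that the filter statistics are written to the same checkpoint as the weights, so a reloaded model sees inputs normalized the same way as during training.

```python
import json
import math
import os
import tempfile


class RunningMeanFilter:
    """Hypothetical scalar filter: running mean/variance via Welford's method."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Fall back to 1.0 until there are at least two samples.
        if self.count < 2:
            return 1.0
        return math.sqrt(self.m2 / (self.count - 1))

    def normalize(self, x):
        return (x - self.mean) / (self.std() + 1e-8)

    def state_dict(self):
        return {"count": self.count, "mean": self.mean, "m2": self.m2}

    def load_state_dict(self, state):
        self.count = state["count"]
        self.mean = state["mean"]
        self.m2 = state["m2"]


def save_checkpoint(path, weights, obs_filter):
    # Store the weights AND the filter statistics in one file.
    with open(path, "w") as f:
        json.dump({"weights": weights, "filter": obs_filter.state_dict()}, f)


def load_checkpoint(path):
    with open(path) as f:
        ckpt = json.load(f)
    obs_filter = RunningMeanFilter()
    obs_filter.load_state_dict(ckpt["filter"])
    return ckpt["weights"], obs_filter


if __name__ == "__main__":
    trained = RunningMeanFilter()
    for obs in [1.0, 2.0, 3.0, 4.0]:  # observations seen during "training"
        trained.update(obs)

    path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    save_checkpoint(path, [0.1, 0.2], trained)
    weights, restored = load_checkpoint(path)

    # The restored filter normalizes like the trained one; a fresh one does not.
    print(trained.normalize(5.0), restored.normalize(5.0))
    print(RunningMeanFilter().normalize(5.0))
```

With a real PyTorch model, the same idea would be to save the filter's state alongside `model.state_dict()` in a single checkpoint and restore both before continuing training.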
I added the running mean filter, and retraining seems to work better now.
However, I have another question: how does `demo.py` work? I cannot figure out how the testing works. I save the best model, but when I test that model it gets a different reward than it had when it was saved. How is that possible? Also, if I run the same model several times, I get different performance each time.
Thank you so much for your help.
@anushmanukyan Hi, I think `demo.py` should work fine. You can download my pre-trained model from: https://drive.google.com/drive/u/2/folders/1cZjjCA5WHs-Lfw63ntzeUjMo_wZoIgXw Then just run `python demo.py`. It should still get the same high scores as it got during training. You can check https://github.com/TianhongDai/reinforcement-learning-algorithms/blob/master/07-proximal-policy-optimization/ppo_agent.py#L111 to see how I test the network.
I'm trying to retrain the saved model, but it behaves very strangely:
I guess this is a PyTorch issue, but maybe you have managed to retrain a saved model and know how it should be done?
Saving:
Loading saved model:
Thanks a lot in advance