mfe7 opened this issue 6 years ago
@mfe7 Hi,
I have a simple way to load a checkpoint for ppo2. Here is my code:
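(The original snippet is not reproduced here; below is a minimal sketch of the same idea. It assumes a baselines checkout where `ppo2.learn` accepts a `load_path` argument, and the environment name and checkpoint path are placeholders.)

```python
import gym
from baselines.common.vec_env.dummy_vec_env import DummyVecEnv
from baselines.ppo2 import ppo2

# Build the same kind of vectorized env that was used for training.
env = DummyVecEnv([lambda: gym.make('CartPole-v0')])

# learn() with total_timesteps=0 constructs the model and restores the
# weights from load_path without doing any further training.
model = ppo2.learn(network='mlp', env=env, total_timesteps=0,
                   load_path='/path/to/checkpoints/00100')  # placeholder path

# Roll out the restored policy.
obs = env.reset()
for _ in range(1000):
    actions, values, states, neglogpacs = model.step(obs)
    obs, rewards, dones, _ = env.step(actions)
```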
Hi Sir,
I am still struggling to figure this out. I trained a policy using PPO2 and used the --save_path argument to save checkpoints. But now I am lost. How do I use this trained policy?
Any help would be appreciated.
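Not an official answer, but one route (assuming a recent baselines checkout, where run.py exposes `--load_path` and `--play`) is to point the run script at the saved model and let it roll out the trained policy with rendering; the Pong paths below are just an illustration:

```bash
python -m baselines.run --alg=ppo2 --env=PongNoFrameskip-v4 --num_timesteps=0 \
    --load_path=~/models/pong_20M_ppo2 --play
```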
For those looking for the answer, here is a good notebook with examples; it also links to a Medium article explaining the same: https://colab.research.google.com/drive/1KoAQ1C_BNtGV3sVvZCnNZaER9rstmy0s
First of all, thank you for providing these great baselines!
I can train policies with the various algorithms (PPO1/PPO2/TRPO) and see that the average reward increases and the loss decreases, but is there a straightforward way to then evaluate the learned policy on an environment across a bunch of different initial conditions?
In PPO2, there's a way to save the model checkpoint, but is there any documentation about how to load that checkpoint? If not, do you have any suggestions on how best I might add this?
This functionality seems to be included in deepq, which provides train.py and enjoy.py scripts for sample environments; it also seems simpler there, since training produces a single .pkl file that the enjoy script can load in one line.
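For reference, one possible evaluation loop over many initial conditions, similar in spirit to deepq's enjoy.py, could look like the sketch below. This is only a sketch: it assumes `ppo2.learn` accepts a `load_path` argument, and the environment, checkpoint path, and episode count are placeholders.

```python
import numpy as np
import gym
from baselines.common.vec_env.dummy_vec_env import DummyVecEnv
from baselines.ppo2 import ppo2

N_EPISODES = 100  # placeholder: how many initial conditions to evaluate

env = DummyVecEnv([lambda: gym.make('CartPole-v0')])
model = ppo2.learn(network='mlp', env=env, total_timesteps=0,
                   load_path='/path/to/saved/model')  # placeholder path

returns = []
for _ in range(N_EPISODES):
    obs = env.reset()          # each reset draws a new initial condition
    done, ep_return = False, 0.0
    while not done:
        actions, _, _, _ = model.step(obs)
        obs, rewards, dones, _ = env.step(actions)
        ep_return += rewards[0]
        done = dones[0]
    returns.append(ep_return)

print('mean return over %d episodes: %.2f' % (N_EPISODES, np.mean(returns)))
```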