Is there a way to run the training and evaluation for a certain number of time-steps? Currently the PPO examples in both the first and second edition run forever.
Hi! You can always press Ctrl-C on the console to interrupt the optimisation.
If you want to stop programmatically, add a step-count check inside the training loop and break out of it once the limit is reached.
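Here is a minimal sketch of that check, assuming the training loop iterates over an endless stream of experience. The names `experience_source`, `train` and `max_steps` are placeholders for illustration, not the identifiers used in the book's code:

```python
import itertools


def experience_source():
    """Hypothetical stand-in for the endless experience iterator."""
    for i in itertools.count():
        yield i


def train(max_steps):
    """Run the loop for at most max_steps iterations, then stop."""
    steps_done = 0
    for step_idx, exp in enumerate(experience_source()):
        # Stop once the requested number of time-steps has been processed.
        if step_idx >= max_steps:
            break
        # ... the usual trajectory collection and PPO update would go here ...
        steps_done += 1
    return steps_done


if __name__ == "__main__":
    print(train(max_steps=1000))
```

The same pattern should work for evaluation: count the steps taken and break once the count reaches the limit.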