[Question] Control PPO training

❓ Question

Hi, I wanted to ask whether there is a way to control MaskablePPO training once it has finished the data set it is being trained on (when `ep_len_mean` reaches 1 and `ep_rew_mean` reaches 100 in the logger), or at any other specific point during learning.
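
To make the intent concrete, here is a rough sketch of the kind of control I have in mind, assuming a custom callback is an acceptable way to do it. `StopOnTargets` and `reached_targets` are hypothetical names of mine, not part of the library, and the thresholds are simply the values from the logger mentioned above. The sketch assumes the environment is wrapped in `Monitor`, so that `model.ep_info_buffer` holds the recent episode statistics that the logged means appear to be computed from.

```python
from statistics import mean

try:
    from stable_baselines3.common.callbacks import BaseCallback
except ImportError:  # lets the helper below be tried without SB3 installed
    BaseCallback = object


def reached_targets(ep_infos, reward_target=100.0, length_target=1.0):
    """Hypothetical helper: True once the rolling episode means hit both targets."""
    if not ep_infos:
        # No finished episodes recorded yet (buffer is empty or None)
        return False
    ep_rew_mean = mean(info["r"] for info in ep_infos)  # "r" = episode reward
    ep_len_mean = mean(info["l"] for info in ep_infos)  # "l" = episode length
    return ep_rew_mean >= reward_target and ep_len_mean <= length_target


class StopOnTargets(BaseCallback):
    """Hypothetical callback: stop learn() once the rolling means hit the targets."""

    def _on_step(self) -> bool:
        # Returning False from _on_step asks the algorithm to stop training.
        return not reached_targets(self.model.ep_info_buffer)


# Assumed usage (model is an already constructed MaskablePPO instance,
# and the total_timesteps value is arbitrary):
# model.learn(total_timesteps=1_000_000, callback=StopOnTargets())
```

The same `_on_step` hook could presumably run other logic at a chosen point, for example checking `self.num_timesteps` and saving the model, instead of stopping.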

Checklist

[X] I have checked that there is no similar issue in the repo
[X] I have read the documentation
[X] If code there is, it is minimal and working
[X] If code there is, it is formatted using the markdown code blocks for both code and stack traces.

DLR-RM / stable-baselines3

[Question] Control PPO training #1872
