As recommended by @manjavacas, the terminated and truncated flags should be reformulated in Sinergym so that they can be used correctly with SB3 algorithms.
Enhanced behavior
truncated would be false by default and true when the simulation time limit is reached (which is what terminated currently does).
terminated would be false by default and true when a specific, user-defined condition is met (for example, deviating N degrees from the comfort limits).
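The proposed semantics could be sketched roughly as follows. This is only an illustration, not Sinergym code: the function names, the comfort range and the `max_deviation` threshold are all hypothetical, and the termination condition stands in for whatever a user might define.

```python
from typing import Tuple


def comfort_violation(
    temp: float, comfort_low: float, comfort_high: float, max_deviation: float
) -> bool:
    """Return True when temp is more than max_deviation degrees outside the comfort range."""
    # Distance from the nearest comfort limit; 0.0 when temp is inside the range.
    deviation = max(comfort_low - temp, temp - comfort_high, 0.0)
    return deviation > max_deviation


def episode_flags(
    timestep: int,
    max_timesteps: int,
    temp: float,
    comfort_low: float = 20.0,    # hypothetical comfort limits (degrees)
    comfort_high: float = 23.5,
    max_deviation: float = 5.0,   # the "N degrees" from the example above
) -> Tuple[bool, bool]:
    """Return (terminated, truncated) under the proposed semantics."""
    # truncated: the simulation time limit has been reached.
    truncated = timestep >= max_timesteps
    # terminated: a user-defined condition is met (here, a large comfort deviation).
    terminated = comfort_violation(temp, comfort_low, comfort_high, max_deviation)
    return terminated, truncated
```

With these assumed defaults, `episode_flags(100, 100, 22.0)` reports only truncation, while `episode_flags(10, 100, 30.0)` reports only termination. Both flags default to false in every other case.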
Additional context
For now, mechanisms for stopping episodes early under specific conditions will not be implemented, but this could be added in the future.
Checklist
[x] I have checked that there is no similar issue in the repo (required)