vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
http://docs.cleanrl.dev

[BUG] Env does not reset when it's terminated #432

Closed modanesh closed 10 months ago

modanesh commented 10 months ago

Problem Description

Current Behavior

Whenever an environment is terminated or truncated, it should be reset, but a few implementations do not do this. For example, in this for loop (https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/sac_continuous_action.py#L221), we should check whether each env is terminated or truncated and reset it accordingly.

Howuhh commented 10 months ago

These environments are vectorized with gym.vector.SyncVectorEnv, which auto-resets each sub-environment on termination or truncation. This behaviour is described in the Gymnasium documentation.

vwxyzjn commented 10 months ago

Yeah, see the Gymnasium documentation: "To prevent terminated environments waiting until all sub-environments have terminated or truncated, the vector environments autoreset sub-environments after they terminate or truncated. As a result, the final step’s observation and info are overwritten by the reset’s observation and info."
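
To make the quoted behaviour concrete, here is a minimal sketch of the auto-reset idea in plain Python. It is not Gymnasium's actual implementation: `CountdownEnv`, `ToyAutoResetVectorEnv`, and `rollout` are hypothetical names made up for illustration, and storing the last real observation under a `"final_observation"` info key is an assumption of this toy. The point is only that a finished sub-environment is reset inside `step`, so the training loop never has to call `reset` itself.

```python
class CountdownEnv:
    """Hypothetical toy env: the observation is a step counter and the
    episode terminates once the counter reaches `horizon`."""

    def __init__(self, horizon):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.horizon
        truncated = False
        return self.t, 1.0, terminated, truncated, {}


class ToyAutoResetVectorEnv:
    """Hypothetical stand-in for a synchronous vector env with auto-reset."""

    def __init__(self, envs):
        self.envs = envs

    def reset(self):
        return [env.reset()[0] for env in self.envs]

    def step(self, actions):
        obs, rewards, terms, truncs, infos = [], [], [], [], []
        for env, action in zip(self.envs, actions):
            o, r, term, trunc, info = env.step(action)
            if term or trunc:
                # Assumption of this sketch: keep the true last observation
                # in info, then overwrite it with the reset observation.
                info = {"final_observation": o}
                o, _ = env.reset()
            obs.append(o)
            rewards.append(r)
            terms.append(term)
            truncs.append(trunc)
            infos.append(info)
        return obs, rewards, terms, truncs, infos


def rollout(num_steps=4):
    """Step two sub-envs with different horizons; no manual reset in the loop."""
    envs = ToyAutoResetVectorEnv([CountdownEnv(2), CountdownEnv(3)])
    envs.reset()
    history = []
    for _ in range(num_steps):
        obs, _, terms, _, infos = envs.step([0, 0])
        history.append((obs, terms, infos))
    return history
```

In this sketch, the first sub-environment reports `terminated=True` on its second step while the returned observation is already the reset observation `0`. That matches the "overwritten by the reset's observation" wording above, and it is why the loop in the script does not need an explicit reset check.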

Closing the issue then since this is a non-issue.