vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
http://docs.cleanrl.dev

Upgrade gym version to 0.26.1 #263

Closed AdityaGudimella closed 10 months ago

AdityaGudimella commented 2 years ago

Problem Description

Upgrade the gym version used in cleanrl from 0.23.1 to 0.25.1.

Checklist

Possible Solution

Stable Baselines' ReplayBuffer currently does not support the new format returned by gym.Env.step. The gym step API changed from obs, rew, done, info = env.step(action) to obs, rew, terminated, truncated, info = env.step(action). We would need to implement a slightly modified version of the ReplayBuffer in CleanRL itself. Other than this, the required changes are minimal.
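To illustrate why the buffer needs to change, here is a minimal sketch of a replay buffer that accepts the five-value step result. It is not the Stable Baselines or CleanRL implementation; `MinimalReplayBuffer`, `Transition`, and `td_target` are hypothetical names used only for this example. The assumption is that only `terminated` should cut off bootstrapping, since a truncated episode (for example, one that hit a time limit) did not actually reach a terminal state.

```python
import random
from collections import deque
from typing import Any, Deque, List, NamedTuple


class Transition(NamedTuple):
    obs: Any
    action: Any
    reward: float
    next_obs: Any
    terminated: bool  # True only when the environment reached a terminal state


class MinimalReplayBuffer:
    """Hypothetical buffer for illustration; not a library class."""

    def __init__(self, capacity: int) -> None:
        # Oldest transitions are dropped automatically once capacity is reached.
        self.storage: Deque[Transition] = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs, terminated, truncated) -> None:
        # `truncated` is accepted to mirror the new step signature but is not
        # stored: a truncated episode should still bootstrap from next_obs.
        self.storage.append(
            Transition(obs, action, float(reward), next_obs, bool(terminated))
        )

    def sample(self, batch_size: int) -> List[Transition]:
        return random.sample(list(self.storage), batch_size)


def td_target(t: Transition, next_value: float, gamma: float = 0.99) -> float:
    # One-step TD target; the bootstrap term is masked only on true termination.
    return t.reward + gamma * (1.0 - float(t.terminated)) * next_value
```

With the old four-value API, a single `done` flag would mask the bootstrap term in both cases, which is the behavior the separate flags are meant to avoid.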

I can submit an initial PR with changes required for SAC if you're interested.

vwxyzjn commented 2 years ago

Update on the ticket: the current gym master is set to be released as 0.26.0, which enables obs, rew, terminated, truncated, info = env.step(action) by default.
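For code that still receives the old four-value result, a small shim can produce the new five-value form. The sketch below is an assumption-laden illustration, not gym's own conversion utility: `to_new_step_api` is a hypothetical helper, and it relies on the `"TimeLimit.truncated"` key that older gym `TimeLimit` wrappers write into `info` when an episode is cut off by the step limit.

```python
from typing import Any, Dict, Tuple


def to_new_step_api(
    step_result: Tuple[Any, float, bool, Dict[str, Any]]
) -> Tuple[Any, float, bool, bool, Dict[str, Any]]:
    """Hypothetical helper: convert (obs, rew, done, info) to the 5-tuple form."""
    obs, rew, done, info = step_result
    # Assumption: truncation is signalled only through this info key.
    truncated = bool(info.get("TimeLimit.truncated", False))
    # A done episode that was not truncated is treated as truly terminated.
    terminated = bool(done) and not truncated
    return obs, rew, terminated, truncated, info
```

Environments that truncate episodes without setting that key would be reported as terminated by this sketch, so it is only a starting point.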

vwxyzjn commented 10 months ago

#424 closes this issue by switching to gymnasium.