vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
http://docs.cleanrl.dev

Why is the observation space converted to np.float32? #438

Open jamartinh opened 10 months ago

jamartinh commented 10 months ago

Why is the observation space converted to np.float32?

https://github.com/vwxyzjn/cleanrl/blob/329b128ea8a6afe76ce25d427c4ceba7276ad50e/cleanrl/sac_continuous_action.py#L205

This breaks compatibility with Gymnasium.

pseudo-rnd-thoughts commented 10 months ago

It shouldn't affect performance, and it reduces the memory overhead of the replay buffer (a float32 element takes 4 bytes, versus 8 for float64).
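
For a rough sense of the memory point, here is a minimal back-of-the-envelope sketch. The buffer size, the observation dimension, and the helper `observation_storage_bytes` are hypothetical values chosen for illustration, not taken from the CleanRL code. It also assumes the buffer stores both the observation and the next observation for every transition.

```python
# Bytes per element; these match the itemsize of NumPy's float32 and float64.
FLOAT32_BYTES = 4
FLOAT64_BYTES = 8


def observation_storage_bytes(buffer_size: int, obs_dim: int, itemsize: int, copies: int = 2) -> int:
    """Estimate the bytes a replay buffer spends on observations.

    copies=2 is an assumption: one array for obs and one for next_obs.
    """
    return buffer_size * obs_dim * itemsize * copies


# Hypothetical settings, for illustration only.
buffer_size = 1_000_000
obs_dim = 17

bytes32 = observation_storage_bytes(buffer_size, obs_dim, FLOAT32_BYTES)
bytes64 = observation_storage_bytes(buffer_size, obs_dim, FLOAT64_BYTES)

print(f"float32 observations: {bytes32 / 1e6:.0f} MB")
print(f"float64 observations: {bytes64 / 1e6:.0f} MB")
```

Under these assumptions, storing observations as float64 takes exactly twice the memory of float32, which is the overhead the conversion avoids.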

jamartinh commented 10 months ago

OK, I just wanted to verify why I was getting warnings with Gymnasium 1.0.0rc and an assertion error in Minari.