vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
http://docs.cleanrl.dev
Other
5.26k stars 602 forks

Handle truncation properly with PPO #311

Closed vwxyzjn closed 1 year ago

vwxyzjn commented 1 year ago

Description

This PR is a prototype for #198 and https://github.com/sail-sg/envpool/issues/194

Checklist:

If you are adding new algorithms or your change could result in performance difference, you may need to (re-)run tracked experiments. See https://github.com/vwxyzjn/cleanrl/pull/137 as an example PR.

vercel[bot] commented 1 year ago

The latest updates on your projects.

Name: cleanrl
Status: ✅ Ready
Updated: Nov 12, 2022 at 1:55AM (UTC)

vwxyzjn commented 1 year ago

Due to the lack of performance improvement, temporarily closing this PR.

KaleabTessera commented 10 months ago

Shouldn't this use terminal_observation to handle truncation correctly?
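For readers following the question above, here is a minimal sketch of what "handling truncation" could look like in a GAE computation. It is not the code from this PR. It assumes a single environment and that the value of the pre-reset observation is available for truncated steps. Auto-resetting vectorized environments typically expose that observation under an info key such as `terminal_observation`. The function name `gae_with_truncation` and the `final_values` argument are hypothetical. Also note the indexing convention: `terminated[t]` and `truncated[t]` describe the transition taken at step `t`, which may differ from the `dones` convention used in CleanRL's PPO scripts.

```python
import numpy as np


def gae_with_truncation(rewards, values, terminated, truncated,
                        final_values, last_value,
                        gamma=0.99, gae_lambda=0.95):
    """Compute GAE advantages and returns for one rollout of length T.

    rewards[t], values[t]: reward and V(s_t) at step t.
    terminated[t]: True if step t reached a true terminal state.
    truncated[t]:  True if step t was cut off (e.g. by a time limit).
    final_values[t]: V(pre-reset observation) for truncated steps (assumed
                     to be computed by the caller); ignored elsewhere.
    last_value: V(s_T), used to bootstrap past the end of the rollout.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    T = len(rewards)
    advantages = np.zeros(T)
    last_gae = 0.0
    for t in reversed(range(T)):
        if terminated[t]:
            next_value = 0.0              # true terminal: no future return
        elif truncated[t]:
            next_value = final_values[t]  # cut off: bootstrap from V(final obs)
        elif t == T - 1:
            next_value = last_value       # end of rollout: bootstrap from V(s_T)
        else:
            next_value = values[t + 1]
        delta = rewards[t] + gamma * next_value - values[t]
        # Never propagate the advantage across an episode boundary.
        episode_ended = bool(terminated[t]) or bool(truncated[t])
        last_gae = delta + gamma * gae_lambda * (0.0 if episode_ended else 1.0) * last_gae
        advantages[t] = last_gae
    return advantages, advantages + values
```

The only difference from treating every `done` as terminal is the `truncated` branch: the one-step target still includes a bootstrapped value, so a time limit is not mistaken for a zero-return terminal state.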