vwxyzjn/cleanrl
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
http://docs.cleanrl.dev
Other
5.54k stars, 631 forks
issues
DQN + Atari + JAX
#231
yooceii
closed
2 years ago
5
Dqn atari jax
#230
yooceii
closed
2 years ago
1
JAX + DDPG docs fix
#229
vwxyzjn
closed
2 years ago
1
Hyperparameter optimization
#228
vwxyzjn
closed
2 years ago
11
PPO + JAX + EnvPool + Atari
#227
vwxyzjn
closed
2 years ago
5
Bump ujson from 5.1.0 to 5.4.0
#226
dependabot[bot]
closed
2 years ago
2
JAX TD3 prototype
#225
joaogui1
closed
2 years ago
3
Jax c51 contrib
#224
kinalmehta
closed
1 year ago
9
single env PPO impl
#223
gauravkuppa
closed
2 years ago
2
prototype jax with dqn
#222
kinalmehta
closed
2 years ago
19
JAX + C51
#221
vwxyzjn
closed
1 year ago
0
JAX + DQN
#220
vwxyzjn
closed
2 years ago
1
JAX + TD3
#219
vwxyzjn
closed
2 years ago
1
JAX Integration with CleanRL
#218
vwxyzjn
closed
1 year ago
0
PPO + JAX + EnvPool + MuJoCo
#217
vwxyzjn
opened
2 years ago
15
Prototype TD3 with JAX
#216
vwxyzjn
closed
2 years ago
1
PPO with Humanoid
#215
AdityaGudimella
closed
2 years ago
2
Remove pettingzoo's pistonball example
#214
vwxyzjn
closed
2 years ago
1
Fix documentation link
#213
vwxyzjn
closed
2 years ago
1
Average PPO implementation
#212
Howuhh
closed
1 year ago
31
Td3 ddpg action bound fix
#211
dosssman
closed
2 years ago
11
Adding Average Reward PPO proposal
#210
Howuhh
closed
10 months ago
3
added gamma to reward normalization wrappers
#209
Howuhh
closed
2 years ago
10
Remove the value function clipping
#208
vwxyzjn
closed
10 months ago
0
Removing the regular advantage calculation in PPO
#207
vwxyzjn
closed
2 years ago
2
PPO improvements
#206
vwxyzjn
closed
10 months ago
0
Show correct exception cause
#205
cool-RR
closed
2 years ago
2
ppo with timeout handling
#204
Howuhh
closed
1 year ago
13
PPO reward normalization works only for default gamma
#203
Howuhh
closed
2 years ago
3
License issues
#202
vwxyzjn
closed
10 months ago
3
Add license scan report and status
#201
fossabot
closed
2 years ago
2
Clarify CleanRL is a non-modular library
#200
vwxyzjn
closed
2 years ago
3
Add a note on PPG's performance
#199
vwxyzjn
closed
2 years ago
2
PPO timeout proper handling
#198
Howuhh
opened
2 years ago
12
CleanRL can't be used by importing?
#197
cool-RR
closed
2 years ago
2
DDPG/TD3 target_actor output clip
#196
huxiao09
closed
2 years ago
19
Bump cookiecutter from 1.7.3 to 2.1.1
#195
dependabot[bot]
closed
2 years ago
2
1.0.0 Beta Release
#194
vwxyzjn
closed
2 years ago
1
Fix the implemented variants section in PPO
#193
vwxyzjn
closed
2 years ago
1
Fix documentation links in README.md
#192
vwxyzjn
closed
2 years ago
1
Broken links in readme
#191
cool-RR
closed
2 years ago
2
Temporarily Remove PPO-RND
#190
vwxyzjn
closed
2 years ago
1
Improve documentation and contribution guide
#189
vwxyzjn
closed
2 years ago
2
Support Pettingzoo Multi-agent Atari envs with PPO
#188
vwxyzjn
closed
2 years ago
7
prototype jax with ddpg
#187
vwxyzjn
closed
2 years ago
8
Match PPG implementation
#186
dipamc
closed
2 years ago
3
[Request] Dexterous Manipulation Benchmark
#185
kevinzakka
closed
2 years ago
8
Adopt or not to adopt tensorboard native hyperparameters recording
#184
vwxyzjn
opened
2 years ago
1
Add pre-commit for Markdown formatting
#183
vwxyzjn
opened
2 years ago
0
multi-gpu implementation for PPO
#182
hlsfin
closed
2 years ago
1