vwxyzjn cleanrl issues - Githubissues

vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

http://docs.cleanrl.dev

Other

5.54k stars, 631 forks

issues


DQN + Atari + JAX

#231 yooceii closed 2 years ago
5
Dqn atari jax

#230 yooceii closed 2 years ago
1
JAX + DDPG docs fix

#229 vwxyzjn closed 2 years ago
1
Hyperparameter optimization

#228 vwxyzjn closed 2 years ago
11
PPO + JAX + EnvPool + Atari

#227 vwxyzjn closed 2 years ago
5
Bump ujson from 5.1.0 to 5.4.0

#226 dependabot[bot] closed 2 years ago
2
JAX TD3 prototype

#225 joaogui1 closed 2 years ago
3
Jax c51 contrib

#224 kinalmehta closed 1 year ago
9
single env PPO impl

#223 gauravkuppa closed 2 years ago
2
prototype jax with dqn

#222 kinalmehta closed 2 years ago
19
JAX + C51

#221 vwxyzjn closed 1 year ago
0
JAX + DQN

#220 vwxyzjn closed 2 years ago
1
JAX + TD3

#219 vwxyzjn closed 2 years ago
1
JAX Integration with CleanRL

#218 vwxyzjn closed 1 year ago
0
PPO + JAX + EnvPool + MuJoCo

#217 vwxyzjn opened 2 years ago
15
Prototype TD3 with JAX

#216 vwxyzjn closed 2 years ago
1
PPO with Humanoid

#215 AdityaGudimella closed 2 years ago
2
Remove pettingzoo's pistonball example

#214 vwxyzjn closed 2 years ago
1
Fix documentation link

#213 vwxyzjn closed 2 years ago
1
Average PPO implementation

#212 Howuhh closed 1 year ago
31
Td3 ddpg action bound fix

#211 dosssman closed 2 years ago
11
Adding Average Reward PPO proposal

#210 Howuhh closed 10 months ago
3
added gamma to reward normalization wrappers

#209 Howuhh closed 2 years ago
10
Remove the value function clipping

#208 vwxyzjn closed 10 months ago
0
Removing the regular advantage calculation in PPO

#207 vwxyzjn closed 2 years ago
2
PPO improvements

#206 vwxyzjn closed 10 months ago
0
Show correct exception cause

#205 cool-RR closed 2 years ago
2
ppo with timeout handling

#204 Howuhh closed 1 year ago
13
PPO reward normalization works only for default gamma

#203 Howuhh closed 2 years ago
3
License issues

#202 vwxyzjn closed 10 months ago
3
Add license scan report and status

#201 fossabot closed 2 years ago
2
Clarify CleanRL is a non-modular library

#200 vwxyzjn closed 2 years ago
3
Add a note on PPG's performance

#199 vwxyzjn closed 2 years ago
2
PPO timeout proper handling

#198 Howuhh opened 2 years ago
12
CleanRL can't be used by importing?

#197 cool-RR closed 2 years ago
2
DDPG/TD3 target_actor output clip

#196 huxiao09 closed 2 years ago
19
Bump cookiecutter from 1.7.3 to 2.1.1

#195 dependabot[bot] closed 2 years ago
2
1.0.0 Beta Release

#194 vwxyzjn closed 2 years ago
1
Fix the implemented variants section in PPO

#193 vwxyzjn closed 2 years ago
1
Fix documentation links in README.md

#192 vwxyzjn closed 2 years ago
1
Broken links in readme

#191 cool-RR closed 2 years ago
2
Temporarily Remove PPO-RND

#190 vwxyzjn closed 2 years ago
1
Improve documentation and contribution guide

#189 vwxyzjn closed 2 years ago
2
Support Pettingzoo Multi-agent Atari envs with PPO

#188 vwxyzjn closed 2 years ago
7
prototype jax with ddpg

#187 vwxyzjn closed 2 years ago
8
Match PPG implementation

#186 dipamc closed 2 years ago
3
[Request] Dexterous Manipulation Benchmark

#185 kevinzakka closed 2 years ago
8
Adopt or not to adopt tensorboard native hyperparameters recording

#184 vwxyzjn opened 2 years ago
1
Add pre-commit for Markdown formatting

#183 vwxyzjn opened 2 years ago
0
multi-gpu implementation for PPO

#182 hlsfin closed 2 years ago
1
