vwxyzjn/cleanrl
High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)
http://docs.cleanrl.dev
Other
5.02k stars
575 forks
issues
Target network isn't updated to the correct frequency when `target_network_frequency % train_frequency != 0`
#322
qgallouedec
closed
1 year ago
0
Torchx integration
#321
vwxyzjn
closed
1 year ago
2
Implement Gymnasium-compliant PPO script
#320
dtch1997
closed
1 year ago
19
Implement Gymnasium-compliant PPO
#319
dtch1997
closed
1 year ago
5
Implement Gymnasium-compliant PPO script
#318
dtch1997
closed
1 year ago
8
Benchmark `dqn_jax.py` using CPU only
#317
vwxyzjn
closed
1 year ago
0
Update cleanrl-supported-papers-projects.md
#316
masud99r
closed
1 year ago
2
v1.0.0 blog
#315
vwxyzjn
closed
1 year ago
3
Prepare for v1.0.0 release
#314
vwxyzjn
closed
1 year ago
1
Brax + PPO integration
#313
vwxyzjn
opened
1 year ago
2
Remove unindented test script
#312
vwxyzjn
closed
1 year ago
1
Handle truncation properly with PPO
#311
vwxyzjn
closed
1 year ago
3
Why is there no design evaluation and save model module?
#310
madlsj
opened
1 year ago
22
DDPG JAX breaks with python ~3.7
#309
vwxyzjn
closed
8 months ago
2
Auto wandb tag with `benchmark.py`
#308
vwxyzjn
closed
1 year ago
1
Prototype RLops Utility
#307
vwxyzjn
closed
1 year ago
3
Proof-of-concept: Faster PyTorch
#306
DavidSlayback
closed
1 year ago
5
unable to render video in gitpod
#305
tatakof
closed
1 year ago
0
SAC Implementation Details
#304
araffin
opened
1 year ago
0
cuda with SAC
#303
WillDudley
closed
1 year ago
1
Depreciate legacy constructor so that cuda can be used
#302
WillDudley
closed
1 year ago
2
Add a note on Gymnasium
#301
vwxyzjn
closed
1 year ago
1
SAC jax
#300
araffin
opened
1 year ago
17
fix: ddpg action bias
#299
sdpkjc
closed
1 year ago
5
Stop adding action bias twice in DDPG jax
#298
joaogui1
closed
1 year ago
3
Action bias is added twice in DDPG algorithm implementation, similar to #259
#297
sdpkjc
closed
1 year ago
0
RLops Guide
#296
vwxyzjn
closed
1 year ago
1
Upload the ubuntu image for github action
#294
vwxyzjn
closed
1 year ago
1
Type hints
#293
timoklein
opened
1 year ago
1
Huggingface Integration
#292
vwxyzjn
closed
1 year ago
22
Supported papers
#291
vwxyzjn
closed
1 year ago
2
ppo+lstm train continuous environments
#290
1900360
closed
1 year ago
8
Re-benchmarking refactored algorithms
#289
jbuckman
closed
1 year ago
1
Point offline RL users to tinkoff-ai/CORL
#288
vwxyzjn
closed
1 year ago
6
Remove the unnecessary regular advantage code in PPO
#287
bragajj
closed
1 year ago
3
V1.0.0b2
#286
vwxyzjn
closed
1 year ago
1
TD3 jax fix
#285
joaogui1
closed
1 year ago
3
Update published paper citation
#284
vwxyzjn
closed
1 year ago
1
Requirments - requirements-pettingzoo.txt
#283
Matkicail
closed
1 year ago
3
Fix typos
#282
ALPH2H
closed
1 year ago
1
TD3: fixed dimension of clipped_noise for target actions, added noise …
#281
dosssman
closed
1 year ago
11
Problem with multi-agent atari
#280
Matkicail
closed
1 year ago
2
TD3 policy noise bugs
#279
tomjur
closed
1 year ago
2
Algorithm: Option Critic methods
#278
DavidSlayback
opened
1 year ago
1
Update to support Gymnasium
#277
arjun-kg
closed
1 year ago
14
Are you interested in PRs for improvements in performance of PPO LSTM script?
#276
thomasbbrunner
opened
1 year ago
3
Bump oauthlib from 3.2.0 to 3.2.1 in /requirements
#275
dependabot[bot]
closed
1 year ago
2
Bump oauthlib from 3.2.0 to 3.2.1
#274
dependabot[bot]
closed
1 year ago
2
Bump mako from 1.2.1 to 1.2.2
#273
dependabot[bot]
closed
1 year ago
2
Draft: DroQ and TD3+TQC jax implementation
#272
araffin
opened
1 year ago
6