vwxyzjn / cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

http://docs.cleanrl.dev

Other

5.02k stars, 575 forks

issues

Target network isn't updated to the correct frequency when `target_network_frequency % train_frequency != 0`

#322 qgallouedec closed 1 year ago
0
Torchx integration

#321 vwxyzjn closed 1 year ago
2
Implement Gymnasium-compliant PPO script

#320 dtch1997 closed 1 year ago
19
Implement Gymnasium-compliant PPO

#319 dtch1997 closed 1 year ago
5
Implement Gymnasium-compliant PPO script

#318 dtch1997 closed 1 year ago
8
Benchmark `dqn_jax.py` using CPU only

#317 vwxyzjn closed 1 year ago
0
Update cleanrl-supported-papers-projects.md

#316 masud99r closed 1 year ago
2
v1.0.0 blog

#315 vwxyzjn closed 1 year ago
3
Prepare for v1.0.0 release

#314 vwxyzjn closed 1 year ago
1
Brax + PPO integration

#313 vwxyzjn opened 1 year ago
2
Remove unindented test script

#312 vwxyzjn closed 1 year ago
1
Handle truncation properly with PPO

#311 vwxyzjn closed 1 year ago
3
Why is there no design evaluation and save model module?

#310 madlsj opened 1 year ago
22
DDPG JAX breaks with python ~3.7

#309 vwxyzjn closed 8 months ago
2
Auto wandb tag with `benchmark.py`

#308 vwxyzjn closed 1 year ago
1
Prototype RLops Utility

#307 vwxyzjn closed 1 year ago
3
Proof-of-concept: Faster PyTorch

#306 DavidSlayback closed 1 year ago
5
unable to render video in gitpod

#305 tatakof closed 1 year ago
0
SAC Implementation Details

#304 araffin opened 1 year ago
0
cuda with SAC

#303 WillDudley closed 1 year ago
1
Deprecate legacy constructor so that cuda can be used

#302 WillDudley closed 1 year ago
2
Add a note on Gymnasium

#301 vwxyzjn closed 1 year ago
1
SAC jax

#300 araffin opened 1 year ago
17
fix: ddpg action bias

#299 sdpkjc closed 1 year ago
5
Stop adding action bias twice in DDPG jax

#298 joaogui1 closed 1 year ago
3
Action bias is added twice in DDPG algorithm implementation, similar to #259

#297 sdpkjc closed 1 year ago
0
RLops Guide

#296 vwxyzjn closed 1 year ago
1
Upload the ubuntu image for github action

#294 vwxyzjn closed 1 year ago
1
Type hints

#293 timoklein opened 1 year ago
1
Huggingface Integration

#292 vwxyzjn closed 1 year ago
22
Supported papers

#291 vwxyzjn closed 1 year ago
2
ppo+lstm train continuous environments

#290 1900360 closed 1 year ago
8
Re-benchmarking refactored algorithms

#289 jbuckman closed 1 year ago
1
Point offline RL users to tinkoff-ai/CORL

#288 vwxyzjn closed 1 year ago
6
Remove the unnecessary regular advantage code in PPO

#287 bragajj closed 1 year ago
3
V1.0.0b2

#286 vwxyzjn closed 1 year ago
1
TD3 jax fix

#285 joaogui1 closed 1 year ago
3
Update published paper citation

#284 vwxyzjn closed 1 year ago
1
Requirements - requirements-pettingzoo.txt

#283 Matkicail closed 1 year ago
3
Fix typos

#282 ALPH2H closed 1 year ago
1
TD3: fixed dimension of clipped_noise for target actions, added noise …

#281 dosssman closed 1 year ago
11
Problem with multi-agent atari

#280 Matkicail closed 1 year ago
2
TD3 policy noise bugs

#279 tomjur closed 1 year ago
2
Algorithm: Option Critic methods

#278 DavidSlayback opened 1 year ago
1
Update to support Gymnasium

#277 arjun-kg closed 1 year ago
14
Are you interested in PRs for improvements in performance of PPO LSTM script?

#276 thomasbbrunner opened 1 year ago
3
Bump oauthlib from 3.2.0 to 3.2.1 in /requirements

#275 dependabot[bot] closed 1 year ago
2
Bump oauthlib from 3.2.0 to 3.2.1

#274 dependabot[bot] closed 1 year ago
2
Bump mako from 1.2.1 to 1.2.2

#273 dependabot[bot] closed 1 year ago
2
Draft: DroQ and TD3+TQC jax implementation

#272 araffin opened 1 year ago
6
