DQN + Atari + JAX - Githubissues

yooceii commented 2 years ago

Description

Types of changes

[ ] Bug fix
[x] New feature
[ ] New algorithm
[ ] Documentation

Checklist:

[x] I've read the CONTRIBUTION guide (required).
[x] I have ensured pre-commit run --all-files passes (required).
[ ] I have updated the documentation and previewed the changes via mkdocs serve.
[ ] I have updated the tests accordingly (if applicable).

If you are adding new algorithms or your change could result in performance difference, you may need to (re-)run tracked experiments. See https://github.com/vwxyzjn/cleanrl/pull/137 as an example PR.

[x] I have contacted vwxyzjn to obtain access to the openrlbenchmark W&B team (required).
[ ] I have tracked applicable experiments in openrlbenchmark/cleanrl with --capture-video flag toggled on (required).
[ ] I have added additional documentation and previewed the changes via mkdocs serve.
- [ ] I have explained note-worthy implementation details.
- [ ] I have explained the logged metrics.
- [ ] I have added links to the original paper and related papers (if applicable).
- [ ] I have added links to the PR related to the algorithm.
- [x] I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
- [x] I have added the learning curves (in PNG format with width=500 and height=300).
- [x] I have added links to the tracked experiments.
[ ] I have updated the tests accordingly (if applicable).

vercel[bot] commented 2 years ago

The latest updates on your projects.

Name	Status	Preview	Updated
cleanrl	✅ Ready (Inspect)	Visit Preview	Jul 26, 2022 at 6:09AM (UTC)

yooceii commented 2 years ago

The JAX version gains roughly 40% in SPS (steps per second) while matching episodic return (epi_ret).

Full report here: https://wandb.ai/yooceii/dqn-atari-jax/reports/DQN-JAX--VmlldzoyMjkzMTM2

yooceii commented 2 years ago

Looks like combining linear_schedule and select_action and jitting the result does give slightly better performance. It now gains roughly 50% more SPS. @vwxyzjn The W&B report is also updated.
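For readers following along, here is a minimal sketch of what folding the epsilon schedule into a jitted action-selection function could look like. It is not the code from this PR, which the thread does not show. The names linear_schedule and select_action come from the comment above; q_values_fn, the params layout, and the schedule constants (1.0, 0.01, 100_000) are assumptions made for illustration.

```python
import jax
import jax.numpy as jnp


def linear_schedule(start_e, end_e, duration, t):
    """Linearly anneal epsilon from start_e to end_e over `duration` steps."""
    slope = (end_e - start_e) / duration
    return jnp.maximum(slope * t + start_e, end_e)


def q_values_fn(params, obs):
    """Hypothetical stand-in for the Q-network: a single linear layer."""
    return obs @ params["w"] + params["b"]


@jax.jit
def select_action(params, obs, key, global_step):
    """Epsilon-greedy selection with the schedule evaluated inside the jitted function."""
    key, eps_key, act_key = jax.random.split(key, 3)
    # Assumed schedule constants; they are baked in at trace time.
    epsilon = linear_schedule(1.0, 0.01, 100_000, global_step)
    q_values = q_values_fn(params, obs)
    greedy = jnp.argmax(q_values, axis=-1)
    random_actions = jax.random.randint(
        act_key, greedy.shape, 0, q_values.shape[-1]
    )
    # One uniform draw per environment decides whether to explore.
    explore = jax.random.uniform(eps_key, greedy.shape) < epsilon
    actions = jnp.where(explore, random_actions, greedy)
    return actions, key


# Usage: a batch of 2 observations with 4 features and 3 actions.
params = {"w": jnp.zeros((4, 3)), "b": jnp.array([0.0, 1.0, 0.0])}
obs = jnp.ones((2, 4))
key = jax.random.PRNGKey(0)
actions, new_key = select_action(params, obs, key, 50_000)
```

Because the schedule, the random draws, and the argmax all sit in one compiled function, each environment step costs a single dispatch instead of several small Python-side operations. That is one plausible source of the SPS gain reported above, though the thread does not break it down.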

vwxyzjn commented 2 years ago

This is awesome work. Thank you! @kinalmehta would you mind including the jitted action sampling function in #222? I think this will be the last thing before we merge. Since this is a non-breaking change for #222, we don't need to re-run the benchmark.

vwxyzjn commented 2 years ago

Closed in favor of https://github.com/vwxyzjn/cleanrl/pull/222

vwxyzjn / cleanrl

DQN + Atari + JAX #231
