Added TQC - Githubissues

AdityaGudimella commented 2 years ago

Description

Closes #258. Implement Truncated Quantile Critics

Types of changes

[ ] Bug fix
[ ] New feature
[x] New algorithm
[ ] Documentation

Checklist:

[x] I've read the CONTRIBUTION guide (required).
[x] I have ensured pre-commit run --all-files passes (required).
[ ] I have updated the documentation and previewed the changes via mkdocs serve.
[ ] I have updated the tests accordingly (if applicable).

If you are adding new algorithms or your change could result in performance difference, you may need to (re-)run tracked experiments. See https://github.com/vwxyzjn/cleanrl/pull/137 as an example PR.

[ ] I have contacted vwxyzjn to obtain access to the openrlbenchmark W&B team (required).
[ ] I have tracked applicable experiments in openrlbenchmark/cleanrl with --capture-video flag toggled on (required).
[ ] I have added additional documentation and previewed the changes via mkdocs serve.
- [ ] I have explained note-worthy implementation details.
- [ ] I have explained the logged metrics.
- [ ] I have added links to the original paper and related papers (if applicable).
- [ ] I have added links to the PR related to the algorithm.
- [ ] I have created a table comparing my results against those from reputable sources (i.e., the original paper or other reference implementation).
- [ ] I have added the learning curves (in PNG format with width=500 and height=300).
- [ ] I have added links to the tracked experiments.
- [ ] I have updated the overview sections at the docs and the repo
[ ] I have updated the tests accordingly (if applicable).

vercel[bot] commented 2 years ago

The latest updates on your projects.

Name	Status	Preview	Updated
cleanrl	✅ Ready (Inspect)	Visit Preview	Aug 26, 2022 at 4:56PM (UTC)

AdityaGudimella commented 2 years ago

I can run the algo against the same MuJoCo envs as in the paper. Would one run per env be sufficient? Also, how do I share the results of the runs with you?

vwxyzjn commented 2 years ago

Thank you @AdityaGudimella. The variant looks good. I suggest running some preliminary experiments in your own wandb namespace and creating a wandb report to share the findings.

AdityaGudimella commented 2 years ago

Apologies for the delay on this. I ran 2 trials each of Hopper-v3, Humanoid-v3 and Swimmer-v3, and didn't have any resources available to run experiments after that. I've just started 2 trials each of HalfCheetah-v3, Ant-v3 and Walker2d-v3. Once those experiments are done (probably in 2 days), I will share the report here.

vwxyzjn commented 1 year ago

That sounds good. Feel free to ping me when this is ready for review or if there is anything I can help with.

vwxyzjn / cleanrl

Added TQC #262
