Base classes for Off Policy Agents

SforAiDl / genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL

https://genrl.readthedocs.io

MIT License

Base classes for Off Policy Agents #169

Closed by Sharad24 4 years ago

Sharad24 commented 4 years ago

This should make the code much more comprehensible, especially given the number of arguments we have, and at the same time resolve a lot of maintainability issues.

sampreet-arthi commented 4 years ago

There are also a lot of code duplication issues that show up in pylint. Maybe work on those too.

Sharad24 commented 4 years ago

Yup, that's the goal. Another thing to do would be to properly decide which parameters should be kept in or removed from the agents, and which should be added to or removed from the trainers.
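One way to picture that split, as a rough sketch only: hyperparameters that define the algorithm stay with the agent, while settings that only control the run move to the trainer. Every name below (`AgentConfig`, `TrainerConfig`, and all of the fields) is hypothetical and chosen for illustration; none of it is taken from the genrl codebase.

```python
from dataclasses import dataclass


@dataclass
class AgentConfig:
    """Hypothetical: hyperparameters that define the algorithm itself."""
    gamma: float = 0.99
    lr_policy: float = 1e-3
    lr_value: float = 1e-3
    replay_size: int = 100_000
    batch_size: int = 64


@dataclass
class TrainerConfig:
    """Hypothetical: settings that only control the training run."""
    epochs: int = 10
    max_timesteps: int = 1000
    warmup_steps: int = 500
    log_interval: int = 1


class Agent:
    """Placeholder agent that only stores its own config."""

    def __init__(self, config: AgentConfig):
        self.config = config


class Trainer:
    """Placeholder trainer that owns run settings and wraps an agent."""

    def __init__(self, agent: Agent, config: TrainerConfig):
        self.agent = agent
        self.config = config


# The trainer never sees algorithm hyperparameters directly, only via the agent.
trainer = Trainer(Agent(AgentConfig(gamma=0.95)), TrainerConfig(epochs=5))
```

With a split like this, changing the number of epochs would not touch the agent's signature, and adding an algorithm hyperparameter would not touch the trainer's.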

sampreet-arthi commented 4 years ago

Since we already have the On Policy Agents, I'm renaming this to Off Policy. I'll raise a PR for this soon.

sampreet-arthi commented 4 years ago

I'm thinking of refactoring each of the individual off policy algorithms first so that the code is neater, more uniform and shorter.

sampreet-arthi commented 4 years ago

To-do:

[x] Refactor DDPG
[x] Refactor TD3
[x] Refactor SAC
[x] Finalise BaseAgent and Base OffPolicyAgent classes
[ ] Refactor Trainer and OffPolicyTrainer (the Trainer classes are also really long; it would be a good idea to shorten them first, and then separate them into multiple files if they're still too big)
[ ] Add CUDA support for all of them
[ ] Add support for Prioritized Experience Replay for all Off Policy algos
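
For readers unfamiliar with the pattern, the BaseAgent / OffPolicyAgent item above could look roughly like the following. This is a minimal sketch under stated assumptions, not the classes that were merged: only the names `BaseAgent` and `OffPolicyAgent` come from this thread, while the constructor arguments, method names, the `deque`-based buffer, and the toy `RandomOffPolicyAgent` subclass are all hypothetical. It uses plain Python rather than PyTorch so that it runs on its own.

```python
import random
from abc import ABC, abstractmethod
from collections import deque


class BaseAgent(ABC):
    """Arguments shared by every agent are declared once (hypothetical set)."""

    def __init__(self, env, gamma=0.99, batch_size=64, seed=None):
        self.env = env
        self.gamma = gamma
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    @abstractmethod
    def select_action(self, state):
        """Return an action for the given state."""

    @abstractmethod
    def update_params(self):
        """Run one learning update."""


class OffPolicyAgent(BaseAgent):
    """Adds what off-policy agents share: a replay buffer and sampling."""

    def __init__(self, env, replay_size=10_000, **kwargs):
        super().__init__(env, **kwargs)
        # Bounded FIFO buffer: the oldest transitions are dropped when full.
        self.replay_buffer = deque(maxlen=replay_size)

    def store(self, transition):
        self.replay_buffer.append(transition)

    def sample_batch(self):
        # Uniform sampling without replacement, capped at the buffer size.
        k = min(self.batch_size, len(self.replay_buffer))
        return self.rng.sample(list(self.replay_buffer), k)


class RandomOffPolicyAgent(OffPolicyAgent):
    """Toy subclass standing in for DDPG/TD3/SAC; it learns nothing."""

    def __init__(self, env, n_actions=2, **kwargs):
        super().__init__(env, **kwargs)
        self.n_actions = n_actions
        self.updates = 0

    def select_action(self, state):
        return self.rng.randrange(self.n_actions)

    def update_params(self):
        batch = self.sample_batch()
        self.updates += 1
        return len(batch)


# Usage: env is unused by the toy agent, so None is passed in its place.
agent = RandomOffPolicyAgent(env=None, batch_size=4, replay_size=8, seed=0)
for t in range(10):
    agent.store((t, agent.select_action(t), 0.0, t + 1, False))
```

In a layout like this, each concrete algorithm would only define what is specific to it (its networks and its update rule), which is where the duplication flagged by pylint would be expected to shrink.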

Sharad24 commented 4 years ago

Tracking in separate issues now: #263, #162 and #264.