SforAiDl / genrl

A PyTorch reinforcement learning library for generalizable and reproducible algorithm implementations with an aim to improve accessibility in RL

https://genrl.readthedocs.io

MIT License

[WIP] Added BCQ #378

Open sampreet-arthi opened 3 years ago

sampreet-arthi commented 3 years ago

Stuff implemented:

Added BCQ under genrl/agents/offline.
BCQ inherits from OffPolicyAgentAC. The architecture is very similar to TD3; the major differences are that the actor takes both state and action as input, and that BCQ adds a VAE.
OfflineTrainer inherits from OffPolicyTrainer. The only difference is that it loads the buffer.
Refactored buffers and rollouts to inherit from BaseBuffer, removed redundant functions, and converted all code to torch. None of the buffer files use numpy now.
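
To illustrate the torch-only buffer and the "load the buffer" step, here is a minimal sketch. It is not the code in this PR: `TensorReplayBuffer`, `Batch`, and the `save`/`load` methods are hypothetical names chosen for the example, and the on-disk format (a dict written with `torch.save`) is an assumption.

```python
import os
import tempfile
from typing import NamedTuple

import torch


class Batch(NamedTuple):
    states: torch.Tensor
    actions: torch.Tensor
    rewards: torch.Tensor
    next_states: torch.Tensor
    dones: torch.Tensor


class TensorReplayBuffer:
    """Hypothetical fixed-size replay buffer that stores everything as torch tensors."""

    def __init__(self, capacity: int, state_dim: int, action_dim: int):
        self.capacity = capacity
        self.size = 0  # number of valid transitions stored so far
        self.pos = 0   # next write index (wraps around when full)
        self.data = Batch(
            states=torch.zeros(capacity, state_dim),
            actions=torch.zeros(capacity, action_dim),
            rewards=torch.zeros(capacity, 1),
            next_states=torch.zeros(capacity, state_dim),
            dones=torch.zeros(capacity, 1),
        )

    def push(self, state, action, reward, next_state, done) -> None:
        values = (state, action, reward, next_state, done)
        for storage, value in zip(self.data, values):
            storage[self.pos] = torch.as_tensor(value, dtype=torch.float32)
        self.pos = (self.pos + 1) % self.capacity
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size: int) -> Batch:
        # Uniform sampling with replacement over the valid entries.
        idx = torch.randint(0, self.size, (batch_size,))
        return Batch(*(storage[idx] for storage in self.data))

    def save(self, path: str) -> None:
        state = {"size": self.size, "pos": self.pos, "data": dict(self.data._asdict())}
        torch.save(state, path)

    def load(self, path: str) -> None:
        # This is the only extra step an offline trainer needs before training.
        checkpoint = torch.load(path)
        self.size, self.pos = checkpoint["size"], checkpoint["pos"]
        self.data = Batch(**checkpoint["data"])
        self.capacity = self.data.states.shape[0]


# Toy usage: 100 random transitions with Pendulum-like shapes (3-dim state, 1-dim action).
buffer = TensorReplayBuffer(capacity=100, state_dim=3, action_dim=1)
for _ in range(100):
    buffer.push(torch.randn(3), torch.rand(1) * 4 - 2, -1.0, torch.randn(3), False)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_buffer.pt")
    buffer.save(path)
    loaded = TensorReplayBuffer(capacity=1, state_dim=3, action_dim=1)
    loaded.load(path)

batch = loaded.sample(32)
```

Keeping every field as a preallocated tensor means sampling is a single index operation per field, with no numpy round trip.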

Stuff to do:

[x] Not yet properly tested. So far I have only created a toy replay buffer from DDPG on Pendulum-v0 with just 100 experiences.
[ ] Find a simple way to make the actor take both state and action as input.
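
One simple option for the second item is to concatenate state and action before the first layer, as the perturbation network in the original BCQ formulation does. The sketch below is an assumption about how that could look, not this PR's code: `PerturbationActor`, `hidden`, and the default `phi=0.05` are illustrative choices.

```python
import torch
import torch.nn as nn


class PerturbationActor(nn.Module):
    """Hypothetical BCQ-style actor: nudges a candidate action instead of proposing one."""

    def __init__(self, state_dim: int, action_dim: int,
                 max_action: float = 1.0, phi: float = 0.05, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),  # input is [state, action]
            nn.ReLU(),
            nn.Linear(hidden, action_dim),
            nn.Tanh(),  # bounds the raw perturbation to [-1, 1]
        )
        self.max_action = max_action
        self.phi = phi  # maximum perturbation as a fraction of max_action

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        joint = torch.cat([state, action], dim=-1)
        perturbation = self.phi * self.max_action * self.net(joint)
        return (action + perturbation).clamp(-self.max_action, self.max_action)


# Toy usage with Pendulum-like shapes: 3-dim state, 1-dim action in [-2, 2].
actor = PerturbationActor(state_dim=3, action_dim=1, max_action=2.0)
state = torch.randn(8, 3)
action = torch.rand(8, 1) * 4 - 2  # stands in for actions decoded by the VAE
out = actor(state, action)
```

Because the only change is the extra `action` argument and the concatenation, an existing TD3-style actor body could probably be reused with a wider input layer.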

sampreet-arthi commented 3 years ago

The buffers have been tested, but not since BCQ was added, so tests are failing right now.

lgtm-com[bot] commented 3 years ago

This pull request introduces 3 alerts when merging 6c271efd645828be529cbc43d2ddad248b199d86 into b8a45ab7fd058d120acd058ddead1db77c9bb616 - view on LGTM.com

new alerts:

1 for Unused local variable
1 for Unused import
1 for Wrong number of arguments in a call

codecov[bot] commented 3 years ago

Codecov Report

Merging #378 into master will decrease coverage by 2.76%. The diff coverage is 58.76%.

@@            Coverage Diff             @@
##           master     #378      +/-   ##
==========================================
- Coverage   91.28%   88.51%   -2.77%     
==========================================
  Files          90       93       +3     
  Lines        3809     3944     +135     
==========================================
+ Hits         3477     3491      +14     
- Misses        332      453     +121
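
The percentages in the diff follow directly from the hit and line counts. A quick check, using only the numbers reported above (the `coverage` helper is just for illustration):

```python
def coverage(hits: int, lines: int) -> float:
    """Line coverage as a percentage."""
    return 100.0 * hits / lines


master = coverage(3477, 3809)  # before the PR
pr = coverage(3491, 3944)      # after the PR
delta = pr - master

print(f"master: {master:.2f}%  PR: {pr:.2f}%  change: {delta:+.2f}")
```

The PR adds 135 lines but only 14 new hits, which is why overall coverage drops even though nothing previously covered was lost. The 2.76% in the summary line and the -2.77% in the table differ only in the last digit, presumably truncation versus rounding.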

| Impacted Files | Coverage Δ | |
|---|---|---|
| genrl/agents/deep/base/base.py | `93.75% <ø> (ø)` | |
| genrl/agents/deep/base/onpolicy.py | `96.15% <ø> (ø)` | |
| genrl/trainers/onpolicy.py | `92.00% <ø> (ø)` | |
| genrl/agents/offline/bcq/bcq.py | `23.86% <23.86%> (ø)` | |
| genrl/trainers/offline.py | `27.77% <27.77%> (ø)` | |
| genrl/core/models.py | `33.33% <33.33%> (ø)` | |
| genrl/trainers/base.py | `81.30% <47.05%> (-6.87%)` | :arrow_down: |
| genrl/core/buffers.py | `92.94% <91.80%> (-2.30%)` | :arrow_down: |
| genrl/core/rollouts.py | `96.77% <96.77%> (ø)` | |
| genrl/agents/__init__.py | `100.00% <100.00%> (ø)` | |
... and 13 more

lgtm-com[bot] commented 3 years ago

This pull request introduces 4 alerts when merging b28c1e650ec4bc19daf7b7c04691a9d4e4a5563c into a2c8c7e137167219ea262db5b56c3197a86e05b0 - view on LGTM.com

new alerts:

3 for Unused import
1 for Signature mismatch in overriding method

lgtm-com[bot] commented 3 years ago

This pull request introduces 4 alerts when merging 3db4733211352506fa5c339a7b40f738a994aa44 into 25eb018f18a9a1d0865c16e5233a2a7ccddbfd78 - view on LGTM.com

new alerts:

3 for Unused import
1 for Signature mismatch in overriding method

lgtm-com[bot] commented 3 years ago

This pull request introduces 4 alerts when merging 43a483ee54b3bd86b7bfd1249115eb76cde9b942 into 25eb018f18a9a1d0865c16e5233a2a7ccddbfd78 - view on LGTM.com

new alerts:

3 for Unused import
1 for Signature mismatch in overriding method