This repository contains code that I reproduced for various reinforcement learning algorithms while learning RL. The code was tested on Colab.
If GitHub is not loading the Jupyter notebooks (a known GitHub issue), click here to view them on Jupyter's nbviewer.
Algorithms | Discrete | Continuous | Multithreaded | Multiprocessing | Tested on
---|---|---|---|---|---
DQN | :heavy_check_mark: | | | | CartPole-v0
Double DQN (DDQN) | :heavy_check_mark: | | | | CartPole-v0
Dueling DDQN | :heavy_check_mark: | | | | CartPole-v0
Dueling DDQN + PER | :heavy_check_mark: | | | | CartPole-v0
A3C (1) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: (3) | CartPole-v0, Pendulum-v0
DPPO (2) | | :heavy_check_mark: | | :heavy_check_mark: (3) | Pendulum-v0
RND + PPO | | :heavy_check_mark: | | | MountainCarContinuous-v0 (4), Pendulum-v0 (5)
(1): N-step returns are used for the critic's target (a minimal sketch is given below these notes).
(2): GAE is used to compute the TD(λ) return (the critic's target) and the policy's advantage (see the GAE sketch below).
(3): Distributed TensorFlow and Python's multiprocessing package are used.
(4): State featurization (approximating the feature map of an RBF kernel) is used (see the featurization sketch below).
(5): A fast-slow LSTM with an overly simplified, VAE-like "variational unit" (VU) is used.
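For reference, here is a minimal sketch of the n-step return used as the critic's target in note (1). The function name `n_step_returns` and the hyperparameters are illustrative, not the exact code in the notebooks:

```python
import numpy as np

def n_step_returns(rewards, values, bootstrap_value, gamma=0.99, n=5):
    """R_t = r_t + g*r_{t+1} + ... + g^{n-1}*r_{t+n-1} + g^n * V(s_{t+n}), g = gamma.

    rewards: rewards r_0..r_{T-1} from one rollout segment.
    values:  critic estimates V(s_0)..V(s_{T-1}) for the same states.
    bootstrap_value: V(s_T) for the state following the segment (0 if terminal).
    """
    values_ext = np.append(values, bootstrap_value)
    T = len(rewards)
    returns = np.zeros(T, dtype=np.float32)
    for t in range(T):
        horizon = min(t + n, T)       # truncate the n-step window at the segment end
        ret = values_ext[horizon]     # bootstrap from the critic
        for k in reversed(range(t, horizon)):
            ret = rewards[k] + gamma * ret
        returns[t] = ret
    return returns
```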
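Similarly, a sketch of GAE as described in note (2), returning both the advantages and the TD(λ) returns used as critic targets; episode-termination masking is omitted for brevity:

```python
import numpy as np

def gae(rewards, values, bootstrap_value, gamma=0.99, lam=0.95):
    """delta_t = r_t + gamma*V(s_{t+1}) - V(s_t);  A_t = delta_t + gamma*lam*A_{t+1};
    TD(lambda) return R_t = A_t + V(s_t)."""
    values_ext = np.append(values, bootstrap_value)
    T = len(rewards)
    advantages = np.zeros(T, dtype=np.float32)
    last_adv = 0.0
    for t in reversed(range(T)):
        delta = rewards[t] + gamma * values_ext[t + 1] - values_ext[t]
        last_adv = delta + gamma * lam * last_adv
        advantages[t] = last_adv
    td_lambda_returns = advantages + values_ext[:-1]  # critic targets
    return advantages, td_lambda_returns
```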
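And a sketch of the state featurization in note (4), assuming scikit-learn's `RBFSampler` (random Fourier features that approximate an RBF kernel) on MountainCarContinuous-v0; the bandwidths and component counts are placeholders, not necessarily what the notebook uses:

```python
import numpy as np
import gym
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import StandardScaler
from sklearn.kernel_approximation import RBFSampler

env = gym.make("MountainCarContinuous-v0")

# Fit the scaler and the random features on a sample of observations.
examples = np.array([env.observation_space.sample() for _ in range(10000)])
scaler = StandardScaler().fit(examples)

# Several RBFSamplers with different bandwidths, concatenated into one feature vector.
featurizer = FeatureUnion([
    ("rbf1", RBFSampler(gamma=5.0, n_components=100)),
    ("rbf2", RBFSampler(gamma=2.0, n_components=100)),
    ("rbf3", RBFSampler(gamma=1.0, n_components=100)),
    ("rbf4", RBFSampler(gamma=0.5, n_components=100)),
]).fit(scaler.transform(examples))

def featurize_state(state):
    """Map a raw observation to the approximate RBF feature space."""
    return featurizer.transform(scaler.transform([state]))[0]
```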
The misc folder contains related example code that I put together while learning RL. See the README.md in that folder for more details.
Check out my blog for more information on my repositories.