seungeunrho / minimalRL

Implementations of basic RL algorithms with minimal lines of code! (PyTorch based)

Add new algorithms #11

Open rahulptel opened 5 years ago

rahulptel commented 5 years ago

It would be nice to add the following algorithms:

- A3C
- Double DQN
- Dueling DQN

I will submit a PR if I finish any of them.

seungeunrho commented 5 years ago

Hi! I think A2C (the synchronous-update version of A3C) is a good choice. What about implementing RAINBOW rather than Double and Dueling DQN? The added value of standalone Double and Dueling DQN code seems marginal, because both are only small variations of DQN in terms of implementation. In contrast, a simple implementation of RAINBOW might be helpful for many people. (Actually, Dueling and Double DQN are 2 of RAINBOW's 6 components.) https://arxiv.org/abs/1710.02298
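
For context, here is a minimal sketch (not code from this repo; class and function names are made up for illustration) of why Double and Dueling DQN are small implementation deltas on top of DQN: the Dueling change only reshapes the network head, and the Double change only swaps how the bootstrap action is chosen.

```python
# Hypothetical sketch, not from minimalRL: Double DQN is a one-line change to the
# standard DQN target, and a Dueling head only restructures the last layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DuelingQnet(nn.Module):
    def __init__(self, obs_dim=4, n_actions=2):
        super().__init__()
        self.fc = nn.Linear(obs_dim, 128)
        self.value = nn.Linear(128, 1)         # state value V(s)
        self.adv = nn.Linear(128, n_actions)   # advantages A(s, a)

    def forward(self, x):
        h = F.relu(self.fc(x))
        v, a = self.value(h), self.adv(h)
        # Dueling combination: Q(s,a) = V(s) + A(s,a) - mean_a A(s,a)
        return v + a - a.mean(dim=-1, keepdim=True)

def dqn_target(q_target, r, s_prime, done_mask, gamma=0.99):
    # Standard DQN: the target network both selects and evaluates the next action
    return r + gamma * q_target(s_prime).max(dim=1, keepdim=True)[0] * done_mask

def double_dqn_target(q, q_target, r, s_prime, done_mask, gamma=0.99):
    # Double DQN: the online network selects the action, the target network evaluates it
    a_star = q(s_prime).argmax(dim=1, keepdim=True)
    return r + gamma * q_target(s_prime).gather(1, a_star) * done_mask
```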

rahulptel commented 5 years ago

Agreed. We can go with RAINBOW.

seungeunrho commented 5 years ago

Awesome!

BDEvan5 commented 4 years ago

MuZero would also be a cool algorithm. It is a bit more complicated because of the MCTS, but it works very well.

BDEvan5 commented 4 years ago

Also, thanks so much for sharing. These are great simple implementations for learning and have been very useful.

If you want to try something else, you could also try implementing them in TensorFlow.

ADGEfficiency commented 4 years ago

How about SAC?

Mahesha999 commented 3 years ago

How about Phasic Policy Gradient (PPG), since it gives better results than PPO? Also, an example of using these algorithms in a non-gaming environment, i.e. one with a list or dict as the observation instead of image frames, would be nice. I guess that would be easy, since we would just use a plain feed-forward NN instead of a CNN. Still, a simple example, maybe.
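
As a rough sketch of the non-image case (not code from this repo; the observation keys and sizes below are made up for illustration): a dict observation can be flattened into a single vector and fed to a small MLP in place of a CNN.

```python
# Hypothetical sketch: handling a dict observation by concatenating its fields
# into one flat vector and using an MLP Q-network instead of a CNN.
import torch
import torch.nn as nn
import torch.nn.functional as F

def flatten_obs(obs_dict, keys):
    # Concatenate the chosen fields, in a fixed key order, into a single 1-D tensor
    parts = [torch.as_tensor(obs_dict[k], dtype=torch.float32).flatten() for k in keys]
    return torch.cat(parts)

class MlpQnet(nn.Module):
    def __init__(self, in_dim, n_actions):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, 128)
        self.fc2 = nn.Linear(128, n_actions)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))

# Usage with a made-up observation:
obs = {"position": [0.1, -0.3], "inventory": [1.0, 0.0, 2.0]}
x = flatten_obs(obs, keys=["position", "inventory"])  # shape: (5,)
q = MlpQnet(in_dim=x.numel(), n_actions=4)
q_values = q(x.unsqueeze(0))                          # shape: (1, 4)
```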