Vanilla REINFORCE implementation

dennybritz / reinforcement-learning

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

http://www.wildml.com/2016/10/learning-reinforcement-learning/

MIT License


Vanilla REINFORCE implementation #200

Open alek5k opened 5 years ago

alek5k commented 5 years ago

Hello,

Would there be any benefit to adding a vanilla REINFORCE implementation for people trying to learn the concepts? REINFORCE with Baseline includes a value function approximator, which makes it look a lot like Actor-Critic.
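To make the difference concrete, here are the per-step updates as I understand them from Sutton and Barto's notation (with $G_t$ the return from step $t$, $\pi$ the policy with parameters $\theta$, and $\hat{v}$ the learned state-value estimate with weights $w$). This is only a sketch of the textbook form, not of what the repository code does line by line.

Vanilla REINFORCE:

$$
\theta \leftarrow \theta + \alpha \, \gamma^{t} \, G_t \, \nabla_\theta \ln \pi(A_t \mid S_t, \theta)
$$

REINFORCE with Baseline:

$$
\begin{aligned}
\delta &\leftarrow G_t - \hat{v}(S_t, w) \\
w &\leftarrow w + \alpha^{w} \, \delta \, \nabla_w \hat{v}(S_t, w) \\
\theta &\leftarrow \theta + \alpha^{\theta} \, \gamma^{t} \, \delta \, \nabla_\theta \ln \pi(A_t \mid S_t, \theta)
\end{aligned}
$$

The only change to the policy update is that $G_t$ is replaced by $\delta$. The $w$ update is the extra value function learning that the vanilla version does not need.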

I think being able to see a pure policy gradient method would be useful as a learning tool. Otherwise, people may assume that policy gradient methods always need some kind of value function approximation.
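For illustration, here is roughly what I have in mind: a minimal sketch of vanilla REINFORCE with a tabular softmax policy and no value function at all. Everything in it is made up for the example and is not taken from the repository: the tiny corridor environment, the names `run_episode`, `discounted_returns` and `reinforce`, and the hyperparameters. It uses only the standard library rather than Gym or Tensorflow so that it runs on its own.

```python
import math
import random

# Hypothetical toy environment (not from the repo): a corridor with states
# 0..3. The agent starts in state 0, state 3 is terminal, every step costs -1.
N_STATES = 4
ACTIONS = (-1, +1)   # action index 0 moves left, index 1 moves right
MAX_STEPS = 50       # cap on episode length


def softmax(prefs):
    """Turn action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_episode(theta, rng):
    """Sample one episode; return a list of (state, action_index, reward)."""
    state, trajectory = 0, []
    for _ in range(MAX_STEPS):
        probs = softmax(theta[state])
        action = 0 if rng.random() < probs[0] else 1
        trajectory.append((state, action, -1.0))
        # Moving left from state 0 leaves the agent in state 0.
        state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
        if state == N_STATES - 1:
            break
    return trajectory


def discounted_returns(rewards, gamma):
    """Compute the Monte Carlo return G_t for every time step t."""
    G, returns = 0.0, []
    for r in reversed(rewards):
        G = r + gamma * G
        returns.append(G)
    returns.reverse()
    return returns


def reinforce(num_episodes=2000, alpha=0.01, gamma=1.0, seed=0):
    """Vanilla REINFORCE: the return itself weights the gradient, no baseline."""
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in range(N_STATES)]  # one preference per (s, a)
    for _ in range(num_episodes):
        trajectory = run_episode(theta, rng)
        returns = discounted_returns([r for _, _, r in trajectory], gamma)
        for t, ((s, a, _), G_t) in enumerate(zip(trajectory, returns)):
            probs = softmax(theta[s])
            for b in range(len(ACTIONS)):
                # For a softmax policy, d/d theta[s][b] of ln pi(a|s)
                # equals 1{a == b} - pi(b|s).
                grad_log_pi = (1.0 if b == a else 0.0) - probs[b]
                theta[s][b] += alpha * (gamma ** t) * G_t * grad_log_pi
    return theta


if __name__ == "__main__":
    learned = reinforce()
    for s in range(N_STATES - 1):
        print(s, softmax(learned[s]))
```

The thing to notice is that `G_t` multiplies the gradient directly, so there is nothing to learn besides `theta`. Swapping `G_t` for `G_t - v(s)` with a learned `v` is what would turn this into the baseline version.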

makaveli10 commented 4 years ago

Look at this if you want to see the high-variance results of vanilla REINFORCE.

vieveks commented 1 year ago

Can I implement vanilla REINFORCE?