hpi-sam / rl-4-self-repair

Reinforcement Learning Models for Online Learning of Self-Repair and Self-Optimization
MIT License

Implement Policy Gradient method with Eligibility Traces #14

Open christianadriano opened 4 years ago

christianadriano commented 4 years ago

See page 333 of the Sutton and Barto book: Actor-Critic with Eligibility Traces (Continuing).

Similar code here: https://github.com/aditya1702/Machine-Learning-and-Data-Science/blob/master/Implementation%20of%20Reinforcement%20Learning%20Algorithms/Tensorflow%20Implementations/Policy%20Gradients/Actor-Critic%20Method%20-%20CliffWalking%20Env.ipynb
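
For orientation, here is a minimal tabular sketch of the continuing (average-reward) actor-critic update with eligibility traces, written from the pseudocode box in the book. It is not this repository's implementation: the two-state toy task in `step`, the step sizes, the trace-decay values, and all function names are assumptions made for illustration. The policy is a softmax over per-state action preferences and the critic is a table of state values.

```python
import numpy as np

N_STATES, N_ACTIONS = 2, 2  # hypothetical toy problem size


def step(state, action):
    """Hypothetical continuing task: action 1 pays 1, action 0 pays 0; states alternate."""
    reward = 1.0 if action == 1 else 0.0
    next_state = (state + 1) % N_STATES
    return next_state, reward


def softmax(prefs):
    """Numerically stable softmax over a 1-D preference vector."""
    e = np.exp(prefs - prefs.max())
    return e / e.sum()


def actor_critic_traces(n_steps=5000, alpha_w=0.1, alpha_theta=0.1, alpha_rbar=0.05,
                        lambda_w=0.5, lambda_theta=0.5, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(N_STATES)                   # critic: tabular state values
    theta = np.zeros((N_STATES, N_ACTIONS))  # actor: action preferences per state
    z_w = np.zeros_like(w)                   # critic eligibility trace
    z_theta = np.zeros_like(theta)           # actor eligibility trace
    avg_reward = 0.0                         # running estimate of the average reward
    state = 0

    for _ in range(n_steps):
        probs = softmax(theta[state])
        action = rng.choice(N_ACTIONS, p=probs)
        next_state, reward = step(state, action)

        # Differential TD error: no discounting, reward is measured against the average.
        delta = reward - avg_reward + w[next_state] - w[state]
        avg_reward += alpha_rbar * delta

        # Critic trace: decay, then add the gradient of v(s) (one-hot in the tabular case).
        z_w *= lambda_w
        z_w[state] += 1.0

        # Actor trace: decay, then add grad log pi(a|s) = one_hot(a) - pi(.|s) for a softmax policy.
        grad_log_pi = -probs
        grad_log_pi[action] += 1.0
        z_theta *= lambda_theta
        z_theta[state] += grad_log_pi

        # Both parameter vectors move along their traces, scaled by the same TD error.
        w += alpha_w * delta * z_w
        theta += alpha_theta * delta * z_theta
        state = next_state

    return theta, w, avg_reward


if __name__ == "__main__":
    theta, w, avg_reward = actor_critic_traces()
    for s in range(N_STATES):
        print(f"state {s}: policy = {softmax(theta[s])}")
    print("estimated average reward:", avg_reward)
```

In the toy task, action 1 is always better, so the learned policy should shift toward it in both states. For the self-repair setting, `step` would be replaced by the real environment, and the tables by whatever function approximator the model uses.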