hunkim / ReinforcementZeroToAll


Closed kkweon closed 7 years ago

kkweon commented 7 years ago

Summary

This is a basic actor-critic network
It has a few flaws:
- These are noted in the notebook.
- More advanced techniques such as TRPO or A3C will be introduced later to overcome these flaws.
This agent in action: OpenAI link
To preview this Notebook file, click here
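
For readers who cannot open the notebook, the idea behind a basic actor-critic agent can be sketched roughly as follows. This is not the code from this PR: it is a minimal NumPy illustration that assumes a linear softmax actor and a linear critic instead of a neural network, and the class name `LinearActorCritic`, its parameters, and the learning rates are all hypothetical choices made for the example.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()


class LinearActorCritic:
    """One-step actor-critic with linear function approximation (illustrative only)."""

    def __init__(self, n_features, n_actions, actor_lr=0.01, critic_lr=0.1, gamma=0.99):
        self.theta = np.zeros((n_actions, n_features))  # actor (policy) weights
        self.w = np.zeros(n_features)                   # critic (value) weights
        self.actor_lr = actor_lr
        self.critic_lr = critic_lr
        self.gamma = gamma

    def policy(self, state):
        """Action probabilities pi(a | s)."""
        return softmax(self.theta @ state)

    def value(self, state):
        """Critic's estimate of the state value V(s)."""
        return float(self.w @ state)

    def update(self, state, action, reward, next_state, done):
        """Apply one actor-critic update from a single transition; return the TD error."""
        # The TD error is the critic's one-step estimate of how much better
        # or worse the outcome was than expected.
        bootstrap = 0.0 if done else self.gamma * self.value(next_state)
        td_error = reward + bootstrap - self.value(state)

        # Evaluate the policy before changing any weights.
        probs = self.policy(state)

        # Critic: semi-gradient TD(0) step towards the bootstrapped target.
        self.w += self.critic_lr * td_error * state

        # Actor: policy-gradient step. For a linear softmax policy,
        # grad log pi(a|s) has row k equal to (1[k == a] - pi(k|s)) * state.
        grad_log_pi = -np.outer(probs, state)
        grad_log_pi[action] += state
        self.theta += self.actor_lr * td_error * grad_log_pi
        return td_error
```

In use, the agent would sample an action from `policy(state)`, step the environment, and call `update` on the resulting transition. The critic's TD error takes the place of the full Monte Carlo return that a plain policy-gradient method would use.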

hunkim commented 7 years ago

@kkweon Do you know how to read the code in a Jupyter notebook before merging?

hunkim commented 7 years ago

Wonderful code. Can we use tf.layers rather than slim?