hunkim / ReinforcementZeroToAll

add: basic actor-critic network (A2C) #11

Closed. kkweon closed this 7 years ago

kkweon commented 7 years ago

Summary

  1. This is a basic actor-critic network (A2C).
  2. It has a few flaws:
    • They are noted in the notebook.
    • More advanced techniques such as TRPO or A3C will be introduced later to overcome them.
  3. The agent in action: OpenAI link
  4. To preview this Notebook file, click here
hunkim commented 7 years ago

@kkweon Do you know how to read the code in a Jupyter notebook before merging?

hunkim commented 7 years ago

Wonderful code. Can we use tf.layers rather than slim?