Batch update for Continuous Mountain Car Actor-Critic

dennybritz / reinforcement-learning

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

MIT License


In https://github.com/dennybritz/reinforcement-learning/blob/master/PolicyGradient/Continuous%20MountainCar%20Actor%20Critic%20Solution.ipynb, I noticed that the actor and the value function are updated at every time step:

# Update the value estimator
estimator_value.update(state, td_target)

# Update the policy estimator
# using the td error as our advantage estimate
estimator_policy.update(state, td_error, action)

How can I batch the updates to the actor and the value function? The overhead of calling TensorFlow's session at every step is not small when the network is large.
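One possible approach, sketched under assumptions: buffer the per-step quantities and hand them to each estimator in a single call every N steps. The `BatchedUpdater` class, the `update_batch` method, and `RecordingEstimator` below are hypothetical names, not part of the notebook. The notebook's `update` methods take a single state, so they would need to be adapted to accept a leading batch dimension. Note also that TD targets collected this way are computed with parameters that are up to N steps stale, so this is no longer the strictly online update.

```python
class BatchedUpdater:
    """Buffers per-step training data and flushes it in one call per estimator.

    Assumes both estimators expose a hypothetical `update_batch` method that
    accepts lists (or arrays) with one entry per buffered time step.
    """

    def __init__(self, estimator_policy, estimator_value, batch_size=32):
        self.estimator_policy = estimator_policy
        self.estimator_value = estimator_value
        self.batch_size = batch_size
        self._reset()

    def _reset(self):
        # Fresh lists, so batches already handed to the estimators are not mutated.
        self.states, self.td_targets = [], []
        self.td_errors, self.actions = [], []

    def add(self, state, td_target, td_error, action):
        """Store one transition; flush automatically when the buffer is full."""
        self.states.append(state)
        self.td_targets.append(td_target)
        self.td_errors.append(td_error)
        self.actions.append(action)
        if len(self.states) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send everything buffered so far in one call per estimator."""
        if not self.states:
            return
        self.estimator_value.update_batch(self.states, self.td_targets)
        self.estimator_policy.update_batch(
            self.states, self.td_errors, self.actions
        )
        self._reset()


class RecordingEstimator:
    """Stand-in for a real estimator; it only records the batches it receives."""

    def __init__(self):
        self.calls = []

    def update_batch(self, *columns):
        self.calls.append([list(column) for column in columns])


# Ten simulated time steps with a batch size of four.
policy, value = RecordingEstimator(), RecordingEstimator()
updater = BatchedUpdater(policy, value, batch_size=4)
for t in range(10):
    updater.add(state=[float(t), 0.0], td_target=1.0, td_error=0.5, action=[0.1])
updater.flush()  # flush the partial batch left at the end of the episode
```

In the training loop, the two per-step `update` calls would be replaced by a single `updater.add(state, td_target, td_error, action)`, with `updater.flush()` at the end of each episode. In a real `update_batch`, the buffered lists would be converted to arrays and fed to one session call, so the session overhead is paid once per batch instead of once per step.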

Batch update for Continuous Mountain Car Actor-Critic #180