dennybritz / reinforcement-learning

Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.
http://www.wildml.com/2016/10/learning-reinforcement-learning/
MIT License

Batch update for Continuous Mountain Car Actor-Critic #180

Open GoingMyWay opened 5 years ago

GoingMyWay commented 5 years ago

In https://github.com/dennybritz/reinforcement-learning/blob/master/PolicyGradient/Continuous%20MountainCar%20Actor%20Critic%20Solution.ipynb, I found that the actor and the value function are updated at every time step:

# Update the value estimator
estimator_value.update(state, td_target)

# Update the policy estimator
# using the td error as our advantage estimate
estimator_policy.update(state, td_error, action)

How can I update the actor and the value function in batches instead? The overhead of calling TensorFlow's session is not small when the network is large.

sharlec commented 2 years ago

I have the same question after 4 years. Did you find the answer?
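
One possible approach, sketched below, is to buffer transitions and compute the TD targets and TD errors for the whole buffer at once, so that each estimator is updated once per batch rather than once per step. This is not the notebook's code: `BatchBuffer` and `predict_values` are hypothetical names, and the sketch assumes the estimators have been changed to accept a batch of states (for example, placeholders with a leading `None` dimension). Skipping the bootstrap on terminal transitions is also an assumption added here. Note that all values in a batch come from the parameters at flush time, so the result is not identical to the per-step online update.

```python
from typing import Callable, List, Sequence, Tuple

# (state, action, reward, next_state, done)
Transition = Tuple[Sequence[float], float, float, Sequence[float], bool]


class BatchBuffer:
    """Hypothetical helper: collects transitions for one batched update."""

    def __init__(self, batch_size: int) -> None:
        self.batch_size = batch_size
        self.transitions: List[Transition] = []

    def add(self, state, action, reward, next_state, done) -> bool:
        """Store one transition; return True once a full batch is ready."""
        self.transitions.append((state, action, reward, next_state, done))
        return len(self.transitions) >= self.batch_size

    def flush(
        self,
        predict_values: Callable[[List[Sequence[float]]], List[float]],
        discount_factor: float = 1.0,
    ):
        """Return (states, actions, td_targets, td_errors) and empty the buffer.

        predict_values is assumed to map a list of states to a list of
        value estimates in ONE call (e.g. a single session run).
        """
        states = [t[0] for t in self.transitions]
        actions = [t[1] for t in self.transitions]
        next_states = [t[3] for t in self.transitions]

        # Two batched forward passes instead of two per time step.
        values = predict_values(states)
        next_values = predict_values(next_states)

        td_targets, td_errors = [], []
        for (_, _, reward, _, done), v, nv in zip(
            self.transitions, values, next_values
        ):
            # Assumption: terminal transitions do not bootstrap.
            target = reward + (0.0 if done else discount_factor * nv)
            td_targets.append(target)
            td_errors.append(target - v)

        self.transitions = []
        return states, actions, td_targets, td_errors


# Inside the episode loop (estimators assumed to accept batches):
#
#   if buffer.add(state, action, reward, next_state, done) or done:
#       states, actions, targets, errors = buffer.flush(batch_predict)
#       estimator_value.update(states, targets)
#       estimator_policy.update(states, errors, actions)
```

With a batch size of N this cuts the number of update calls per estimator by roughly a factor of N, at the cost of slightly staler value estimates within each batch.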