pat-coady/trpo
Trust Region Policy Optimization with TensorFlow and OpenAI Gym
https://learningai.io/projects/2017/07/28/ai-gym-workout.html
MIT License
360 stars, 106 forks
Temporal difference error in value estimates not calculated.
#26 (Closed)
ghost, closed 5 years ago