Closed by thiagopbueno 5 years ago
Build ops for implementing a step of gradient descent over the Policy parameters given a TensorFlow optimizer and a batch of experiences.
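
A rough sketch of what such ops might look like, not the project's actual implementation. It assumes a linear softmax policy over discrete actions and a REINFORCE-style surrogate loss. The names `build_train_op` and `run_demo`, the `"policy"` variable scope, and the toy batch are all hypothetical and chosen only for illustration. It uses the graph-mode API through `tf.compat.v1` so that it also runs on TensorFlow 2.x.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # graph-mode (TF1-style) API


def build_train_op(optimizer, states, actions, returns, n_actions):
    """Hypothetical helper: build the loss and one gradient-descent update op."""
    state_dim = states.shape[-1]
    # Policy parameters: one linear layer mapping states to action logits (assumption).
    with tf1.variable_scope("policy"):
        w = tf1.get_variable("w", shape=[state_dim, n_actions],
                             initializer=tf1.zeros_initializer())
        b = tf1.get_variable("b", shape=[n_actions],
                             initializer=tf1.zeros_initializer())
    logits = tf.matmul(states, w) + b
    # Cross-entropy against the taken actions equals -log pi(a|s).
    neg_log_prob = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=actions, logits=logits)
    # Surrogate loss (assumed): return-weighted negative log-likelihood.
    loss = tf.reduce_mean(neg_log_prob * returns)
    # Restrict the update to the Policy parameters only.
    policy_vars = tf1.get_collection(
        tf1.GraphKeys.TRAINABLE_VARIABLES, scope="policy")
    train_op = optimizer.minimize(loss, var_list=policy_vars)
    return loss, train_op


def run_demo():
    """Run one update on a toy batch; return the loss before and after."""
    with tf.Graph().as_default():
        states = tf1.placeholder(tf.float32, shape=[None, 4], name="states")
        actions = tf1.placeholder(tf.int32, shape=[None], name="actions")
        returns = tf1.placeholder(tf.float32, shape=[None], name="returns")
        optimizer = tf1.train.GradientDescentOptimizer(learning_rate=0.1)
        loss, train_op = build_train_op(
            optimizer, states, actions, returns, n_actions=2)
        # Toy batch of experiences (made-up values).
        batch = {
            states: np.eye(4, dtype=np.float32),
            actions: np.array([0, 1, 0, 1], dtype=np.int32),
            returns: np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32),
        }
        with tf1.Session() as sess:
            sess.run(tf1.global_variables_initializer())
            loss_before = sess.run(loss, feed_dict=batch)
            sess.run(train_op, feed_dict=batch)  # one gradient-descent step
            loss_after = sess.run(loss, feed_dict=batch)
    return float(loss_before), float(loss_after)
```

Passing the optimizer in as an argument keeps the update rule swappable (for example `tf1.train.AdamOptimizer`), and `var_list` keeps the step from touching any non-policy variables that may live in the same graph.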
Closed after #60