Closed pxlong closed 7 years ago
it's just a nuance of the `tf.gradients` API, which always returns a list, even when you're only asking for a single value...
from the docs
gradients(ys, xs) adds ops to the graph to output the partial derivatives of ys with respect to xs. It returns a list of Tensor of length len(xs) where each tensor is the sum(dy/dx) for y in ys.
@matpalm but I suppose we do need a list as the self.input_action (and self.q_value) is a batch of actions sampled from replay buffer. Why do we only ask for a single value? Sorry for bothering.
i think you're confusing two things...
`xs` is a single element. i'm calling `tf.gradients` with one `xs` value, `input_action`, so the return value is a list of one tensor.
the `[0]` in my code is referencing the first element of this returned list; it has nothing to do with the shape of the returned tensor.
have a look at the doc again.
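To make the list-versus-batch distinction concrete, here is a minimal pure-Python sketch. It is not TensorFlow: `gradients` below is a hypothetical stand-in that only mimics the documented contract (one returned entry per element of `xs`), using finite differences, and `q_value` is a made-up toy critic. The point is only that the length of the returned list follows `len(xs)`, while the batch dimension lives inside each returned entry.

```python
# Stand-in for the list-returning contract of tf.gradients; NOT TensorFlow.
# Every name here (q_value, gradients, state, action) is hypothetical.

def q_value(state, action):
    """Toy critic: Q(s, a) = -sum_j (a_j - s_j)^2, one value per batch row."""
    return [-sum((a - s) ** 2 for a, s in zip(a_row, s_row))
            for s_row, a_row in zip(state, action)]

def gradients(f, xs, eps=1e-6):
    """Return a list of length len(xs). Entry i is d(sum of f(*xs)) / d(xs[i]),
    shaped like xs[i] (batch x dims), via central finite differences."""
    grads = []
    for i, x in enumerate(xs):
        grad = [[0.0] * len(row) for row in x]
        for r, row in enumerate(x):
            for c in range(len(row)):
                def total(v):
                    # Copy x, replace one element, and re-evaluate sum(f).
                    bumped = [list(rr) for rr in x]
                    bumped[r][c] = v
                    args = list(xs)
                    args[i] = bumped
                    return sum(f(*args))
                grad[r][c] = (total(row[c] + eps) - total(row[c] - eps)) / (2 * eps)
        grads.append(grad)
    return grads

state = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]   # batch of 3 states
action = [[1.0, 0.0], [1.0, 2.0], [0.0, 2.0]]  # batch of 3 actions

# One xs entry (the whole action batch) gives a list holding exactly one gradient.
result = gradients(lambda a: q_value(state, a), [action])
q_grad_wrt_action = result[0]  # still has one row per action in the batch

# Two xs entries give a list of two gradients, one per input.
both = gradients(q_value, [state, action])
```

Here `result` has length 1 because only one input was passed, and `result[0]` keeps the full batch shape (3 rows of 2 values), which is the role the `[0]` plays in the thread above.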
Hi,
I just read through your DDPG implementation, and it looks awesome. Thanks for sharing!
Currently, I'm confused about the q_gradients_wrt_actions function: why do we add [0] after the returned gradients, given that we use a batch of actions to compute the gradients?
Thank you so much.