Closed zcchenvy closed 5 years ago
According to the DDPG paper (page 5), the actor needs to be updated by just the plain q_gradient_batch
. It should be added to the actor parameters.
But since the actor train function (see here) from TensorFlow, by default, applies gradient descent, any gradient given to the optimizer is subtracted. That's why I flipped the gradient, so that the end effect is still an addition.
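To make the sign flip concrete, here is a minimal sketch (assumed setup, not the repo's actual code) showing why negating the gradient turns a descent-style optimizer into the ascent step DDPG needs:

```python
import numpy as np

def sgd_step(theta, grad, lr=0.1):
    # A plain descent optimizer: it always SUBTRACTS the supplied gradient.
    return theta - lr * grad

# Toy actor parameter and a toy dQ/dtheta (stand-in for q_gradient_batch).
theta = np.array([1.0])
q_gradient_batch = np.array([2.0])

# DDPG wants gradient ASCENT on Q: theta <- theta + lr * dQ/dtheta.
# Feeding the negated gradient to a descent optimizer achieves exactly that.
theta_new = sgd_step(theta, -q_gradient_batch)
# theta_new is theta + lr * q_gradient_batch, i.e. the parameter increased.
```

The same idea applies with TensorFlow's optimizers: `apply_gradients` subtracts, so passing `-q_gradient_batch` yields the additive update from the paper.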
thank you
line 103: Why change the direction of the gradient? I think this step is not required. I do not really understand your meaning.