Closed: random-user-x closed this issue 6 years ago
Hello @Kaixhin, regarding https://github.com/Kaixhin/ACER/blob/f22b07cebd9ec278c5b604b2652e6657df4b61ab/train.py#L97: I think we should freeze the value of z_star_p by using z_star_p.detach(). The ACER paper states:
In the second stage, we take advantage of back-propagation. Specifically, the updated gradient with respect to φ_θ, that is z*, is back-propagated through the network to compute the derivatives with respect to the parameters.
Please let me know what you think.
Ah yes, the gradients are probably leaking through z_star_p; I think you are right on this one.
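For reference, here is a minimal PyTorch sketch of the trust-region projection under discussion. This is not the repository's actual code; g, k, phi, and delta are hypothetical stand-ins for the loss gradient, the KL gradient, the policy statistics, and the trust-region threshold computed earlier in train.py:

```python
import torch

def trust_region_grad(g, k, delta):
    # ACER trust-region solution: z* = g - max(0, (k^T g - delta) / ||k||^2) * k
    scale = ((k * g).sum() - delta) / (k * k).sum().clamp(min=1e-8)
    z_star = g - scale.clamp(min=0) * k
    # detach() freezes z*: without it, gradients would leak back through
    # the graph that produced g and k when z* is later back-propagated
    return z_star.detach()

# Hypothetical usage: back-propagate z* through the network via the policy
# statistics phi, as in the quoted passage of the paper
# phi.backward(gradient=-trust_region_grad(g, k, delta))
```

With the detach() in place, z* enters the backward pass purely as a constant gradient, matching the two-stage scheme described in the quoted passage.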