Open speedcell4 opened 6 years ago
Could you give an example of what you would like the API for this to look like?
Sorry, to be a little bit more clear, would you expect to do this directly before calling the update function of the trainer? I think that seems like something we could do.
Yes, that is what I want. Here is one example:
import dynet as dy

pc = dy.Model()
trainer = dy.SimpleSGDTrainer(pc)
W = pc.add_parameters((1, 4))

# get a copy of W
W1 = pc.add_parameters((1, 4))
W1.set_value(W.as_array())
# get another copy
W2 = pc.add_parameters((1, 4))
W2.set_value(W.as_array())

dy.renew_cg()
batch1 = ...  # prepare some data
batch2 = ...  # prepare some other data
loss1 = compute_loss(W1, batch1)
loss2 = compute_loss(W2, batch2)
loss1.backward()
loss2.backward()

# set the averaged gradients on W; this set_grad function is what I want
W.set_grad((W1.grad_as_array() + W2.grad_as_array()) / 2)
W1.scale_gradient(0)  # clear W1's gradient
W2.scale_gradient(0)  # clear W2's gradient

# update only W with the averaged gradients
trainer.update()
You may say this can be achieved by ((loss1 + loss2) / 2).backward(), and yes, it can, but imagine this is needed in more complex situations, e.g. asynchronous workers.
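For readers without DyNet installed, the averaged-gradient pattern above can be sketched in plain NumPy. This is only an illustration of the requested semantics: the toy loss, shapes, and learning rate are assumptions, and none of this is DyNet API.

```python
import numpy as np

def compute_grad(W, batch):
    # gradient of a toy squared-error loss 0.5 * ||W @ x - y||^2 w.r.t. W
    x, y = batch
    return np.outer(W @ x - y, x)

rng = np.random.default_rng(0)
W = rng.standard_normal((1, 4))

# two workers each hold a copy of W and their own batch
batch1 = (rng.standard_normal(4), np.array([1.0]))
batch2 = (rng.standard_normal(4), np.array([-1.0]))
g1 = compute_grad(W.copy(), batch1)
g2 = compute_grad(W.copy(), batch2)

# average the workers' gradients and apply one SGD step to the shared W;
# this single step is what Parameter.set_grad + trainer.update() would enable
lr = 0.1
W0 = W.copy()
W -= lr * (g1 + g2) / 2
```

The point of the pattern is that the shared parameter takes one step with the averaged gradient, which (for plain SGD) equals averaging the two individually stepped copies.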
Thanks, we'll mark this as an enhancement.
To implement reinforcement learning algorithms like A3C, directly setting the gradient of Parameters and LookupParameters will be necessary, e.g. Parameters.set_grad(self, array). Further, users would be able to implement their own operations without touching the C++ source code if the gradient of an Expression could also be reassigned, just like what Chainer can do. Would you please bring in these things?
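To illustrate the second request, here is roughly what a custom operation with a user-supplied backward pass looks like when written in pure NumPy, in the spirit of Chainer's Function interface. The MulConstant class and its forward/backward methods are hypothetical, not DyNet or Chainer API.

```python
import numpy as np

class MulConstant:
    """Toy custom op y = c * x whose gradient rule is defined
    entirely in Python, with no C++ changes required."""
    def __init__(self, c):
        self.c = c

    def forward(self, x):
        self.x = x
        return self.c * x

    def backward(self, grad_output):
        # the user supplies the gradient rule: dy/dx = c
        return self.c * grad_output

op = MulConstant(3.0)
x = np.array([1.0, -2.0])
y = op.forward(x)                      # -> [3.0, -6.0]
grad_x = op.backward(np.ones_like(y))  # -> [3.0, 3.0]
```

Letting users assign an Expression's gradient in DyNet would make this style of pure-Python operation possible there as well.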