Thrandis closed 9 years ago
@dmitriy-serdyuk do you want something else for this ccw?
@dmitriy-serdyuk like this?
Yes, looks nice. Can you add a note that it applies only to 2D weights and won't work with convolutions, for example?
Also, IIRC this is meant for a particular type of activation function, and there is a separate derivation to be made for others?
@dmitriy-serdyuk all right! @dwf ahhh yeah you're right as well! I think it was for tanh, but I will re-read the paper quickly!
@dwf For ReLUs, there is a derivation for convolution layers in this paper: http://arxiv.org/pdf/1502.01852v1.pdf I don't know how much it is used in practice! What do you think?
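For reference, a rough sketch of what the ReLU variant from that paper might look like for both 2D and convolutional weights. The function names and the shape layouts are assumptions made for illustration, not anything in this pull request. The only part taken from the paper is the zero-mean Gaussian with standard deviation sqrt(2 / fan_in).

```python
import math
import random


def fan_in(shape):
    """Number of inputs feeding one output unit.

    Assumed layouts (hypothetical, for illustration only):
    (n_in, n_out) for a 2D weight matrix and
    (n_filters, n_channels, height, width) for a convolution kernel.
    """
    if len(shape) == 2:
        return shape[0]
    if len(shape) == 4:
        return shape[1] * shape[2] * shape[3]
    raise ValueError("unsupported weight shape: %r" % (shape,))


def relu_init_std(shape):
    """Standard deviation sqrt(2 / fan_in) from the ReLU derivation."""
    return math.sqrt(2.0 / fan_in(shape))


def sample_weights(shape, rng):
    """Draw a flat list of zero-mean Gaussian weights for `shape`."""
    std = relu_init_std(shape)
    size = 1
    for dim in shape:
        size *= dim
    return [rng.gauss(0.0, std) for _ in range(size)]
```

Something like `sample_weights((64, 3, 3, 3), random.Random(0))` would then cover the convolution case that a 2D-only scheme cannot handle, since the fan-in accounts for the kernel's spatial extent.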
It's very new, but it'd be nice to have -- no need to do it in this pull request though.
I think there's a derivation floating around for logistic units as well.
@dmitriy-serdyuk I think #14 must be merged before mine, in order to fix the Scrutinizer.
It's in!
@dwf great, I'll update my code then!
Got a bug on my own computer, so I'm checking if the problem is on my side.