mila-iqia / blocks-extras

A collection of extensions to the Blocks framework
MIT License
27 stars 40 forks

Added normalized initialization. #12

Closed Thrandis closed 9 years ago

Thrandis commented 9 years ago

Got a bug on my own computer, so I'm checking if the problem is on my side.

Thrandis commented 9 years ago

@dmitriy-serdyuk do you want something else for this ccw?

Thrandis commented 9 years ago

@dmitriy-serdyuk like this?

dmitriy-serdyuk commented 9 years ago

Yes, looks nice. Can you note that it can only be applied to 2D weights and won't work with convolutions, for example?
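For context, a minimal NumPy sketch of what "normalized initialization" (Glorot & Bengio, 2010) computes for a 2D weight matrix — the function name and signature here are illustrative, not the PR's actual code:

```python
import numpy as np

def glorot_normalized(fan_in, fan_out, rng=None):
    """Sample a (fan_in, fan_out) weight matrix uniformly from
    [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)),
    as in Glorot & Bengio (2010)."""
    rng = rng or np.random.default_rng()
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    # Only defined for 2D weights: fan-in and fan-out are ambiguous
    # for e.g. 4D convolution kernels, hence the restriction above.
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))
```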

dwf commented 9 years ago

Also, IIRC this is meant for a particular type of activation function, and there is a separate derivation to be made for others?

Thrandis commented 9 years ago

@dmitriy-serdyuk all right! @dwf ahhh yeah you're right as well! I think it was for tanh, but I will re-read the paper quickly!

Thrandis commented 9 years ago

@dwf For ReLUs, this paper derives the initialization for convolutional layers: http://arxiv.org/pdf/1502.01852v1.pdf I don't know how much it is used in practice! What do you think?
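For reference, a sketch of the ReLU rule from that paper (He et al., 2015) applied to a conv kernel — names and argument order are illustrative, not anything from this repo:

```python
import numpy as np

def he_init_conv(num_filters, channels, kh, kw, rng=None):
    """He et al. (2015): sample from N(0, sqrt(2 / fan_in)),
    where fan_in = channels * kh * kw for a convolution kernel."""
    rng = rng or np.random.default_rng()
    fan_in = channels * kh * kw
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=(num_filters, channels, kh, kw))
```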

dwf commented 9 years ago

It's very new, but it'd be nice to have -- no need to do it in this pull request though.

I think there's a derivation floating around for logistic units as well.
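The adaptation commonly cited for logistic units scales the tanh-derived Glorot limit by 4, using sigmoid(x) = (1 + tanh(x/2)) / 2; a sketch under that assumption, with illustrative names:

```python
import numpy as np

def glorot_sigmoid(fan_in, fan_out, rng=None):
    """Glorot-style uniform init adapted for logistic sigmoid units:
    the tanh limit sqrt(6 / (fan_in + fan_out)) scaled by 4."""
    rng = rng or np.random.default_rng()
    limit = 4.0 * np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))
```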

Thrandis commented 9 years ago

@dmitriy-serdyuk I think #14 must be merged before mine to fix the Scrutinizer build.

dwf commented 9 years ago

It's in!

Thrandis commented 9 years ago

@dwf great, I'll update my code then!