p-w-rs opened this issue 1 year ago (Open)
Could probably use the existing truncated_normal with the standard deviation set according to the fan-in.
would be interested in solving this if its still open. Please elaborate if it is.
In src/utils.jl we already have a Flux.truncated_normal initializer function that accepts a custom standard deviation. A PR would add a new initializer, Flux.lecun_normal, that just calls the existing truncated_normal with a standard deviation calculated as mentioned above using Flux.nfan, in the same src/utils.jl file. A sketch of this is below.
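For concreteness, here is a minimal sketch of what such an initializer could look like, assuming the current Flux.truncated_normal keyword API and the (fan_in, fan_out) tuple returned by Flux.nfan; the name lecun_normal and the exact signature are just this proposal, not existing Flux code:

```julia
using Random
using Flux: nfan, truncated_normal

# Proposed initializer (hypothetical, not yet in Flux): LeCun normal draws
# from a truncated normal with std = sqrt(1 / fan_in). nfan returns
# (fan_in, fan_out) for the given dimensions, so take the first element.
function lecun_normal(rng::AbstractRNG, dims::Integer...)
    fan_in = first(nfan(dims...))
    return truncated_normal(rng, dims...; std = sqrt(1 / fan_in))
end
lecun_normal(dims::Integer...) = lecun_normal(Random.default_rng(), dims...)
```

One wrinkle worth raising in the PR: truncating at ±2 standard deviations makes the realized standard deviation slightly smaller than the requested one, which is why Keras divides the requested stddev by ≈0.8796 to compensate. Whether the Flux version should apply the same correction would be a design decision for the PR.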
Hi @vortex73, would you be working on this?
@chiral-carbon Please go ahead and open a PR if you are willing to tackle this.
@darsnack thanks, will open a PR soon
@RohitRathore1, were you working on this? I had a PR in the works but will stop if you are. cc @darsnack
Hi @chiral-carbon, @darsnack. Is this issue still unclaimed? Can I start?
@Bhavay-2001 I had claimed this, but a PR was opened soon after by someone else, so I'm not sure about the status. If this issue opens up again for a new PR, I would like to work on it.
Hi @chiral-carbon, I don't know. I opened a PR but haven't received any comments on it, and when I check the logs of one of the GitHub Actions runs, the logs are no longer available. I will have to review it again.
I don't recall why that PR didn't get comments. Maybe someone was waiting for tests to be in place? Anyhow, that would be my feedback now. We can continue on the PR thread :)
Motivation and description
LeCun normal initialization is needed (as far as I understand) to properly build self-normalizing neural networks.
Since Flux already provides the selu activation function and alpha dropout (AlphaDropout), it would be nice to have LeCun normal built in as well.
Possible Implementation
Draw samples from a truncated normal distribution centered on 0 with stddev = sqrt(1 / fan_in), where fan_in is the number of input units in the weight tensor. (This is from the TensorFlow documentation.)
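To make the motivation above concrete, a hypothetical usage sketch of the self-normalizing-network recipe (assuming the lecun_normal definition sketched earlier; selu and AlphaDropout are existing Flux exports) would look like:

```julia
using Flux

# Hypothetical: pairs LeCun-normal weight initialization with SELU
# activations and AlphaDropout, the standard self-normalizing setup.
model = Chain(
    Dense(784 => 256, selu; init = lecun_normal),
    AlphaDropout(0.2),
    Dense(256 => 10; init = lecun_normal),
)
```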