p-w-rs opened this issue 1 year ago (Open)
Could probably use the existing truncated_normal with the standard deviation set according to the fan-in.
would be interested in solving this if its still open. Please elaborate if it is.
In src/utils.jl we already have a Flux.truncated_normal initializer function that accepts a custom standard deviation. A PR would add a new initializer, Flux.lecun_normal, that just calls the existing truncated_normal with a standard deviation calculated as mentioned above using Flux.nfan, in the same src/utils.jl file. A sketch of this is below.
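For concreteness, here is a minimal sketch of what such an initializer could look like, assuming the current Flux.truncated_normal keyword API and the (fan_in, fan_out) tuple returned by Flux.nfan; the name lecun_normal and the exact signature are just this proposal, not existing Flux code:

```julia
using Random
using Flux: nfan, truncated_normal

# Proposed initializer (hypothetical, not yet in Flux): LeCun normal draws
# from a truncated normal with std = sqrt(1 / fan_in). nfan returns
# (fan_in, fan_out) for the given dimensions, so take the first element.
function lecun_normal(rng::AbstractRNG, dims::Integer...)
    fan_in = first(nfan(dims...))
    return truncated_normal(rng, dims...; std = sqrt(1 / fan_in))
end
lecun_normal(dims::Integer...) = lecun_normal(Random.default_rng(), dims...)
```

One wrinkle worth raising in the PR: truncating at ±2 standard deviations makes the realized standard deviation slightly smaller than the requested one, which is why Keras divides the requested stddev by ≈0.8796 to compensate. Whether the Flux version should apply the same correction would be a design decision for the PR.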
Hi @vortex73, would you be working on this?
@chiral-carbon Please go ahead and open a PR if you are willing to tackle this.
@darsnack thanks, will open a PR soon
@RohitRathore1, were you working on this? I had a PR in the works but will stop if you are. cc @darsnack
Hi @chiral-carbon, @darsnack. Is this issue still unclaimed? Can I start?
@Bhavay-2001 I had claimed this, but a PR was opened soon after by someone else, so I'm not sure about the status. If this issue opens up again for a new PR, I would like to work on it.
Hi @chiral-carbon, I don't know. I opened a PR but haven't received any comments on it, and when I check the logs of one of the GitHub Actions runs, the logs are no longer available. I will have to review it again.
I don't recall why that PR didn't get comments. Maybe someone was waiting for tests to be in place? Anyhow, that would be my feedback now. We can continue on the PR thread :)
Motivation and description
LeCun normal initialization is needed (as far as I understand) to properly build self-normalizing neural networks.
Since Flux already provides the selu activation function and alpha dropout (AlphaDropout), it would be nice to have LeCun normal built in as well.
Possible Implementation
Draw samples from a truncated normal distribution centered on 0 with stddev = sqrt(1 / fan_in), where fan_in is the number of input units in the weight tensor. (This is from the TensorFlow documentation.)
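To make the motivation above concrete, a hypothetical usage sketch of the self-normalizing-network recipe (assuming the lecun_normal definition sketched earlier; selu and AlphaDropout are existing Flux exports) would look like:

```julia
using Flux

# Hypothetical: pairs LeCun-normal weight initialization with SELU
# activations and AlphaDropout, the standard self-normalizing setup.
model = Chain(
    Dense(784 => 256, selu; init = lecun_normal),
    AlphaDropout(0.2),
    Dense(256 => 10; init = lecun_normal),
)
```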