rinuboney / ladder

Ladder network is a deep learning algorithm that combines supervised and unsupervised learning.
MIT License

Intermediate layers do not make use of gamma parameters #13

Closed nlml closed 7 years ago

nlml commented 7 years ago

Hi there,

First of all, thanks for this implementation; I have been making good use of it in my efforts to study the ladder network.

One thing I noticed: in line 130 of the current ladder.py, you have h = tf.nn.relu(z + weights["beta"][l-1])

Shouldn't this be h = tf.nn.relu(weights['gamma'][l-1] * (z + weights["beta"][l-1]))? (As per equation 10 in the paper.)
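
For reference, my reading of equation 10 is the usual batch-norm scale-and-shift applied before the activation (reconstructed here from memory, so worth double-checking against the paper):

```latex
h^{(l)} = \phi\!\left( \gamma^{(l)} \odot \left( z^{(l)} + \beta^{(l)} \right) \right)
```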

Regards, Liam

rinuboney commented 7 years ago

Good to hear that you found it useful! As mentioned on page 6 of the paper, Semi-Supervised Learning with Ladder Networks (https://arxiv.org/abs/1507.02672), the beta and gamma parameters of batch normalization are redundant for some activation functions. For the ReLU activation function, the scaling parameter gamma is redundant.
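
To see why: for gamma > 0, relu(gamma * x) = gamma * relu(x), so a scale applied before a ReLU can be absorbed into the next layer's weights. A minimal standalone sketch in plain NumPy (the names z, beta, gamma and W_next are illustrative, not taken from ladder.py):

```python
import numpy as np

rng = np.random.default_rng(0)

z = rng.standard_normal((4, 3))          # batch-normalized pre-activations
beta = rng.standard_normal(3)            # learned shift
gamma = np.abs(rng.standard_normal(3))   # learned scale, kept positive here
W_next = rng.standard_normal((3, 2))     # next layer's weight matrix

relu = lambda x: np.maximum(x, 0.0)

# With an explicit gamma before the ReLU ...
with_gamma = relu(gamma * (z + beta)) @ W_next

# ... the same mapping is obtained without gamma by folding it into the
# next layer's weights (ReLU is positively homogeneous: relu(c*x) = c*relu(x)).
without_gamma = relu(z + beta) @ (gamma[:, None] * W_next)

print(np.allclose(with_gamma, without_gamma))  # True
```

So dropping gamma in the intermediate layers does not reduce the expressive power of the network; the same effect can be achieved through the following layer's weights.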

nlml commented 7 years ago

Ah okay, makes sense. Thanks!

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/rinuboney/ladder/issues/13#issuecomment-289537896, or mute the thread https://github.com/notifications/unsubscribe-auth/AMOFkV-X6_iSxSjKTzy-QtqbwytslMd_ks5rp_x2gaJpZM4MqJLJ .