Good to hear that you found it useful! As mentioned on page 6 of the paper, Semi-Supervised Learning with Ladder Networks (https://arxiv.org/abs/1507.02672), the beta and gamma parameters of batch normalization are redundant for some activation functions. For the ReLU activation function, the scaling parameter gamma is redundant.
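A quick sketch of why that is (just an illustration in NumPy, not code from ladder.py): for a positive per-unit scale gamma, relu(gamma * x) == gamma * relu(x), so the scale can be absorbed into the next layer's weights and adds no expressive power before a ReLU.

```python
# Minimal sketch, assuming a hypothetical next-layer weight matrix W_next.
import numpy as np

rng = np.random.default_rng(0)
z_plus_beta = rng.normal(size=(4, 8))        # normalized pre-activation plus beta
gamma = rng.uniform(0.5, 2.0, size=(8,))     # positive per-unit scale
W_next = rng.normal(size=(8, 3))             # hypothetical next-layer weights

relu = lambda x: np.maximum(x, 0.0)

# Scaling before the ReLU ...
out_scaled = relu(gamma * z_plus_beta) @ W_next
# ... is equivalent to folding the scale into the next layer's weights.
out_folded = relu(z_plus_beta) @ (gamma[:, None] * W_next)

assert np.allclose(out_scaled, out_folded)
```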
Ah okay, makes sense. Thanks!
Hi there,
First of all thanks for this implementation, I have been making good use of it in my efforts to study the ladder network.
One thing I noticed: on line 130 of the current ladder.py, you have:
h = tf.nn.relu(z + weights["beta"][l-1])
shouldn't this be:
h = tf.nn.relu(weights["gamma"][l-1] * (z + weights["beta"][l-1]))
? (As per equation 10 here.)

Regards, Liam
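For reference, the encoder step I mean is (as I read the paper, with φ the activation function and ⊙ elementwise multiplication):

h^(l) = φ( γ^(l) ⊙ ( z^(l) + β^(l) ) )

i.e. both the shift β and the scale γ are applied before the nonlinearity.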