openai / weightnorm

Example code for Weight Normalization, from "Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks"
https://arxiv.org/abs/1602.07868
MIT License

Hope to release weightnorm code that can be used with Keras on the Theano backend #1

Open jf003320018 opened 7 years ago

jf003320018 commented 7 years ago

It seems that the weightnorm code for Keras uses the TensorFlow backend. I tried to modify it into Theano-backend-based code, but failed. So I hope you will release weightnorm code that can be used with Keras on the Theano backend. Thank you.

redst4r commented 7 years ago

Does just changing the existing tf.XYZ() calls into the corresponding keras.backend calls not work?
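
For illustration, a minimal sketch of the kind of substitution meant here, assuming the usual keras.backend alias; the variable x and the per-column norm are just placeholders, not names from this repo:

```python
from keras import backend as K
import numpy as np

# A few TensorFlow-only calls and their backend-agnostic keras.backend equivalents:
#   tf.square(x)            ->  K.square(x)
#   tf.sqrt(x)              ->  K.sqrt(x)
#   tf.reduce_sum(x, [0])   ->  K.sum(x, axis=0)
#   tf.reduce_mean(x, [0])  ->  K.mean(x, axis=0)
#   tf.reshape(x, [1, -1])  ->  K.reshape(x, (1, -1))

x = K.variable(np.random.randn(4, 3).astype('float32'))
col_norms = K.sqrt(K.sum(K.square(x), axis=0))  # per-column L2 norm, as weight norm needs
print(K.eval(col_norms))                        # runs on either backend
```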

jf003320018 commented 7 years ago

Did you try it? I changed the tf.XXX functions to K.XXX functions for the Theano backend by looking at their meanings, but tf.get_variable_shape() and K.get_variable_shape() with the Theano backend behave differently, so it raises an error.
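
One possible workaround for the shape call, assuming the parameters are ordinary Keras weight variables (the helper name variable_shape is made up for this example): read the concrete shape from the variable's value instead of relying on the backend-specific shape helper.

```python
from keras import backend as K
import numpy as np

def variable_shape(p):
    # Backend-agnostic: pull the variable's current value and read its shape.
    # Only valid for variables that actually hold a value (layer weights do),
    # not for purely symbolic tensors.
    return K.get_value(p).shape

w = K.variable(np.zeros((3, 3, 16, 32), dtype='float32'))
print(variable_shape(w))  # (3, 3, 16, 32) on both TensorFlow and Theano
```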

engharat commented 7 years ago

I'm trying to modify the code for the Theano backend too, but I've run into several obstacles. jf003320018, did you manage to convert the code? It is not (or not only) a matter of different function meanings, since Keras implements a common set of backend functions for both TensorFlow and Theano. I think the problem lies in the function def get_weightnorm_params_and_grads(p, g). The first fix to make is changing V_scaler_shape = (ps[-1],) to V_scaler_shape = (ps[0],), because of the different tensor-ordering conventions. Still, that fix isn't enough, because of some tensor manipulation in the rest of the function. I tried putting [0] instead of [-1] in the other lines of code, but it doesn't work, and I don't understand in detail how those lines operate.
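
For what it's worth, here is a rough, untested sketch of how that function might look under the Theano ordering convention (output filters on the first axis instead of the last), written purely against keras.backend. The quantities it computes follow the weight-normalization reparameterization W = g · V/||V|| from the paper; the axis handling (norm_axes and the broadcast reshape) and the exact return values are my assumptions, not the repository's code:

```python
from keras import backend as K

def get_weightnorm_params_and_grads_th(p, g):
    """Hypothetical Theano-ordering variant: p is a weight variable W,
    g is the gradient of the loss w.r.t. W."""
    ps = K.get_value(p).shape          # concrete shape, works on both backends

    # one scaler per output unit/filter: V_scaler = g_param / ||V||
    V_scaler_shape = (ps[0],)          # first axis here, not ps[-1] as in the TF version
    V_scaler = K.ones(V_scaler_shape)  # init to 1 so the effective weights are unchanged

    # norms are taken over every axis except the output axis
    norm_axes = list(range(1, len(ps)))
    bcast = (-1,) + (1,) * (len(ps) - 1)   # scaler varies along axis 0 only

    # recover V and the g parameter from W = V_scaler * V
    V = p / K.reshape(V_scaler, bcast)
    V_norm = K.sqrt(K.sum(K.square(V), axis=norm_axes))
    g_param = V_scaler * V_norm

    # rewrite the gradient w.r.t. W into gradients w.r.t. g_param and V
    grad_g = K.sum(g * V, axis=norm_axes) / V_norm
    grad_V = K.reshape(V_scaler, bcast) * (
        g - K.reshape(grad_g / V_norm, bcast) * V)

    return V, V_norm, V_scaler, g_param, grad_g, grad_V
```

If this reading is right, the only real change from the TensorFlow version is which axis is treated as the output dimension; everything that is reduced over (the norm, grad_g) and everything that is broadcast (V_scaler, grad_g / V_norm) has to move with that axis, which is why swapping [-1] for [0] in isolation isn't enough.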