Closed zaidalyafeai closed 4 years ago
The same problem happens in TensorFlow as well.
@zaidalyafeai you mean tf.keras?
No, this definition
https://www.tensorflow.org/api_docs/python/tf/layers/batch_normalization
@caisq This is a quote from the TensorFlow page
training: Either a Python boolean, or a TensorFlow boolean scalar tensor (e.g. a placeholder). Whether to return the output in training mode (normalized with statistics of the current batch) or in inference mode (normalized with moving statistics). NOTE: make sure to set this parameter correctly, or else your training/inference will not work properly.
@caisq, I am trying to understand the source code. Could you please explain what broadcasting is?
@caisq did we ever resolve this issue? I assume tf.layers is doing the right thing in TensorFlow.
I resolved this issue by modifying the source code and changing the definition of batch norm during inference time. My pix2pix demo is based on that!
Training with BatchNormalization should be working. See the ACGAN example under review at https://github.com/tensorflow/tfjs-examples/pull/187
I'd like to see the code you're using and the change you made in order for it to work, @zaidalyafeai , if possible.
@caisq, I may have accidentally deleted the source code :/ but the idea is simple: I just forced the batch norm layer to use the statistics of the input sample, as if it were training. So I didn't add any code, I only re-routed the existing path.
Closing this due to lack of activity, feel free to reopen. Thank you.
Setting the BatchNormalization parameter "momentum" to a small value (0.01) may help with this (it works for 1D in TensorFlow.js).
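To see why a small momentum could help, here is a rough sketch of the moving-average update, assuming the Keras-style convention `moving = moving * momentum + batch_stat * (1 - momentum)`. The function names and the per-batch means are made up for illustration and do not come from this thread.

```python
def update_moving(moving, batch_stat, momentum):
    """One moving-average step (Keras-style convention, assumed here)."""
    return moving * momentum + batch_stat * (1.0 - momentum)

# Made-up per-batch means, for illustration only.
batch_means = [2.0, 2.2, 1.9, 2.1]

def run(momentum, start=0.0):
    """Apply the update once per batch and return the final moving value."""
    moving = start
    for m in batch_means:
        moving = update_moving(moving, m, momentum)
    return moving

slow = run(momentum=0.99)  # large momentum: the stored value changes slowly
fast = run(momentum=0.01)  # small momentum: tracks the latest batch closely
print(slow, fast)
```

With momentum = 0.01 the stored statistic stays very close to the most recent batch statistic, so normalizing with moving statistics and normalizing with batch statistics should diverge less.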
TensorFlow.js version
latest
Browser version
Version 66.0.3359.139
Describe the problem or feature request
Batchnorm gives wrong predictions when training = 1 is set
Code to reproduce the bug / link to feature request
I created this simple keras model
After training, the batch norm layer weights are
After running the prediction
model.predict(np.zeros((1, 2, 2, 3)))
The output
In the browser the weights are the same, but the activations are
Explanation
In Keras, when training = 1 is set, the layer uses the statistics of the prediction sample.
TensorFlow.js uses the stored moving mean and variance of the training data.
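The difference described above can be reproduced without either library. The following is a minimal sketch in plain Python, not Keras or TensorFlow.js code. The helper name `batch_norm_channel`, the `eps` value, and the parameter values are assumptions chosen for illustration; they are not the weights from this report. It normalizes one channel of an all-zero input in both modes.

```python
import math

def batch_norm_channel(values, gamma, beta, moving_mean, moving_var,
                       training, eps=1e-3):
    """Normalize one channel's values (hypothetical helper, for illustration)."""
    if training:
        # Training mode: use the statistics of the values passed in.
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
    else:
        # Inference mode: use the stored moving statistics.
        mean, var = moving_mean, moving_var
    scale = gamma / math.sqrt(var + eps)
    return [scale * (v - mean) + beta for v in values]

# One channel of a (1, 2, 2, 3) all-zero input holds 4 values.
zeros = [0.0] * 4
# Made-up layer parameters, not the ones from the report.
params = dict(gamma=1.5, beta=0.2, moving_mean=0.4, moving_var=0.25)

train_out = batch_norm_channel(zeros, training=True, **params)
infer_out = batch_norm_channel(zeros, training=False, **params)
print(train_out)
print(infer_out)
```

With batch statistics, an all-zero input has zero mean, so every output equals beta. With moving statistics, the output depends on the stored mean and variance, which is consistent with the two runtimes producing different activations from identical weights.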