experiencor / keras-yolo2

Easy training on custom datasets. Various backends (MobileNet and SqueezeNet) supported. A YOLO demo that detects raccoons, running entirely in the browser, is accessible at https://git.io/vF7vI (not on Windows).
MIT License

Force BatchNorm to use batch mean and std during "Predict" #303

Open ValerioB88 opened 6 years ago

ValerioB88 commented 6 years ago

According to Isola et al., 2017, and some other sources, BatchNorm in GANs should use the mean and std of the current batch even at test time (where it is common practice to use the moving mean/variance accumulated during training instead). In my case, this means forcing BatchNorm to use the batch mean/std during predict.
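Concretely, the behaviour I am after is the one sketched below with plain numpy (ignoring the learned gamma/beta; batchnorm_with_batch_stats is just an illustrative helper to make explicit what I mean by "use batch statistics at predict time"):

import numpy as np

# Sketch of the desired behaviour: normalize each incoming batch with its own
# mean/variance (gamma=1, beta=0), instead of the moving mean/variance
# accumulated during training.
def batchnorm_with_batch_stats(x, eps=1e-3):
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

X = np.random.normal(loc=3.0, scale=2.0, size=(5, 2))
print(batchnorm_with_batch_stats(X))  # roughly zero-mean, unit-std per column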

I have investigated this quite a bit now, and I am pretty confused about how to achieve it. I am using Keras 2.1.4 with the TensorFlow backend.

After some investigation I saw that this is supposed to be achieved with K.set_learning_phase(1). This does not result in the expected behaviour for the predict function, even though it does result in the expected behaviour in several other situations!

For example, let's observe this simple snippet:

import keras
from keras.layers.normalization import BatchNormalization
from keras.models import Sequential
import numpy as np
from keras import backend as K

print("Version: ", keras.__version__)
np.random.seed(1)

model = Sequential()
model.add(BatchNormalization(input_shape=(2,)))
model.compile(loss='mse', optimizer='adam')

X = np.random.normal(size=(5, 2))
print("Prediction before training: ", model.predict(X))
print("Weights before training: ", [[list(w) for w in l.get_weights()] for l in model.layers])

Y = np.random.normal(size=(5, 2))

model.fit(X, Y, verbose=0, epochs=10)
#K.set_learning_phase(1)  # toggling this line (here, after fit) does not change the predictions below

print("\n\nPrediction 1: ", model.predict(X))
print("Weights after training: ", [[list(w) for w in l.get_weights()] for l in model.layers])

Now, by commenting the K.set_learning_phase line in and out, I would expect the values returned by predict to change. In fact, they do not.

Investigating this, I got even stranger results. K.set_learning_phase affects the learnt weights in a weird way: if you call set_learning_phase before fit, the model ends up with different weight values (compared to not calling it at all, or calling it after fit), and this happens regardless of its argument (0 or 1)! This also means that calling set_learning_phase before fit changes the predict values, but only because the learnt weights are different, not because predict is being computed differently.
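For reference, the closest thing to a workaround I can think of is to bypass predict entirely and build a backend function that is fed the learning phase explicitly (a rough sketch below, reusing model and X from the snippet above; I have not verified that it actually gives different values in this setup):

from keras import backend as K

# Sketch: a backend function whose learning phase is fed explicitly,
# so BatchNorm should use the batch mean/std instead of the moving averages.
predict_in_train_phase = K.function(
    [model.input, K.learning_phase()],  # inputs: data + learning phase flag
    [model.output])

train_phase_pred = predict_in_train_phase([X, 1])[0]  # 1 = training phase
test_phase_pred = model.predict(X)                    # standard inference path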

What is going on here?

Thanks

rodrigo2019 commented 6 years ago

Hi @ValerioBiscione, I think you are asking in the wrong place; this repository is an implementation of YOLOv2 using Keras. The Keras devs are in this repo.

ValerioB88 commented 6 years ago

Thank you, apologies for the mistake :)