Closed franciscovargas closed 6 years ago
Potential workaround:
import numpy as np
import keras.backend as K

# for some model with dropout ...
f = K.function([model.layers[0].input, K.learning_phase()],
               [model.layers[-1].output])

def predict_with_uncertainty(f, x, no_classes, n_iter=100):
    # Collect n_iter stochastic forward passes with learning phase 1 (dropout on).
    result = np.zeros((n_iter,) + (x.shape[0], no_classes))
    for i in range(n_iter):
        result[i, :, :] = f((x, 1))[0]
    prediction = result.mean(axis=0)
    uncertainty = result.std(axis=0)
    return prediction, uncertainty
@franciscovargas that workaround seems to be correct, since it was used by Gal in the implementation for the experiments of the paper Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. See the implementation here.
Still, it would be nice to have this built into Keras so that it works nicely with the model predict functions.
Thanks, I wish I had seen that earlier on today :D ...
There is this feature in Keras: it's the training argument in the call of the Dropout layer.
Here's a model with a Dense layer and a Dropout layer that runs both in training and testing:
import keras
inputs = keras.Input(shape=(10,))
x = keras.layers.Dense(3)(inputs)
outputs = keras.layers.Dropout(0.5)(x, training=True)
model = keras.Model(inputs, outputs)
Maybe worth adding to the docs to save more questions in the future, since I can't see it in the core layers documentation for Dropout; no such parameter is mentioned. It was not immediately clear to me when reading the source that the training flag was for this.
In the implementation with the training=True parameter in the Dropout layer, are the values scaled in the training phase? Are the values scaled in the prediction phase? I am not sure what the parameter training=True is doing.
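On the scaling question: as far as I understand the Keras docs, Dropout is "inverted" dropout, so whenever it is active the surviving units are multiplied by 1 / (1 - rate), and when it is inactive the input passes through unchanged. With training=True the layer is active in both fit and predict, so the scaling would happen in both. A minimal pure-Python sketch of that behaviour follows; inverted_dropout is a hypothetical helper written for illustration, not the Keras implementation.

```python
import random

def inverted_dropout(values, rate, training, rng):
    """Toy inverted dropout on a list of floats (illustrative, not Keras code)."""
    if not training or rate == 0.0:
        # Inactive: values pass through untouched, no scaling.
        return list(values)
    keep_prob = 1.0 - rate
    # Active: each unit is either zeroed or scaled up by 1 / keep_prob.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(0)
x = [1.0] * 8
active = inverted_dropout(x, rate=0.5, training=True, rng=rng)
inactive = inverted_dropout(x, rate=0.5, training=False, rng=rng)

# Averaging many stochastic passes recovers roughly the undropped input,
# which is why no extra rescaling is needed at prediction time.
n_iter = 20000
mc_mean = sum(inverted_dropout([1.0], 0.5, True, rng)[0] for _ in range(n_iter)) / n_iter
```

Note that this expectation argument only holds exactly for a linear output; with nonlinear layers after the dropout, the Monte Carlo mean and the deterministic prediction can differ.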
@franciscovargas Your method works for me but it seems to cause a memory leak. #10338
When I use LSTM(recurrent_dropout=0.5) and want to keep the recurrent dropout in the test phase, is the following code right?
import keras
inputs = keras.Input(shape=(10,))
x = keras.layers.LSTM(10, recurrent_dropout=0.5)(inputs, training=True)
x = keras.layers.Dense(3)(x)
outputs = keras.layers.Dropout(0.5)(x, training=True)
model = keras.Model(inputs, outputs)
@fchollet thanks a lot !!! works like a charm
Does the training=True option work with LSTM layers with recurrent_dropout as well?
This doesn't seem to work with SpatialDropout layers, any suggestions?
Great thread, but how can I use training=True in the Sequential API? For example:
model = Sequential()
model.add(LSTM(...))
model.add(Dropout(0.2))
...
Is this documented anywhere?
I've just stumbled across the same problem. The general question is how to override Keras call methods so that the training argument of the functional call style can also be used with the classical Sequential API. My hacky quick fix was to inherit from the keras.layers.Dropout class and override its call method. In addition, I added the kwarg training to the __init__ method before calling super with the arguments expected by the base class.
import keras
import keras.backend as K

class Dropout(keras.layers.Dropout):
    """Applies Dropout to the input.

    Dropout consists in randomly setting
    a fraction `rate` of input units to 0 at each update during training time,
    which helps prevent overfitting.

    # Arguments
        rate: float between 0 and 1. Fraction of the input units to drop.
        training: default training flag used when `call` receives none.
        noise_shape: 1D integer tensor representing the shape of the
            binary dropout mask that will be multiplied with the input.
            For instance, if your inputs have shape
            `(batch_size, timesteps, features)` and
            you want the dropout mask to be the same for all timesteps,
            you can use `noise_shape=(batch_size, 1, features)`.
        seed: A Python integer to use as random seed.

    # References
        - [Dropout: A Simple Way to Prevent Neural Networks from Overfitting](
           http://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf)
    """
    def __init__(self, rate, training=None, noise_shape=None, seed=None, **kwargs):
        # Forward the actual arguments (the original snippet passed None for both).
        super(Dropout, self).__init__(rate, noise_shape=noise_shape, seed=seed, **kwargs)
        self.training = training

    def call(self, inputs, training=None):
        if 0. < self.rate < 1.:
            noise_shape = self._get_noise_shape(inputs)

            def dropped_inputs():
                return K.dropout(inputs, self.rate, noise_shape,
                                 seed=self.seed)

            # Fall back to the flag given at construction time.
            if not training:
                return K.in_train_phase(dropped_inputs, inputs, training=self.training)
            return K.in_train_phase(dropped_inputs, inputs, training=training)
        return inputs
Now you can just pass the argument when adding layers via the Sequential API, such as:
model.add(keras.layers.Dense(512, activation="relu"))
model.add(Dropout(rate=0.5, training=True))
model.add(keras.layers.Dense(256, activation="relu"))
model.add(Dropout(rate=0.5, training=True))
model.add(keras.layers.Dense(2, activation="softmax"))
Can you also switch back to the non-dropout prediction after compiling? Or is it compiled in and do you need to make a separate model and transfer the weights?
@franciscovargas thanks for the workaround.
One question I have is whether Keras rescales the weights during the test phase when dropout is 'enabled'. Theoretically, the average you obtain from MC dropout should be similar to the prediction you get when you use all the connections for the same input. However, in my case the output from MC dropout is always much smaller than the prediction without dropout.
@fchollet If I use training=True to enable the Dropout, is it possible to turn it off in the testing phase when necessary?
The workaround fails (error in defining K.function) due to the issue mentioned in https://github.com/tensorflow/tensorflow/issues/34201
@MalteEbner : See my suggestion here: https://github.com/tensorflow/tensorflow/issues/34201#issuecomment-577596280
Has anything changed in tf now? I am getting the same predictions with the suggested snippet.
@gieses I was wondering too. Uncertainty is always zero
Regarding the LSTM example above with recurrent_dropout=0.5 and training=True: did you figure out whether that keeps the recurrent dropout active in the test phase?
http://www.cs.ox.ac.uk/people/yarin.gal/website/blog_2248.html
As mentioned in this blog post, written by the inventor of MC dropout, fixing the dropped weights across all test inputs makes for better visualizations.
Does anyone have a solution for fixing the dropout weights using the keras dropout?
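One way to read the blog's suggestion, sketched in plain Python rather than Keras: draw one dropout mask per Monte Carlo sample and reuse it for every test input, so each sample traces out one consistent function over the input range. The helper names (sample_mask, forward) and the tiny one-hidden-layer network with random weights are hypothetical and only there to make the idea concrete. In Keras itself, a noise_shape with 1 in the batch dimension should share the mask across a batch, which may give a similar effect when all test inputs are passed in a single batch; treat that as an assumption to check against your version.

```python
import random

def sample_mask(n_units, rate, rng):
    """One inverted-dropout mask: 0 for dropped units, 1/(1-rate) for kept ones."""
    keep_prob = 1.0 - rate
    return [1.0 / keep_prob if rng.random() < keep_prob else 0.0
            for _ in range(n_units)]

def forward(x, w_in, w_out, mask):
    """Toy network: scalar input -> ReLU hidden layer -> masked -> scalar output."""
    hidden = [max(0.0, w * x) for w in w_in]
    return sum(h * m * w for h, m, w in zip(hidden, mask, w_out))

rng = random.Random(0)
n_units = 16
w_in = [rng.uniform(-1, 1) for _ in range(n_units)]
w_out = [rng.uniform(-1, 1) for _ in range(n_units)]
test_inputs = [i / 10 for i in range(-10, 11)]

# One mask per MC sample, reused for all test inputs of that sample.
n_samples = 5
curves = []
for _ in range(n_samples):
    mask = sample_mask(n_units, rate=0.5, rng=rng)
    curves.append([forward(x, w_in, w_out, mask) for x in test_inputs])
```

Each entry of curves is then one sampled function evaluated on the whole grid, which can be plotted as a single line instead of a cloud of independently perturbed points.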
Old thread, but another solution is a layer wrapper. This turned out to be useful in my case:
import tensorflow as tf

class AlwaysInTrain(tf.keras.layers.Wrapper):
    def __call__(self, inputs, *args, **kwargs):
        # Always forward to the wrapped layer in training mode.
        return self.layer(inputs, *args, **kwargs, training=True)

# use as follows
x = AlwaysInTrain(tf.keras.layers.Dropout(0.5))(x)
I am trying to use the keras.backend workaround, but I received the following error:
ValueError: Input tensors to a Functional must come from `tf.keras.Input`.
Received: 0 (missing previous layer metadata).
Could anyone please help me with this issue?
I find that with Tensorflow version 2.5, it's much easier. Just call the model like this:
model(X, training=True)
That's it! (This also works for models that were loaded from disk)
It also works in TF 2.3.
I'm a bit sceptical about the proposed solutions that enable training mode for the entire network and not just for the dropout layers. My understanding is that other layers will be affected as well, which might have side effects: if BatchNorm is in training mode during MC inference, you would update the layer statistics every single time you run the forward pass. So the only correct solutions here are the ones that modify only the dropout layers. The other solutions only work for networks without BatchNorm. Please correct me if I'm wrong!
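To make the BatchNorm concern concrete, here is a toy batch-normalisation layer in plain Python. ToyBatchNorm is a hypothetical stand-in, not the Keras layer, and it is deliberately simplified (mean only, no variance or learned parameters). Assuming the real layer behaves analogously, it shows the two effects described above: in training mode the output depends on the current batch statistics, and the stored moving average is changed by the forward pass.

```python
class ToyBatchNorm:
    """Simplified 1-feature batch norm (mean only, no learned scale/offset)."""

    def __init__(self, momentum=0.9):
        self.momentum = momentum
        self.moving_mean = 0.0

    def __call__(self, batch, training=False):
        if training:
            batch_mean = sum(batch) / len(batch)
            # Training mode: update the stored statistic as a side effect.
            self.moving_mean = (self.momentum * self.moving_mean
                                + (1.0 - self.momentum) * batch_mean)
            return [v - batch_mean for v in batch]
        # Inference mode: use the stored statistic and leave it untouched.
        return [v - self.moving_mean for v in batch]

bn = ToyBatchNorm()
batch = [1.0, 2.0, 3.0]

before = bn.moving_mean
inference_out = bn(batch, training=False)
unchanged = bn.moving_mean == before   # inference leaves the state alone

training_out = bn(batch, training=True)
changed = bn.moving_mean != before     # a training-mode pass mutates it
```

This is why passing training=True only to the Dropout layers, rather than to the whole model, is the safer choice when the network also contains normalisation layers.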
As mentioned in issue #5357 (https://github.com/keras-team/keras/issues/5357#issuecomment-350276900) by @spearsem and @alexchao56 it would be nice if we could enable dropout in the prediction stage of the model and not just in training.
There is solid work motivating this use case as an approximation to Bayesian deep learning http://proceedings.mlr.press/v48/gal16.pdf (in this case as a variational approximation to deep GPs).
Ideally one would be able to run predict multiple times and use the expected value of these predictions as an estimate of the overall prediction and its std to quantify the uncertainty around the prediction.
Other than the feature request, is there a way to work around the current setup in Keras to achieve this?
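The "run predict multiple times" idea from this post can be sketched without Keras. In the snippet below, stochastic_predict is a hypothetical stand-in for a single forward pass with dropout active (for example the K.function call or model(x, training=True) mentioned elsewhere in the thread); the noisy toy model is made up, and only the aggregation step is the point.

```python
import random
import statistics

def mc_predict(stochastic_predict, x, n_iter=100):
    """Run a stochastic predictor n_iter times; return (mean, std) of the outputs."""
    samples = [stochastic_predict(x) for _ in range(n_iter)]
    # pstdev is the population standard deviation, matching numpy's default std.
    return statistics.fmean(samples), statistics.pstdev(samples)

rng = random.Random(0)

def stochastic_predict(x):
    # Hypothetical noisy model: true output 2*x plus Gaussian noise.
    return 2.0 * x + rng.gauss(0.0, 0.1)

prediction, uncertainty = mc_predict(stochastic_predict, x=1.5, n_iter=1000)
```

The mean serves as the overall prediction and the standard deviation as a rough measure of the uncertainty around it; if every pass returns the same value, the standard deviation is zero, which usually means dropout was not actually active.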