keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0
61.86k stars 19.44k forks

Is it possible to keep some weights fixed during training #2880

Closed tuming1990 closed 7 years ago

tuming1990 commented 8 years ago

I want to keep some weights fixed while training a neural network, i.e. not update them after they are initialized.

"Some weights" means individual values in the weight matrices, not specific rows or columns and not the whole weight matrix of a specific layer. They can be any elements of the weight matrices.

Is there a way to do this in Keras? I know Caffe can do this by setting a mask to the weight matrix so the masked weight will not affect the output.

xingdi-eric-yuan commented 8 years ago

self.non_trainable_weights ?

tuming1990 commented 8 years ago

I think this doesn't work according to https://github.com/fchollet/keras/issues/2395

sileod commented 8 years ago

I think that you can use a custom constraint

henry0312 commented 8 years ago

http://keras.io/getting-started/faq/#how-can-i-freeze-keras-layers may be helpful.

tuming1990 commented 8 years ago

            Node(outbound_layer=self,
                 inbound_layers=[],
                 node_indices=[],
                 tensor_indices=[],
                 input_tensors=self.inputs,
                 output_tensors=self.outputs,
                 # no model-level masking for now
                 input_masks=[None for _ in self.inputs],
                 output_masks=[None],
                 input_shapes=[x._keras_shape for x in self.inputs],
                 output_shapes=[self.outputs[0]._keras_shape])

Seems model-level masking is a future plan, right? @fchollet

harchankoj commented 7 years ago

This seems to be a limitation with TensorFlow. The limitation I keep running into is that when the variable is added to the graph, you must define a shape that works for an entire layer. Something like this:

initial = tf.truncated_normal(shape, stddev=0.1, name='W_conv2')
W_conv2 = tf.Variable(initial)

where W_conv2 may have a shape like [3, 3, 32, 32]. Freezing one of those 3x3 kernels (or one pixel of one of those kernels) while keeping the others trainable is not straightforward. I've posted a similar question on Stack Overflow:

http://stackoverflow.com/questions/42517926/how-to-freeze-lock-weights-of-one-tensorflow-variable-e-g-one-cnn-kernel-of-o
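
One way to picture the element-wise freezing being asked about is to mask the gradient before the update, so that frozen entries receive a zero step. The following is a minimal NumPy sketch of that idea, not TensorFlow or Keras API; the names `masked_sgd_step` and `trainable_mask`, the learning rate, and the stand-in gradient are assumptions made for illustration.

```python
import numpy as np

def masked_sgd_step(w, grad, trainable_mask, lr=0.1):
    """Plain SGD step that leaves entries with trainable_mask == 0 untouched."""
    return w - lr * grad * trainable_mask

rng = np.random.default_rng(0)
shape = (3, 3, 32, 32)                 # same layout as W_conv2 above
w = rng.normal(scale=0.1, size=shape)
grad = rng.normal(size=shape)          # stand-in for a real gradient

# Freeze one 3x3 kernel (input channel 0 -> output channel 5); train the rest.
trainable_mask = np.ones(shape)
trainable_mask[:, :, 0, 5] = 0.0

w_new = masked_sgd_step(w, grad, trainable_mask)
frozen_unchanged = np.array_equal(w_new[:, :, 0, 5], w[:, :, 0, 5])
others_changed = not np.array_equal(w_new[:, :, 1, 5], w[:, :, 1, 5])
```

With a stateful optimizer such as momentum or Adam, the mask would presumably have to be applied to the gradient before it reaches the optimizer, so that no accumulated state builds up for the frozen entries.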

YuanhaoGong commented 7 years ago

I think this feature is important. Has anyone found out how to do this?

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

caugusta commented 6 years ago

Still having a similar issue. Any ideas?

harchankoj commented 6 years ago

This is one reason that implementation of neural networks within hardware or firmware continues to be an issue: network pruning is very limited. It's unfortunate stale bot intervened while the real issue remains open. Or perhaps it's been solved and not shared.

Prof-pengyin commented 6 years ago

Maybe tf.boolean_mask could do the trick.

SunTongtongtong commented 6 years ago

Is it solved? I have the same problem.

theforthforgotten commented 5 years ago

How can this be done in Caffe?

theforthforgotten commented 5 years ago

@tuming1990

otdewiljes commented 5 years ago

http://keras.io/getting-started/faq/#how-can-i-freeze-keras-layers may be helpful.

This may well be exactly what you need.

It certainly is exactly what I needed.

N0ciple commented 5 years ago

Actually, @tuming1990 asked for fine-grained freezing, while the link provided by @henry0312 is about coarse-grained freezing: the trainable parameter can only freeze whole layers. However, the author clearly asked for a way to freeze any individual weight in the tensors.

Does anybody have an update on element-wise weight freezing?

fqassemi commented 4 years ago

Is this solved? It is very handy to keep some weights (not necessarily the whole layer) fixed.

ryanmaxwell96 commented 4 years ago

Any update? I am looking to do this as well. At the very least, I'd like to set certain weights in a layer to be what I want; not all the weights of a layer or in a model.

Doekeb commented 4 years ago

If anyone is still watching this, I have two classes (an initializer and a constraint) which may solve this problem. I have only tested them on my use case so take them with a grain of salt. They allow for initializing any slice of a given weight tensor to chosen values (FixSlice) and freezing any slice of a given weight tensor to chosen values (FreezeSlice).

import numpy as np
import tensorflow as tf
from keras import initializers
from keras.initializers import Initializer

class FixSlice(Initializer):
    """
    Initializer which forces a certain slice to be chosen values

    INPUTS:

    values - An object which can be converted into a numpy ndarray. These are
             the pre-chosen values. When using this initializer, the user should
             ensure that the dtype of values can be converted to the desired
             dtype of the weight tensor.

    slice - A slice or tuple of slices (it is recommended to use numpy.s_ to
            specify this parameter). This specifies which entries should be
            filled with the pre-chosen values. When using this initializer,
            the user should ensure that the slice object "fits inside" the shape
            of the tensor to be initialized, and that the resulting slice of the
            tensor has the same shape as the values ndarray.

    backup - An initializer instance. The remaining values are filled using this
             initializer.
    """
    def __init__(self, values, slice, backup="glorot_uniform"):
        # Normalize values to a numpy ndarray, whatever the input type
        if hasattr(values, "numpy"):          # e.g. an eager tensor
            self.values = values.numpy()
        elif hasattr(values, "to_numpy"):     # e.g. a pandas object
            self.values = values.to_numpy()
        else:
            self.values = np.asarray(values)

        self.slice = slice
        self.backup = initializers.get(backup)

    def __call__(self, shape, dtype=None):
        # Fill with the backup initializer, then overwrite the chosen slice
        result = self.backup(shape, dtype=dtype).numpy()
        result[self.slice] = self.values
        # Initializers are expected to return a tensor, not a Variable
        return tf.convert_to_tensor(result)

import numpy as np
from keras.constraints import Constraint

class FreezeSlice(Constraint):
    """
    Constraint which keeps a certain slice frozen at chosen values

    INPUTS:

    values - An object which can be converted into a numpy ndarray. These are
             the pre-chosen values. When using this constraint, the user should
             ensure that the dtype of values can be converted to the desired
             dtype of the weight tensor.

    slice - A slice or tuple of slices (it is recommended to use numpy.s_ to
            specify this parameter). This specifies which entries should be
            kept at the pre-chosen values. When using this constraint,
            the user should ensure that the slice object "fits inside" the shape
            of the constrained tensor, and that the resulting slice of the
            tensor has the same shape as the values ndarray.
    """
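    def __init__(self, values, slice):
        # Normalize values to a numpy ndarray, whatever the input type
        if hasattr(values, "numpy"):          # e.g. an eager tensor
            self.values = values.numpy()
        elif hasattr(values, "to_numpy"):     # e.g. a pandas object
            self.values = values.to_numpy()
        else:
            self.values = np.asarray(values)

        self.slice = slice

    def __call__(self, w):
        # zs holds the chosen values at the frozen positions, 0 elsewhere
        zs = np.zeros(w.shape)
        zs[self.slice] = self.values
        # keep is 0 at the frozen positions, 1 elsewhere
        keep = np.ones(w.shape)
        keep[self.slice] = 0
        # Reset the frozen entries; leave the others untouched
        return w * keep + zs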
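
The arithmetic in `FreezeSlice.__call__` can be checked without Keras. The sketch below replays the `w * os + zs` step on a plain NumPy array standing in for a weight tensor after an optimizer step; the helper name `freeze_slice` and the sample numbers are made up for illustration.

```python
import numpy as np

def freeze_slice(w, values, index):
    """Reset w[index] to values and leave every other entry as it is."""
    frozen = np.zeros(w.shape)
    frozen[index] = values          # chosen values at the frozen positions
    keep = np.ones(w.shape)
    keep[index] = 0                 # 0 at frozen positions, 1 elsewhere
    return w * keep + frozen

# Pretend this is a (2, 1) Dense kernel right after a gradient update.
w_after_step = np.array([[0.37], [-1.20]])
w_projected = freeze_slice(w_after_step, [1.0], np.s_[0])
```

Because the projection runs after each update, the frozen entries still receive gradients; they are simply overwritten afterwards.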

saroj7pathak commented 3 years ago

Can you provide a specific example of its application?

NioushaBagheri commented 3 years ago

model.add(Dense(1, input_dim=2, kernel_constraint=FreezeSlice([1], np.s_[0])))

In this example, I set the value of the first weight to 1.

artemglukhov commented 2 years ago

Thank you, it seems to work for a single dimension, but I can't get it working on a multidimensional array. The constraint always freezes a full row of weights, and I haven't found a way to use it on a single element. What I would like is something freely addressable, for example:

t1 = tf.constant([[0,0,0],[0,0,0],[0,0,0]])
t2 = tf.tensor_scatter_nd_update(t1, indices=[[0,0],[1,1],[2,2]], updates=[10,10,10])

will give as a result

[10 0 0]
[0 10 0]
[0 0 10]

Do you think it is possible with something similar to your implementation?
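
NumPy integer-array indexing may cover this case: a tuple of index lists addresses individual elements rather than whole rows, much like the `indices` argument of `tf.tensor_scatter_nd_update` above. The sketch below is NumPy only; whether passing such a tuple as the `slice` argument of `FreezeSlice` behaves the same inside Keras is an assumption that has not been verified here.

```python
import numpy as np

rows = [0, 1, 2]
cols = [0, 1, 2]
index = (rows, cols)               # addresses (0,0), (1,1) and (2,2) only

t = np.zeros((3, 3), dtype=int)
t[index] = [10, 10, 10]            # fills the diagonal, leaves the rest at 0

# A single element is addressed with a plain tuple, e.g. np.s_[0, 2].
single = np.zeros((3, 3), dtype=int)
single[np.s_[0, 2]] = 7
```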

MichelHUANGGit commented 7 months ago

Hey, if anyone is still reading this: it seems to be impossible to make specific weights inside a layer untrainable; you can only set the whole layer trainable or untrainable. Instead, I built custom Constraint and Initializer classes to freeze specific weights inside a 2D convolution filter, inspired by @Doekeb. For example, if I'm using a 3x3 filter and I want to freeze all the weights in the corners, I'll create a mask like this: [[False, True, False], [True, True, True], [False, True, False]], where True indicates that the weight is trainable and False means frozen.

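import numpy as np
import tensorflow as tf
from keras.constraints import Constraint
from keras.initializers import Initializer
from keras.layers import Conv2D

class FreezeConv(Constraint):
    '''
    Inherits from keras.constraints.Constraint.
    This class resets specific kernel weights to 0 after each backpropagation step, according to the mask parameter.
    '''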
    def __init__(self, mask, in_channel_axis, out_channel_axis):
        self.mask = mask
        self.in_channel_axis = in_channel_axis
        self.out_channel_axis = out_channel_axis

    def __call__(self, w):
        nb_out_channel = w.shape[self.out_channel_axis]
        nb_in_channel = w.shape[self.in_channel_axis]
        # A (3x3) kernel in reality has weights of shape (3,3,input_channels,output_channels)
        # while the mask has shape (3,3). We do a tf.repeat and reshape to the right size, then multiply it to the weights w.
        try :
          reshaped_mask = tf.reshape(
              tf.repeat(self.mask, nb_in_channel*nb_out_channel),
              shape=w.shape
          )
          return w * tf.cast(reshaped_mask, dtype=w.dtype)
        except :
          print("error")
          print(nb_in_channel, nb_out_channel)
          print(tf.repeat(self.mask, nb_in_channel*nb_out_channel).shape)
          return w

class FreezeInit(Initializer):
    '''
    Inherits from keras.initializers.Initializer.
    This class initializes random kernel weights following a gaussian distribution.
    But it sets some specific weights within the kernel to 0 according to the mask parameter.
    '''
    def __init__(self, mask):
        # Store as a boolean array so that ~mask and .sum() also work on list input
        self.mask = np.asarray(mask, dtype=bool)
        # track the number of frozen weights
        self.frozen_weights = int((~self.mask).sum())

    def __call__(self, shape, dtype=None):
        if dtype is None:
            dtype = tf.float32
        kernel_height, kernel_width, in_filters, out_filters = shape
        fan_out = int(kernel_height * kernel_width * out_filters)
        # Tile the 2D mask across the input and output channel axes
        reshaped_mask = tf.reshape(
            tf.repeat(self.mask, in_filters*out_filters),
            shape=shape
        )
        self.frozen_weights *= in_filters*out_filters
        return tf.random.normal(shape, mean=0.0, stddev=np.sqrt(2.0 / fan_out), dtype=dtype) * tf.cast(reshaped_mask, dtype=dtype)
For example:

mask = [
  [False, True,  False],
  [True,  True,  True],
  [False, True,  False]
]
# Axes 2 and 3 are the input and output channel axes of a (3, 3, in, out) kernel
Freeze_ = FreezeConv(mask, in_channel_axis=2, out_channel_axis=3)
FreezeInit_ = FreezeInit(mask)
MyConv2D = Conv2D(32, kernel_size=(3,3), strides=(1,1), padding='same', kernel_initializer=FreezeInit_, kernel_constraint=Freeze_)

This should set the corners to 0 at initialization thanks to FreezeInit_, and again after every backpropagation step thanks to Freeze_. One nice side effect is that it can slightly reduce overfitting in specific cases, where you know for certain that the corner weights (or the weights at any other position of the filter) aren't relevant for your model. You are also technically using fewer trainable weights than model.summary() reports.

Note: I wrote this code for my own specific purposes, so it might not work everywhere; feel free to adapt it. I tried it on regular Conv2D, DepthwiseConv2D, and grouped convolutions, and it worked just fine.

Note 2: The ideal way would be to freeze certain weights directly when defining the layers, so that no computation is wasted backpropagating gradients only to reset the weights to 0 as I did. But this isn't possible in Keras at the moment.
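
The `tf.repeat` plus `tf.reshape` step in `FreezeConv` tiles the 2D mask across the channel axes. The NumPy sketch below (`np.repeat` flattens its input in the same way when no axis is given) checks that this is equivalent to broadcasting the mask over two trailing axes; the channel counts are arbitrary and the variable names are illustrative.

```python
import numpy as np

mask = np.array([
    [False, True, False],
    [True,  True, True],
    [False, True, False],
])
in_channels, out_channels = 4, 8
shape = (3, 3, in_channels, out_channels)

# Approach used above: repeat each mask entry, then reshape to the kernel shape.
tiled = np.repeat(mask, in_channels * out_channels).reshape(shape)

# Alternative: add two trailing axes and let broadcasting do the tiling.
broadcast = np.broadcast_to(mask[:, :, None, None], shape)

same = np.array_equal(tiled, broadcast)
frozen_count = int((~tiled).sum())   # 4 corners * 4 * 8 = 128
```

This equivalence only holds for the (height, width, in, out) kernel layout mentioned in the code comment; other layouts would need the new axes placed differently.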