Closed tuming1990 closed 7 years ago
self.non_trainable_weights
?
I think this doesn't work according to https://github.com/fchollet/keras/issues/2395
I think that you can use a custom constraint
Node(outbound_layer=self,
     inbound_layers=[],
     node_indices=[],
     tensor_indices=[],
     input_tensors=self.inputs,
     output_tensors=self.outputs,
     # no model-level masking for now
     input_masks=[None for _ in self.inputs],
     output_masks=[None],
     input_shapes=[x._keras_shape for x in self.inputs],
     output_shapes=[self.outputs[0]._keras_shape])
Seems model-level masking is a future plan, right? @fchollet
This seems to be a limitation with TensorFlow. The limitation I keep running into is that when the variable is added to the graph, you must define a shape that works for an entire layer. Something like this:
initial = tf.truncated_normal(shape, stddev=0.1, name='W_conv2')
W_conv2 = tf.Variable(initial)
where W_conv2 may have a shape something like [3, 3, 32, 32]. Freezing one of those 3x3 kernels (or one pixel of one of those kernels) while keeping the others trainable is not straightforward. I've got a similar question posted on stackoverflow.
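One framework-independent way to picture per-element freezing is to multiply the gradient by a 0/1 mask before the update, so the variable keeps its full shape but some entries never move. Below is a minimal NumPy sketch assuming plain SGD. The name `masked_sgd_step` and the choice of which kernel to freeze are illustrative only and are not part of TensorFlow or Keras.

```python
import numpy as np

def masked_sgd_step(w, grad, trainable_mask, lr=0.1):
    """Apply one SGD step; entries where trainable_mask is 0 stay unchanged."""
    return w - lr * grad * trainable_mask

# A kernel shaped like W_conv2 above: [3, 3, 32, 32].
rng = np.random.default_rng(0)
w = rng.normal(size=(3, 3, 32, 32))
grad = rng.normal(size=w.shape)

# Freeze one 3x3 kernel (input channel 0, output channel 5); train the rest.
mask = np.ones_like(w)
mask[:, :, 0, 5] = 0.0

w_new = masked_sgd_step(w, grad, mask)
```

Masking the gradient is only enough when the update depends on the gradient alone. An optimizer that also applies weight decay would still move the frozen entries, in which case resetting them after each step (as a constraint does) is the safer option.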
I think this feature is important. Has anyone found out how to do this?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.
Still having a similar issue. Any ideas?
This is one reason that implementation of neural networks within hardware or firmware continues to be an issue: network pruning is very limited. It's unfortunate stale bot intervened while the real issue remains open. Or perhaps it's been solved and not shared.
Maybe the tf.boolean_mask could do this trick.
Is it solved? I have the same problem.
How can this be done in Caffe?
@tuming1990
http://keras.io/getting-started/faq/#how-can-i-freeze-keras-layers may be helpful.
This may well be exactly what you need.
It certainly is exactly what I needed.
Actually, @tuming1990 asked for fine-grained freezing, while the link provided by @henry0312 is about coarse-grained freezing. The trainable parameter can only freeze whole layers. However, the author clearly asked for a way to freeze any individual weight in the tensors.
Does anybody have an update about element-wise weight freezing?
Is this solved? It is very handy to keep some weights (not necessarily the whole layer) fixed.
Any update? I am looking to do this as well. At the very least, I'd like to set certain weights in a layer to be what I want; not all the weights of a layer or in a model.
If anyone is still watching this, I have two classes (an initializer and a constraint) which may solve this problem. I have only tested them on my use case so take them with a grain of salt. They allow for initializing any slice of a given weight tensor to chosen values (FixSlice) and freezing any slice of a given weight tensor to chosen values (FreezeSlice).
import numpy as np
import tensorflow as tf
from keras import initializers
from keras.initializers import Initializer

class FixSlice(Initializer):
    """
    Initializer which forces a certain slice to be chosen values
    INPUTS:
    values - An object which can be converted into a numpy ndarray. These are
             the pre-chosen values. When using this initializer, the user should
             ensure that the dtype of values can be converted to the desired
             dtype of the weight tensor.
    slice  - A slice or tuple of slices (it is recommended to use numpy.s_ to
             specify this parameter). This specifies which entries should be
             filled with the pre-chosen values. When using this initializer,
             the user should ensure that the slice object "fits inside" the shape
             of the tensor to be initialized, and that the resulting slice of the
             tensor has the same shape as the values ndarray.
    backup - An initializer instance. The remaining values are filled using this
             initializer.
    """
    def __init__(self, values, slice, backup="glorot_uniform"):
        # Normalize values to a numpy ndarray (tensor, ndarray, pandas object, or sequence)
        if hasattr(values, "numpy"):
            self.values = values.numpy()
        elif isinstance(values, np.ndarray):
            self.values = values
        else:
            try:
                self.values = values.to_numpy()
            except AttributeError:
                self.values = np.array(values)
        self.slice = slice
        self.backup = initializers.get(backup)

    def __call__(self, shape, dtype=None):
        # Fill everything with the backup initializer, then overwrite the slice
        result = self.backup(shape, dtype=dtype).numpy()
        result[self.slice] = self.values
        return tf.Variable(result)
import numpy as np
from keras.constraints import Constraint

class FreezeSlice(Constraint):
    """
    Constraint which keeps a certain slice frozen at chosen values
    INPUTS:
    values - An object which can be converted into a numpy ndarray. These are
             the pre-chosen values. When using this constraint, the user should
             ensure that the dtype of values can be converted to the desired
             dtype of the weight tensor.
    slice  - A slice or tuple of slices (it is recommended to use numpy.s_ to
             specify this parameter). This specifies which entries should be
             filled with the pre-chosen values. When using this constraint,
             the user should ensure that the slice object "fits inside" the shape
             of the tensor to be constrained, and that the resulting slice of the
             tensor has the same shape as the values ndarray.
    """
    def __init__(self, values, slice):
        # Normalize values to a numpy ndarray (tensor, ndarray, pandas object, or sequence)
        if hasattr(values, "numpy"):
            self.values = values.numpy()
        elif isinstance(values, np.ndarray):
            self.values = values
        else:
            try:
                self.values = values.to_numpy()
            except AttributeError:
                self.values = np.array(values)
        self.slice = slice

    def __call__(self, w):
        # zs holds the frozen values inside the slice and 0 elsewhere
        zs = np.zeros(w.shape)
        zs[self.slice] = self.values
        # os is 0 inside the slice and 1 elsewhere, so w * os clears the slice
        os = np.ones(w.shape)
        os[self.slice] = 0
        return w * os + zs
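The arithmetic in FreezeSlice.__call__ (clear the slice, then add the frozen values back) can be checked in isolation. Here is a small NumPy-only sketch of that step. The helper name `freeze_slice` is made up for illustration and is not part of Keras or of the classes above.

```python
import numpy as np

def freeze_slice(w, values, index):
    """Return a copy of w with w[index] reset to values; other entries unchanged."""
    fixed = np.zeros(w.shape, dtype=w.dtype)
    fixed[index] = values          # frozen values inside the slice, 0 elsewhere
    keep = np.ones(w.shape, dtype=w.dtype)
    keep[index] = 0                # 0 inside the slice, 1 elsewhere
    return w * keep + fixed

w = np.arange(6, dtype=np.float32).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
out = freeze_slice(w, values=[7.0, 8.0], index=np.s_[0, 1:3])
# out[0, 1:3] is reset to [7, 8]; every other entry still equals w
```

Because the constraint runs after each weight update, whatever the optimizer did to the slice is discarded and the chosen values are restored.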
Can you provide a specific example of its application?
model.add(Dense(1, input_dim=2, kernel_constraint=FreezeSlice([1], np.s_[0])))
In this example, I set the value of the first weight to 1.
Thank you, it seems to work for a single dimension, but I can't get it working on a multidimensional array. The constraint always freezes a full row of weights, and I didn't find a way to use it on a single element. What I would like is something freely addressable, for example:
t1 = tf.constant([[0,0,0],[0,0,0],[0,0,0]])
t2 = tf.tensor_scatter_nd_update(t1, indices=[[0,0],[1,1],[2,2]], updates=[10,10,10])
will give as a result
[10 0 0]
[0 10 0]
[0 0 10]
Do you think it is possible with something similar to your implementation?
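In NumPy indexing, `np.s_[0]` on a 2-D array selects the whole first row, while `np.s_[0, 0]` selects a single element, and a tuple of index arrays selects scattered elements. The sketch below applies the same clear-then-add arithmetic to an index list in the style of tensor_scatter_nd_update. The helper name `freeze_elements` is hypothetical, and I have not run this against the FreezeSlice class itself.

```python
import numpy as np

def freeze_elements(w, indices, values):
    """Reset the entries of w listed in indices to values; leave the rest alone."""
    # [[0, 0], [1, 1], [2, 2]] -> (array of rows, array of cols)
    idx = tuple(np.asarray(indices).T)
    fixed = np.zeros_like(w)
    fixed[idx] = values
    keep = np.ones_like(w)
    keep[idx] = 0
    return w * keep + fixed

w = np.full((3, 3), 2.0)
out = freeze_elements(w, indices=[[0, 0], [1, 1], [2, 2]], values=[10, 10, 10])
# The diagonal is forced to 10; off-diagonal entries keep their value of 2
```

Since a constraint of the FreezeSlice kind only uses its slice argument for NumPy assignment, passing such a tuple of index arrays in place of a slice may be enough to address individual elements.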
Hey, if anyone is still reading this. It seems to be impossible to make specific weights inside a layer untrainable: you either set the whole layer trainable or untrainable. Instead, I built a custom Constraint and Initializer class to freeze specific weights inside a 2D convolution filter, inspired by @Doekeb. For example, if I'm using a 3x3 filter and I want to freeze all the weights in the corners, I'll create a mask like this: [[False, True, False], [True, True, True], [False, True, False]], where True indicates that the weight is trainable and False means frozen.
import numpy as np
import tensorflow as tf
from keras.constraints import Constraint
from keras.initializers import Initializer

class FreezeConv(Constraint):
    '''
    Inherits from keras.constraints.Constraint.
    This class resets specific kernel weights to 0 after back propagation according to the mask parameter.
    '''
    def __init__(self, mask, in_channel_axis, out_channel_axis):
        self.mask = mask
        self.in_channel_axis = in_channel_axis
        self.out_channel_axis = out_channel_axis

    def __call__(self, w):
        nb_out_channel = w.shape[self.out_channel_axis]
        nb_in_channel = w.shape[self.in_channel_axis]
        # A (3x3) kernel in reality has weights of shape (3,3,input_channels,output_channels)
        # while the mask has shape (3,3). We do a tf.repeat and reshape to the right size, then multiply it with the weights w.
        try:
            reshaped_mask = tf.reshape(
                tf.repeat(self.mask, nb_in_channel*nb_out_channel),
                shape=w.shape
            )
            return w * tf.cast(reshaped_mask, dtype=w.dtype)
        except Exception:
            # Fall back to the unconstrained weights if the mask cannot be reshaped
            print("error")
            print(nb_in_channel, nb_out_channel)
            print(tf.repeat(self.mask, nb_in_channel*nb_out_channel).shape)
            return w

class FreezeInit(Initializer):
    '''
    Inherits from keras.initializers.Initializer.
    This class initializes random kernel weights following a gaussian distribution.
    But it sets some specific weights within the kernel to 0 according to the mask parameter.
    '''
    def __init__(self, mask):
        # Store the mask as a boolean array so it can be counted and inverted
        self.mask = np.asarray(mask, dtype=bool)
        # track the number of frozen weights
        self.frozen_weights = int((~self.mask).sum())

    def __call__(self, shape, dtype=None):
        dtype = dtype or tf.float32
        try:
            kernel_height, kernel_width, in_filters, out_filters = shape
            fan_out = int(kernel_height * kernel_width * out_filters)
            reshaped_mask = tf.reshape(
                tf.repeat(self.mask, in_filters*out_filters),
                shape=shape
            )
            self.frozen_weights *= in_filters*out_filters
            return tf.random.normal(shape, mean=0.0, stddev=np.sqrt(2.0 / fan_out), dtype=dtype) * tf.cast(reshaped_mask, dtype=dtype)
        except Exception:
            # Report and re-raise: an initializer must not silently return None
            print("error")
            raise
For example:
mask = np.array([
    [False, True, False],
    [True, True, True],
    [False, True, False]
])
# Channel axes for a Conv2D kernel of shape (height, width, in_channels, out_channels)
Freeze_ = FreezeConv(mask, in_channel_axis=2, out_channel_axis=3)
FreezeInit_ = FreezeInit(mask)
MyConv2D = Conv2D(32, kernel_size=(3,3), strides=(1,1), padding='same', kernel_initializer=FreezeInit_, kernel_constraint=Freeze_)
This should set the corners to 0 at initialization thanks to FreezeInit_, and also after every back propagation step thanks to Freeze_. One cool thing is that it slightly reduces overfitting in specific cases, where you know for certain that the corner weights (or the weights at any other position of the filter) aren't relevant for your model. You're also technically using fewer trainable weights than model.summary() says you are.
Note: I wrote this code for my own specific purposes, so it might not work everywhere; feel free to adapt it. I tried it on normal Conv2D, DepthwiseConv2D, and grouped convolutions, and it worked just fine.
Note 2: The ideal way would be to freeze certain weights directly when defining the layers, so we don't waste computation time back-propagating gradients and then resetting the weights to 0 like I did. But that isn't possible in Keras at the moment.
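The repeat-and-reshape step used to expand the (3,3) mask can be reproduced in NumPy to see what it does. This is a sketch under the assumption that the kernel layout is (height, width, in_channels, out_channels); np.repeat without an axis flattens its input in the same way as tf.repeat. The channel counts below are arbitrary.

```python
import numpy as np

mask = np.array([[False, True, False],
                 [True,  True, True],
                 [False, True, False]])
in_ch, out_ch = 4, 8

# Same construction as the constraint: repeat each entry, then reshape.
repeated = np.repeat(mask, in_ch * out_ch).reshape(3, 3, in_ch, out_ch)

# Equivalent broadcasting form: add two trailing axes of length 1.
broadcast = np.broadcast_to(mask[:, :, None, None], (3, 3, in_ch, out_ch))

# Multiplying by the mask zeroes the four corners in every channel pair.
w = np.ones((3, 3, in_ch, out_ch), dtype=np.float32)
masked = w * repeated
frozen_count = int((masked == 0).sum())   # 4 corners * in_ch * out_ch
```

Both forms give the same array, so the mask could also be applied by broadcasting, which avoids materializing the repeated copy.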
I want to keep some weights fixed while training the neural network, which means not updating these weights after they are initialized.
"Some weights" means some values in the weight matrices, not specific rows or columns, and not the weight matrix of a specific layer. They can be any elements in the weight matrices.
Is there a way to do this in Keras? I know Caffe can do this by setting a mask to the weight matrix so the masked weight will not affect the output.