keras-team/keras

Is it possible to have each objective function train only a specific set of layers? #4312

Closed · hadikazemi closed 7 years ago

hadikazemi commented 7 years ago

I have two outputs and two objective functions in my network. I want the first objective function to train only a set of layers (e.g. the first 5 layers) while the second one trains the rest. Is it possible to assign layers to an objective function? Or is the only solution to switch between objective functions, recompiling the model in each iteration and changing the "trainable" property of each layer in between?
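For reference, a minimal sketch of the recompile-and-toggle workaround the question describes, using a hypothetical toy model and losses:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Toy stand-in for the real network: layers[0] plays the role of the
# first block of layers, layers[1] the rest.
model = Sequential([
    Dense(8, activation='relu', input_dim=4),
    Dense(1, activation='sigmoid'),
])
x = np.random.random((32, 4))
y = np.random.randint(0, 2, size=(32, 1))

# Phase 1: only the first block is trainable under the first loss.
model.layers[0].trainable = True
model.layers[1].trainable = False
model.compile(optimizer='sgd', loss='binary_crossentropy')
model.train_on_batch(x, y)

# Phase 2: flip the flags and recompile with the second loss.
model.layers[0].trainable = False
model.layers[1].trainable = True
model.compile(optimizer='sgd', loss='mse')
model.train_on_batch(x, y)
```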

pechersky commented 7 years ago

If I understood you correctly, the following example might help: https://github.com/fchollet/keras/blob/master/examples/variational_autoencoder.py

You'll notice that vae_loss calculates two losses, one on the full output compared to the input (xent_loss) and one on just two intermediate layers (kl_loss). The full loss is their sum.
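Condensed from the linked example, that looks roughly like this (z_mean and z_log_sigma are the encoder outputs defined earlier in that file; the keras.objectives module is from the Keras version of that era):

```python
from keras import backend as K
from keras import objectives

# One loss term on the full reconstruction, one on two intermediate
# layers, summed into a single objective.
def vae_loss(x, x_decoded_mean):
    xent_loss = objectives.binary_crossentropy(x, x_decoded_mean)
    kl_loss = -0.5 * K.mean(
        1 + z_log_sigma - K.square(z_mean) - K.exp(z_log_sigma), axis=-1)
    return xent_loss + kl_loss
```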

hadikazemi commented 7 years ago

Thank you, but what I am looking for is a bit different. Consider a VGG network: I have two losses on the softmax classifier. One of them must train only the fully connected layers, and the other must train only the conv layers. However, if I simply add the two losses together, backpropagation through the sum trains all layers.

pechersky commented 7 years ago

Could you split the model into two models and train them separately, feeding the first model's predict output in as the second model's input?
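A hedged sketch of that split, with hypothetical shapes, targets, and losses:

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

x_train = np.random.random((100, 32))
y_first = np.random.random((100, 16))
y_second = np.random.random((100, 10))

# First model: trained only by the first loss.
first = Sequential([Dense(16, activation='relu', input_dim=32)])
first.compile(optimizer='sgd', loss='mse')
first.fit(x_train, y_first)

# Second model: consumes the first model's predictions, so gradients
# of its loss never reach the first model's weights.
second = Sequential([Dense(10, activation='softmax', input_dim=16)])
second.compile(optimizer='sgd', loss='categorical_crossentropy')
second.fit(first.predict(x_train), y_second)
```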

hadikazemi commented 7 years ago

The two models are supposed to share all their weights and be trained together. Here is an example of how I can do it in TensorFlow: http://stackoverflow.com/questions/34945554/how-to-set-layer-wise-learning-rate-in-tensorflow

I think it is time to move on to TensorFlow.
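The core of that Stack Overflow answer is the var_list argument to minimize; a minimal TF 1.x-style sketch with a hypothetical two-layer graph and losses:

```python
import tensorflow as tf  # TF 1.x graph API, as in the linked answer

# Hypothetical two-layer graph; each optimizer's var_list restricts
# which variables its loss can update.
x = tf.placeholder(tf.float32, [None, 4])
y = tf.placeholder(tf.float32, [None, 1])

w1 = tf.Variable(tf.random_normal([4, 8]))
h = tf.nn.relu(tf.matmul(x, w1))
w2 = tf.Variable(tf.random_normal([8, 1]))
pred = tf.matmul(h, w2)

loss_a = tf.reduce_mean(tf.square(pred - y))  # meant to train only w1
loss_b = tf.reduce_mean(tf.abs(pred - y))     # meant to train only w2

train_a = tf.train.GradientDescentOptimizer(0.01).minimize(loss_a, var_list=[w1])
train_b = tf.train.GradientDescentOptimizer(0.01).minimize(loss_b, var_list=[w2])
```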

pechersky commented 7 years ago

Since Keras wraps TF, you could probably get the same functionality. Given your SO example, you could try writing a custom optimizer that runs get_gradients on only the losses and layers you are interested in.
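A hedged sketch of what that might look like, with a hypothetical toy model and losses; a full custom optimizer would then turn each gradient list into updates for its own weight subset:

```python
from keras import backend as K
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

# Toy stand-in: layers[0] plays the conv block, layers[1] the fc block.
model = Sequential([
    Dense(8, activation='relu', input_dim=4),
    Dense(1, activation='sigmoid'),
])
y_true = K.placeholder(shape=(None, 1))

loss_a = K.mean(K.square(model.output - y_true))
loss_b = K.mean(K.abs(model.output - y_true))

opt = SGD(lr=0.01)
# get_gradients ties each loss to its own subset of weights, so each
# objective only ever produces updates for its assigned layers.
grads_a = opt.get_gradients(loss_a, model.layers[0].trainable_weights)
grads_b = opt.get_gradients(loss_b, model.layers[1].trainable_weights)
```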