keras-team / keras


Is there any way to set different learning rates for different layers of a CNN model in Keras? #11934

Closed gyheart closed 3 years ago

gyheart commented 5 years ago

Is there any way to set different learning rates for different layers of a CNN model in Keras? Thanks.

mha-py commented 5 years ago

There is an Adam modification with per-layer learning rate multipliers: https://erikbrorson.github.io/2018/04/30/Adam-with-learning-rate-multipliers/

gabrieldemarmiesse commented 5 years ago

It would be nice to have a wrapper around existing optimizers to perform this task. We could put this wrapper in keras-contrib. @erikbrorson would you be interested in implementing such a wrapper for keras-contrib?

stante commented 5 years ago

I was curious about this issue. https://github.com/keras-team/keras/issues/7912 also mentions something related.

I've implemented a wrapper which can be found here: https://github.com/stante/keras-contrib/blob/feature-lr-multiplier/keras_contrib/optimizers/lr_multiplier.py

It essentially overrides the optimizer's get_updates to apply per-layer learning rate factors. Attribute accesses are forwarded to the wrapped optimizer so that it plays nicely with learning-rate-altering mechanisms such as LearningRateScheduler.
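Roughly, the wrapping idea looks like this (a simplified sketch, not the exact code from the branch; attribute and method names are illustrative):

from keras.optimizers import Optimizer

class LearningRateMultiplier(Optimizer):
    # Simplified sketch; the full version is in the branch linked above.
    def __init__(self, optimizer_class, lr_multiplier=None, **kwargs):
        # Set these before anything else so __getattr__ can forward safely.
        self._optimizer = optimizer_class(**kwargs)
        self._lr_multipliers = lr_multiplier or {}
        super(LearningRateMultiplier, self).__init__()

    def _multiplier_for(self, param):
        # Keys such as 'dense_1' or 'dense_1/kernel' match weight names by prefix.
        for prefix, mult in self._lr_multipliers.items():
            if param.name.startswith(prefix):
                return mult
        return None

    def get_updates(self, loss, params):
        updates = []
        base_lr = self._optimizer.lr
        for param in params:
            mult = self._multiplier_for(param)
            # Swap in a scaled (symbolic) learning rate while the update
            # ops for this parameter are built, then restore it.
            self._optimizer.lr = base_lr * mult if mult else base_lr
            updates.extend(self._optimizer.get_updates(loss, [param]))
        self._optimizer.lr = base_lr
        return updates

    def __getattr__(self, name):
        # Forward everything else (lr, momentum, ...) to the wrapped
        # optimizer so callbacks like LearningRateScheduler keep working.
        return getattr(self._optimizer, name)

The trade-off is that the update ops are built parameter by parameter, which duplicates some of the wrapped optimizer's bookkeeping.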

It can be used like this:

from keras.optimizers import SGD
# assuming the wrapper is importable from the module in the branch above
from keras_contrib.optimizers.lr_multiplier import LearningRateMultiplier

multipliers = {'dense_1': 0.5, 'dense_2': 0.4}
opt = LearningRateMultiplier(SGD, lr_multiplier=multipliers, lr=0.001, momentum=0.9)

The optimizer is then instantiated with the remaining arguments, in this case lr and momentum. If necessary, different factors can also be given for the kernel and bias:

multipliers = {'dense_1/kernel': 0.5, 'dense_1/bias': 0.1}

If there is interest in this, I can polish it a bit and create a pull request. I am open to changing the class name or other naming-related things if that fits Keras or keras-contrib better (maybe it should be lr_multipliers instead of lr_multiplier). I have also created some basic tests, which can be found in my repository.

gabrieldemarmiesse commented 5 years ago

You can make a pull request to keras-contrib. Thanks for your work.

ngonthier commented 4 years ago

It seems to cause a significant memory overhead. I tried it with a ResNet-50, with every layer assigned a 0.1 multiplier except the last one. Do you have any idea how to fix this?

Damcy commented 2 years ago

tfa.optimizers.MultiOptimizer
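For anyone landing here later: a minimal sketch of how tfa.optimizers.MultiOptimizer assigns per-layer learning rates (the model, layer sizes, and rates below are illustrative):

import tensorflow as tf
import tensorflow_addons as tfa

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(32,)),
    tf.keras.layers.Dense(10),
])

# Pair each optimizer with the layer(s) it should update:
# a smaller lr for the first layer, a larger lr for the head.
optimizers_and_layers = [
    (tf.keras.optimizers.Adam(learning_rate=1e-4), model.layers[0]),
    (tf.keras.optimizers.Adam(learning_rate=1e-2), model.layers[1]),
]
opt = tfa.optimizers.MultiOptimizer(optimizers_and_layers)
model.compile(optimizer=opt, loss='mse')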

Pragmatism0220 commented 1 year ago

Hello, I have a problem. Why is the optimizer self._class here and not self._optimizer? When I use keras.callbacks.EarlyStopping with LearningRateMultiplier, I get TypeError: get_config() missing 1 required positional argument: 'self' after epoch 1.

def get_config(self):
    config = {'optimizer': self._class,
              'lr_multipliers': self._lr_multipliers}
    base_config = self._optimizer.get_config()
    return dict(list(base_config.items()) + list(config.items()))
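A likely explanation: self._class holds the optimizer class itself, so when Keras serializes the wrapper, get_config() ends up being invoked on the bare class rather than an instance, which produces exactly this missing-'self' error. A possible fix, sketched under the assumption that the wrapper keeps the instantiated optimizer in self._optimizer:

def get_config(self):
    # Store a serializable name for the wrapped optimizer class instead
    # of the class object itself; the instance config still comes from
    # self._optimizer. from_config would then resolve the name back to
    # a class, e.g. via keras.optimizers.get.
    config = {'optimizer': self._class.__name__,
              'lr_multipliers': self._lr_multipliers}
    base_config = self._optimizer.get_config()
    return dict(list(base_config.items()) + list(config.items()))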