How to set the learning rate?

HuangZuShu commented 3 years ago

Hello, I would like to know the basis for your choice of the per-layer learning rate multipliers below. Were they determined experimentally, with a separate experiment run for each parameter adjustment?

return [
    {'params': first_conv_weight, 'lr_mult': 1, 'decay_mult': 1, 'name': "first_conv_weight"},
    {'params': first_conv_bias, 'lr_mult': 2, 'decay_mult': 0, 'name': "first_conv_bias"},
    {'params': normal_weight, 'lr_mult': 1, 'decay_mult': 1, 'name': "normal_weight"},
    {'params': normal_bias, 'lr_mult': 2, 'decay_mult': 0, 'name': "normal_bias"},
    {'params': bn, 'lr_mult': 1, 'decay_mult': 0, 'name': "BN scale/shift"},
    {'params': custom_weight, 'lr_mult': 1, 'decay_mult': 1, 'name': "custom_weight"},
    {'params': custom_bn, 'lr_mult': 1, 'decay_mult': 0, 'name': "custom_bn"},
    # for fc
    {'params': lr5_weight, 'lr_mult': 5, 'decay_mult': 1, 'name': "lr5_weight"},
    {'params': lr10_bias, 'lr_mult': 10, 'decay_mult': 0, 'name': "lr10_bias"},
]

villawang commented 3 years ago

Hi there,

The base learning rate is already defined in the training script (the .sh file). The code you refer to sets the learning rate and weight decay for the different layers, based on the settings in that .sh file.
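As a rough illustration of the idea above, here is a minimal sketch of how such multipliers are commonly combined with the base settings. It assumes (this thread does not confirm it) that each group's effective learning rate is the base learning rate times `lr_mult`, and its effective weight decay is the base weight decay times `decay_mult`. The function name `effective_hyperparams` and the base values are hypothetical; the real base values come from the .sh training script.

```python
def effective_hyperparams(policies, base_lr, base_weight_decay):
    """Return {group name: (lr, weight_decay)} after applying the multipliers.

    Assumption: multipliers scale the base values linearly.
    """
    return {
        group['name']: (
            base_lr * group['lr_mult'],
            base_weight_decay * group['decay_mult'],
        )
        for group in policies
    }


# Hypothetical base values, chosen only for illustration.
BASE_LR = 0.01
BASE_WEIGHT_DECAY = 5e-4

# A subset of the groups from the question, without the 'params' entries.
policies = [
    {'name': "normal_weight", 'lr_mult': 1, 'decay_mult': 1},
    {'name': "normal_bias", 'lr_mult': 2, 'decay_mult': 0},
    {'name': "lr5_weight", 'lr_mult': 5, 'decay_mult': 1},
    {'name': "lr10_bias", 'lr_mult': 10, 'decay_mult': 0},
]

rates = effective_hyperparams(policies, BASE_LR, BASE_WEIGHT_DECAY)
for name, (lr, weight_decay) in rates.items():
    print(f"{name}: lr={lr:g}, weight_decay={weight_decay:g}")
```

Under this assumption, changing the base learning rate in the .sh file rescales every group at once, while the multipliers only fix the ratios between groups. A `decay_mult` of 0 switches weight decay off for that group, which here covers the biases and the BN parameters.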

HuangZuShu commented 3 years ago

Thanks for your reply!

V-Sense / ACTION-Net

How to set the learning rate? #4