V-Sense / ACTION-Net

Official PyTorch implementation of ACTION-Net: Multipath Excitation for Action Recognition (CVPR'21)
MIT License

How to set the learning rate? #4

Closed: HuangZuShu closed this issue 3 years ago

HuangZuShu commented 3 years ago

Hello, I would like to know the basis for the per-layer learning-rate settings below. Were they determined experimentally? Did you run a separate experiment for each parameter adjustment?

return [
    {'params': first_conv_weight, 'lr_mult': 1, 'decay_mult': 1, 'name': "first_conv_weight"},
    {'params': first_conv_bias, 'lr_mult': 2, 'decay_mult': 0, 'name': "first_conv_bias"},
    {'params': normal_weight, 'lr_mult': 1, 'decay_mult': 1, 'name': "normal_weight"},
    {'params': normal_bias, 'lr_mult': 2, 'decay_mult': 0, 'name': "normal_bias"},
    {'params': bn, 'lr_mult': 1, 'decay_mult': 0, 'name': "BN scale/shift"},
    {'params': custom_weight, 'lr_mult': 1, 'decay_mult': 1, 'name': "custom_weight"},
    {'params': custom_bn, 'lr_mult': 1, 'decay_mult': 0, 'name': "custom_bn"},
    # for fc
    {'params': lr5_weight, 'lr_mult': 5, 'decay_mult': 1, 'name': "lr5_weight"},
    {'params': lr10_bias, 'lr_mult': 10, 'decay_mult': 0, 'name': "lr10_bias"},
]

villawang commented 3 years ago

Hi there,

The base learning rate is already defined in the training script (the .sh file). The code you refer to sets the learning-rate and weight-decay multipliers for the different layers, relative to the settings in that .sh file.
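
As a rough illustration of this point, here is a minimal sketch of how such multipliers are commonly combined with a base setting. It assumes each group's effective learning rate is `base_lr * lr_mult` and its effective weight decay is `base_weight_decay * decay_mult`. The function name `apply_multipliers` and the base values are hypothetical and are not taken from the repository, whose training code may apply the multipliers differently.

```python
def apply_multipliers(param_groups, base_lr, base_weight_decay):
    """Set each group's effective lr and weight decay from its multipliers.

    Assumption: effective value = base value * multiplier.
    """
    for group in param_groups:
        group['lr'] = base_lr * group.get('lr_mult', 1)
        group['weight_decay'] = base_weight_decay * group.get('decay_mult', 1)
    return param_groups


# Hypothetical base values, standing in for whatever the .sh script passes.
base_lr = 0.01
base_weight_decay = 5e-4

# Empty 'params' lists keep the sketch self-contained; real groups hold tensors.
groups = [
    {'params': [], 'lr_mult': 1, 'decay_mult': 1, 'name': "normal_weight"},
    {'params': [], 'lr_mult': 2, 'decay_mult': 0, 'name': "normal_bias"},
    {'params': [], 'lr_mult': 5, 'decay_mult': 1, 'name': "lr5_weight"},
    {'params': [], 'lr_mult': 10, 'decay_mult': 0, 'name': "lr10_bias"},
]

for g in apply_multipliers(groups, base_lr, base_weight_decay):
    print(g['name'], g['lr'], g['weight_decay'])
```

Under this assumption, a group with `decay_mult` of 0 (the biases and BN parameters) receives no weight decay, and the fc groups train at 5x and 10x the base learning rate. In PyTorch, `optimizer.param_groups` is a list of dictionaries of this shape, so the same loop could be run over it before each epoch.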

HuangZuShu commented 3 years ago

Thanks for your reply!