yjxiong / tsn-pytorch

Temporal Segment Networks (TSN) in PyTorch
BSD 2-Clause "Simplified" License
1.07k stars, 308 forks

Parameters optimize policy #75

Closed: zeal-up closed this issue 5 years ago

zeal-up commented 5 years ago

Thanks for your work! This repo is well organized and has helped me a lot. I noticed that the parameter optimization policy in the model is quite complex (my setup: UCF101, ResNet-101, 4 GPUs, batch size 64):

    return [
        {'params': first_conv_weight, 'lr_mult': 5 if self.modality == 'Flow' else 1, 'decay_mult': 1,
         'name': "first_conv_weight"},
        {'params': first_conv_bias, 'lr_mult': 10 if self.modality == 'Flow' else 2, 'decay_mult': 0,
         'name': "first_conv_bias"},
        {'params': normal_weight, 'lr_mult': 1, 'decay_mult': 1,
         'name': "normal_weight"},
        {'params': normal_bias, 'lr_mult': 2, 'decay_mult': 0,
         'name': "normal_bias"},
        {'params': bn, 'lr_mult': 1, 'decay_mult': 0,
         'name': "BN scale/shift"},
    ]
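To make the effect of these multipliers concrete: `lr_mult` and `decay_mult` are not options that `torch.optim.SGD` interprets on its own, so the training script has to turn them into per-group values. The sketch below assumes the usual convention (effective learning rate = base learning rate × `lr_mult`, effective weight decay = base weight decay × `decay_mult`). The helper names `build_policies` and `effective_hyperparams` and the base values are hypothetical and only serve the illustration; the parameter tensors are left out.

```python
# Illustrative sketch only: these helpers are hypothetical, not the repo's API.

def build_policies(modality):
    """Reproduce the multipliers from the policy list, without the tensors."""
    is_flow = modality == 'Flow'
    return [
        {'name': 'first_conv_weight', 'lr_mult': 5 if is_flow else 1, 'decay_mult': 1},
        {'name': 'first_conv_bias', 'lr_mult': 10 if is_flow else 2, 'decay_mult': 0},
        {'name': 'normal_weight', 'lr_mult': 1, 'decay_mult': 1},
        {'name': 'normal_bias', 'lr_mult': 2, 'decay_mult': 0},
        {'name': 'BN scale/shift', 'lr_mult': 1, 'decay_mult': 0},
    ]


def effective_hyperparams(policies, base_lr, base_weight_decay):
    """Return {group name: (lr, weight_decay)} after applying the multipliers."""
    result = {}
    for group in policies:
        lr = base_lr * group['lr_mult']
        weight_decay = base_weight_decay * group['decay_mult']
        result[group['name']] = (lr, weight_decay)
    return result


# Assumed base values, chosen only for the example.
BASE_LR = 0.001
BASE_WEIGHT_DECAY = 0.0005

for modality in ('RGB', 'Flow'):
    print(modality)
    table = effective_hyperparams(build_policies(modality), BASE_LR, BASE_WEIGHT_DECAY)
    for name, (lr, weight_decay) in table.items():
        print(f"  {name}: lr={lr:g}, weight_decay={weight_decay:g}")
```

Under this convention, biases and BN parameters receive no weight decay, and for the Flow modality the first convolution is trained with a 5x (weights) or 10x (bias) larger learning rate than the rest of the network.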

Is this policy necessary to get the desired results? These tricks don't seem to be mentioned in the original paper. Have you run experiments on this?

The training procedure also seems unstable: in the first iteration of some epochs the loss explodes, possibly because of exploding gradients. Even when I set clip_gradient, the instability still exists.
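For readers unfamiliar with the option: the sketch below shows what gradient clipping by global norm does, assuming clip_gradient refers to this kind of clipping (the behaviour of `torch.nn.utils.clip_grad_norm_`). It uses plain Python lists instead of tensors so that it runs without PyTorch; the function name `clip_by_global_norm` is hypothetical.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so that their combined L2 norm is at most max_norm.

    grads is a list of per-parameter gradients, each a list of floats.
    Returns (clipped_grads, total_norm_before_clipping).
    """
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if total_norm == 0.0 or total_norm <= max_norm:
        # Already within the limit: return an unchanged copy.
        return [list(grad) for grad in grads], total_norm
    scale = max_norm / total_norm
    return [[g * scale for g in grad] for grad in grads], total_norm


# Two "parameters" whose combined gradient norm is 5.
example_grads = [[3.0, 4.0], [0.0]]
clipped, norm_before = clip_by_global_norm(example_grads, max_norm=1.0)
print("norm before clipping:", norm_before)
print("clipped gradients:", clipped)
```

Clipping of this kind only bounds the size of each update. It does not by itself prevent a large loss value on a particular batch, so a spike at the start of an epoch can still show up in the log.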

yjxiong commented 5 years ago

These policies are not necessary in most cases. They are just legacy settings.

zgyangleo commented 5 years ago

> These policies are not necessary in most cases. They are just legacy settings.

Hi, I am fine-tuning only the last fc layer of TSN and want to know how to set up optim.SGD. This is what I tried:

    optimizer = torch.optim.SGD([{'params': model.module.fc.parameters()}],
                                policies,
                                args.lr,
                                momentum=args.momentums,
                                weight_decay=args.weight_decay)

But it fails with: TypeError: __init__() got multiple values for argument 'momentum'. Could you give me some advice?
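A likely cause of this error is positional argument binding. The first positional parameters of `torch.optim.SGD` are `params`, `lr`, `momentum`, in that order. In the call above the list with the fc parameters is bound to `params`, `policies` to `lr`, and `args.lr` to `momentum`, so the later `momentum=` keyword collides with it. The sketch below reproduces the error with a stand-in function, `sgd_like`, that only copies the argument order; it is not PyTorch code, and all values are placeholders.

```python
def sgd_like(params, lr, momentum=0, dampening=0, weight_decay=0, nesterov=False):
    """Stand-in that copies the leading argument order of torch.optim.SGD."""
    return {'params': params, 'lr': lr, 'momentum': momentum,
            'weight_decay': weight_decay}


# Placeholders for model.module.fc.parameters(), policies and the args values.
fc_params = ['fc.weight', 'fc.bias']
policies = ['placeholder policy list']
base_lr, base_momentum, base_weight_decay = 0.001, 0.9, 0.0005

# Broken call: two parameter collections are passed, so base_lr lands in the
# `momentum` slot and the `momentum=` keyword then supplies it a second time.
try:
    sgd_like([{'params': fc_params}], policies, base_lr,
             momentum=base_momentum, weight_decay=base_weight_decay)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print("broken call:", error_message)

# Fixed call: pass exactly one parameter collection, then the learning rate.
fixed = sgd_like([{'params': fc_params}], base_lr,
                 momentum=base_momentum, weight_decay=base_weight_decay)
print("fixed call:", fixed)
```

With the real optimizer, and keeping the names from the post, the corresponding call would be `torch.optim.SGD(model.module.fc.parameters(), args.lr, momentum=args.momentums, weight_decay=args.weight_decay)`, that is, with `policies` dropped. Note that only parameters handed to the optimizer are updated, so passing just the fc parameters already restricts training to that layer.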