hujie-frank / SENet

Squeeze-and-Excitation Networks
Apache License 2.0

What is your weight_decay parameter ?^_^ #37

Closed hungsing92 closed 6 years ago

hungsing92 commented 6 years ago

Hi Hujie:

First of all, thanks for your excellent work! I can't find the weight_decay parameter in your paper. Would you please tell me what value you used? ^_^

Best regards, hungsing

longmao-yiran commented 6 years ago

Hello: As the author says in the paper, "The initial learning rate is set to 0.6 and decreased by a factor of 10 every 30 epochs. All models are trained for 100 epochs from scratch, using the weight initialisation strategy described in [8]". Having read the paper cited as [8], "Delving deep into rectifiers: Surpassing human-level performance on ImageNet" (https://arxiv.org/pdf/1502.01852.pdf), I think the author uses a weight_decay of 0.0005.
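For reference, the schedule quoted above can be sketched in a few lines of Python. This is only an illustration of the quoted sentence, not code from the repository: the function name `step_lr`, the 0-indexed epoch convention, and the assumption that the drop takes effect exactly at epochs 30, 60 and 90 are all hypothetical.

```python
def step_lr(epoch, base_lr=0.6, gamma=0.1, step_size=30):
    """Learning rate at a 0-indexed epoch under a step schedule.

    Assumes the rate is multiplied by `gamma` once every `step_size`
    epochs, as in "decreased by a factor of 10 every 30 epochs".
    """
    return base_lr * gamma ** (epoch // step_size)


if __name__ == "__main__":
    # Print the rate at the start and around each assumed drop point.
    for epoch in (0, 29, 30, 60, 90, 99):
        print(epoch, step_lr(epoch))
```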

hujie-frank commented 6 years ago

We use the same weight_decay as the corresponding non-SE counterparts (e.g. 0.0001 for SE-ResNet, 0.0002 for SE-BN-Inception).
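To make the values above concrete, here is a minimal Python sketch of how a weight_decay setting typically enters an SGD update as an L2 penalty. It is not the authors' training code: the names `WEIGHT_DECAY` and `sgd_step` are hypothetical, momentum is omitted for brevity, and only the two values stated in the reply are included.

```python
# Values taken from the reply above; the dictionary itself is illustrative.
WEIGHT_DECAY = {
    "SE-ResNet": 0.0001,
    "SE-BN-Inception": 0.0002,
}


def sgd_step(w, grad, lr, weight_decay):
    """One plain SGD step on a scalar weight with L2 weight decay.

    The decay term `weight_decay * w` is added to the gradient, so the
    weight shrinks slightly toward zero on every step.
    """
    return w - lr * (grad + weight_decay * w)


if __name__ == "__main__":
    # With a zero data gradient, only the decay term moves the weight.
    w = 1.0
    w = sgd_step(w, grad=0.0, lr=0.6, weight_decay=WEIGHT_DECAY["SE-ResNet"])
    print(w)
```

With a zero data gradient, each step scales the weight by `1 - lr * weight_decay`, which is why the decay value is usually chosen together with the learning rate.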