wujiyang / Face_Pytorch

face recognition algorithms in pytorch framework, including arcface, cosface, sphereface and so on
Apache License 2.0
808 stars, 156 forks

Learning rate 0.00001? Isn't that a bit small? #25

Open ZCCDL opened 5 years ago

wujiyang commented 5 years ago

Start at 0.1; decaying it to 0.01, 0.001, and then 0.0001 is generally enough.

DietDietDiet commented 5 years ago

@wujiyang

Iters: 024700/[13], loss: 19.7619, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.010000000000000002
Iters: 024800/[13], loss: 19.6427, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.010000000000000002
Iters: 024900/[13], loss: 20.2305, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.010000000000000002
Train Epoch: 14/22 ...
Iters: 025000/[14], loss: 19.5610, train_accuracy: 0.0000, time: 0.06 s/iter, learning rate: 0.00010000000000000003
Iters: 025100/[14], loss: 19.1001, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 025200/[14], loss: 19.3134, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 025300/[14], loss: 18.5973, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 025400/[14], loss: 18.6333, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 025500/[14], loss: 19.1020, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 025600/[14], loss: 19.2770, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 025700/[14], loss: 18.8121, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 025800/[14], loss: 19.1786, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 025900/[14], loss: 18.9957, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 026000/[14], loss: 19.3130, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 026100/[14], loss: 18.5452, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 026200/[14], loss: 19.3583, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 026300/[14], loss: 18.9004, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 026400/[14], loss: 19.1668, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 026500/[14], loss: 19.0916, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 026600/[14], loss: 19.5528, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 026700/[14], loss: 20.0253, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Iters: 026800/[14], loss: 19.5178, train_accuracy: 0.0000, time: 0.17 s/iter, learning rate: 0.00010000000000000003
Train Epoch: 15/22 ...
Iters: 026900/[15], loss: 19.6979, train_accuracy: 0.0000, time: 0.02 s/iter, learning rate: 0.0010000000000000002

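Looking at the logged learning rate, it does not seem to decrease the way MultiStepLR is expected to. At the milestone the learning rate appears to drop to 0.0001, and one epoch later it reads 0.001. Could the author explain? Thanks!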
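For reference, the schedule that MultiStepLR is documented to follow can be written in closed form: the base rate multiplied by gamma once for every milestone already reached. The sketch below is plain Python, not the repository's code; the base rate of 0.1, the milestones `[6, 14]`, and gamma of 0.1 are assumed values chosen only to resemble the log, since the thread does not state the actual settings.

```python
from bisect import bisect_right


def multistep_lr(base_lr, milestones, gamma, epoch):
    """Closed-form MultiStepLR value for a given epoch.

    The rate is multiplied by `gamma` once for each milestone
    that is less than or equal to `epoch`.
    """
    passed = bisect_right(sorted(milestones), epoch)
    return base_lr * gamma ** passed


# Assumed, illustrative settings (not taken from the repository).
BASE_LR, MILESTONES, GAMMA = 0.1, [6, 14], 0.1

for epoch in (13, 14, 15):
    lr = multistep_lr(BASE_LR, MILESTONES, GAMMA, epoch)
    print(f"epoch {epoch}: expected lr ~ {lr:.4g}")
```

Under these assumptions the rate should go from about 0.01 straight to about 0.001 at the milestone and stay there, so a logged value of 0.0001 for a single epoch points at how the rate is being read or stepped rather than at the schedule itself.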

xinyikb commented 4 years ago

exp_lr_scheduler.step()

Your exp_lr_scheduler.step() call is in the wrong place!
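The thread does not show where the call currently sits, so the following is only a sketch of the ordering the PyTorch documentation recommends since version 1.1: `optimizer.step()` for every batch, then `scheduler.step()` once at the end of each epoch. The tiny linear model, the random batches, and the milestones `[2, 4]` are placeholders for illustration; only the names `optimizer_ft` and `exp_lr_scheduler` come from the discussion.

```python
import torch

torch.manual_seed(0)

# Placeholder model and settings, assumed for illustration only.
model = torch.nn.Linear(4, 2)
optimizer_ft = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
exp_lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer_ft, milestones=[2, 4], gamma=0.1
)

lr_per_epoch = []
for epoch in range(6):
    # Record the rate that is actually used during this epoch.
    lr_per_epoch.append(optimizer_ft.param_groups[0]["lr"])

    for _ in range(3):  # stand-in for iterating over a DataLoader
        inputs = torch.randn(8, 4)
        loss = model(inputs).pow(2).mean()
        optimizer_ft.zero_grad()
        loss.backward()
        optimizer_ft.step()  # parameter update, once per batch

    exp_lr_scheduler.step()  # schedule update, once per epoch, after training

print(lr_per_epoch)
```

With this ordering the rate stays at 0.1 for epochs 0 and 1, drops to roughly 0.01 for epochs 2 and 3, and to roughly 0.001 from epoch 4 onward.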

willard-yuan commented 4 years ago

Replacing exp_lr_scheduler.get_lr()[0] with optimizer_ft.param_groups[0]['lr'] at train.py#L164 will fix the bug.
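A minimal sketch of that change: the value to log is read from the optimizer's parameter groups, which always hold the rate the optimizer will actually apply. The helper name `current_lr`, the single dummy parameter, and the milestones `[1, 2]` are hypothetical. In PyTorch 1.4 and later, `scheduler.get_last_lr()` should give the same number, whereas calling `get_lr()` outside of `step()` is discouraged there and can report a value with the decay applied twice at a milestone, which would match the one-epoch 0.0001 reading in the log.

```python
import torch


def current_lr(optimizer):
    """Return the learning rate of the first parameter group."""
    return optimizer.param_groups[0]["lr"]


# Dummy parameter and assumed schedule, for illustration only.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer_ft = torch.optim.SGD(params, lr=0.1)
exp_lr_scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer_ft, milestones=[1, 2], gamma=0.1
)

logged = []
for epoch in range(4):
    logged.append(current_lr(optimizer_ft))  # value to print in the training log
    optimizer_ft.step()       # no gradients here, so parameters are unchanged
    exp_lr_scheduler.step()

print(logged)
```

Reading from `param_groups` also works unchanged across PyTorch versions, which is a reasonable argument for using it in logging code.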

ypw-lbj commented 4 years ago

Did you use a pretrained model?