Closed: cvtower closed this issue 5 years ago.
On my 1080 Ti, the ir_se50 model uses batch size 100 and MobileFaceNet uses 200. You can first find the best batch size, then use find_lr to locate the best learning rate on your machine. The default lr I set is 1e-3.
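For reference, find_lr here follows the usual LR range test idea; a minimal pure-Python sketch (the `train_step` callback is a hypothetical stand-in for running one training batch at the given lr):

```python
import math

# Sketch of an LR range test ("find_lr"): sweep the learning rate
# exponentially from lr_min to lr_max, record the loss at each step,
# and stop once the loss blows up. A good starting lr usually sits
# somewhat below the point of lowest recorded loss.
def find_lr(train_step, lr_min=1e-6, lr_max=1.0, num_steps=100):
    factor = (lr_max / lr_min) ** (1.0 / (num_steps - 1))
    lrs, losses = [], []
    lr = lr_min
    best = float("inf")
    for _ in range(num_steps):
        loss = train_step(lr)   # hypothetical: one batch at this lr
        lrs.append(lr)
        losses.append(loss)
        best = min(best, loss)
        if loss > 4 * best:     # loss diverged: stop the sweep
            break
        lr *= factor
    return lrs, losses
```

You would then plot `losses` against `lrs` (log scale) and pick an lr a bit below the minimum of the curve.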
conf.milestones was the first approach I tried; in the final version I just went with PyTorch's default lr scheduler. You can find the following line in learner.py: `self.scheduler = optim.lr_scheduler.ReduceLROnPlateau(self.optimizer, patience=40, verbose=True)`. This means that when the val-dataset score shows no further improvement for 40 intervals, the lr is decayed to 1/10.
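The plateau rule that line configures can be sketched in plain Python (a simplified, hypothetical re-implementation tracking a score to maximize; the real torch scheduler has more knobs, e.g. `mode`, `factor`, `threshold`):

```python
# Sketch of the ReduceLROnPlateau rule: if the monitored score has not
# improved for `patience` consecutive checks, multiply lr by `factor`
# (1/10 here, matching the decay described above).
class PlateauDecay:
    def __init__(self, lr=1e-3, patience=40, factor=0.1):
        self.lr, self.patience, self.factor = lr, patience, factor
        self.best = float("-inf")
        self.bad_steps = 0

    def step(self, score):
        if score > self.best:          # improvement: reset the counter
            self.best = score
            self.bad_steps = 0
        else:
            self.bad_steps += 1
            if self.bad_steps > self.patience:
                self.lr *= self.factor  # decay lr to 1/10
                self.bad_steps = 0
        return self.lr
```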
Actually, the model is quite easy to train; if you give it a long enough training run, the performance should be good enough.
Hi @TreB1eN ,
Nice work!
I directly used MobileFaceNet as the baseline of my work (a novel efficient CNN architecture) and got a better result. I will try to reproduce the original MobileFaceNet from this repo.
Thanks very much for your reply!
Hi @TreB1eN,
Just a feedback here:
I have changed the training method in the latest commit; maybe lr decay with some milestone setup is still better (I gave up optim.lr_scheduler.ReduceLROnPlateau). You can have a look. If you achieve better performance using this repo, please share your training parameters here.
@TreB1eN ,
Got that. Thanks very much for your contribution!
@TreB1eN ,
My result for MobileFaceNet: agedb_30: 95.67, cfp_fp: 90.50, lfw: 99.45, batch_size: 256
Thanks very much for your help!
@cvtower , could you please share your training parameters?
@puppet101 ,
Training set: faces_emore. Modifications: training epochs = 8, conf.batch_size = 256, conf.lr = 1e-1, conf.milestones = [4,6,7]. Those are the params I used. Judging from the available repos and issues, I guess the performance might be slightly better if a larger number of training epochs were used on faces_emore.
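Those milestones amount to dividing the lr by 10 at epochs 4, 6, and 7; a minimal sketch of that schedule (same semantics, I believe, as `torch.optim.lr_scheduler.MultiStepLR` with `gamma=0.1`):

```python
# Hypothetical sketch of the milestone schedule above: start at
# base_lr and multiply by gamma once for each milestone the current
# epoch has reached or passed.
def milestone_lr(epoch, base_lr=1e-1, milestones=(4, 6, 7), gamma=0.1):
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed
```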
@cvtower , Thanks a lot!
UPDATE acc for MobileFaceNet: epoch 16, milestones = [8,12,14]
agedb_30_accuracy: 95.90, cfp_fp_accuracy: 92.10, lfw_accuracy: 99.43, batch_size: 256
@cvtower Do you train from scratch or use the pretrain model?
Hello, From scratch.
Thanks for your reply! I trained the network with the parameters you mentioned, but the accuracy on LFW always stays at 0.5. It's so strange.
Hello,
Please try Beyond Compare to diff your code against the repo, and make sure you are using TensorBoard correctly. According to my local log, MobileFaceNet reaches 97.8%+ accuracy on LFW after about 20k steps.
Thanks! I have checked my code and found the reason. Because I have to convert the model to ncnn, and ncnn doesn't support l2_norm, I separated the l2_norm from the network. But I forgot to apply the l2_norm during inference, so the test results looked strange.
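For anyone hitting the same issue: if the l2_norm layer is stripped from the exported network, the raw embeddings still need to be L2-normalized at inference time before computing cosine similarity. A minimal sketch:

```python
import math

# Sketch of the missing inference step: L2-normalize a raw embedding
# vector so that comparing two embeddings by dot product is equivalent
# to cosine similarity.
def l2_normalize(vec, eps=1e-12):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]
```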
@cvtower By the way, do you get a better result using other training parameters?
No, I just use this repo as a baseline to verify my designed network on the face recognition task, and I guess the paper will be published later.
OK, thank you!
@cvtower Hi cvtower, thanks for your shared params!
UPDATE acc for mobilefacenet: epoch 16 milestones = [8,12,14]
agedb_30_accuracy:95.90 cfp_fp_accuracy:92.10 lfw_accuracy:99.43 batch_size:256
I used the same params as yours but got 99.31 on LFW and 90.5 on CFP-FP. Could you please share your trained model with me? Thanks!
Hi, I also used the same params, and I only got 92.xx% with two GPUs. How many GPUs did you use? Would you mind sharing your trained model? Thanks a lot!
I have uploaded my MobileFaceNet pretrained model in this repo: https://github.com/cvtower/seesawfacenet_pytorch
@cvtower Thanks a lot, but why is the accuracy of model_2019-05-19-16-47_accuracy_0.9158571428571429_step_712992_final.pth 0.5 when I run the evaluation code?
Hi @TreB1eN ,
I found this line in config.py: `conf.milestones = [3,4,5] # mobilefacenet`, but the milestones are not used during training, which I guess means the learning rate never decays. In learner.py, line 225, it seems self.schedule_lr() should be called during training, as the MobileFaceNet paper suggests.
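A sketch of the fix being suggested, with schedule_lr's lr/10 decay triggered at each configured milestone (a hypothetical simplification of the learner's training loop, tracking only the lr):

```python
# Hypothetical sketch: divide the lr by 10 whenever the current epoch
# hits one of conf.milestones, as conf.milestones = [3,4,5] implies.
def lr_over_epochs(epochs, milestones, base_lr):
    lr = base_lr
    history = []
    for epoch in range(epochs):
        if epoch in milestones:  # schedule_lr() would decay here
            lr /= 10
        history.append(lr)
    return history
```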
BTW, would you please share the training params (batch size, initial learning rate) for MobileFaceNet so your accuracy can be reproduced?
Thanks for your help!