MobileNetV3 Large Training All From Scratch

leondgarse / Keras_insightface

Insightface Keras implementation

MIT License

MobileNetV3 Large Training All From Scratch #109

Closed by whysetiawan 1 year ago

whysetiawan commented 1 year ago

Hi, thanks for providing the "All from scratch" tutorial. I have been following it step by step with the CASIA-WebFace dataset, but the training accuracy is increasing very slowly.

Here is my accuracy result

At first I was using LR 0.0001, then changed it to 0.001 at epoch 25 to try to improve accuracy. It didn't help. I wonder if you have any advice.

whysetiawan commented 1 year ago

My source code is here, version 16 (before I made changes).

leondgarse commented 1 year ago

Just took a quick scan, great job taking it to Kaggle!

You may try a smaller scale=16.0 for arcface as a start; a larger one may be hard to converge.
For SGD, you may start with a larger lr like 0.1.
Another point is that, in my very early tests, weight_decay was crucial for arcface loss. You may add l2_regularizer to the model, or use optimizers like tfa.optimizers.SGDW or tfa.optimizers.AdamW.
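
As a rough illustration of the scale and weight decay points above (a standalone sketch, not code from this repo), the snippet below computes the ArcFace cross-entropy for one sample in plain Python and compares the starting loss at two scales. The function names, the 1000-class setup, the margin of 0.5, and the weight_decay value of 5e-4 are all assumptions chosen for illustration. One plausible reading of the "hard to converge" remark is that a larger scale multiplies the margin penalty on the target logit, so an untrained model starts from a much higher loss.

```python
import math

def arcface_loss(cosines, target, scale=16.0, margin=0.5):
    """Cross-entropy over ArcFace logits for a single sample.

    cosines: cos(theta_j) between the embedding and each class center.
    Simplified: ignores the theta + margin > pi correction used in practice.
    """
    logits = []
    for j, c in enumerate(cosines):
        c = max(-1.0, min(1.0, c))  # guard acos against rounding error
        if j == target:
            c = math.cos(math.acos(c) + margin)  # additive angular margin
        logits.append(scale * c)
    # Numerically stable log-sum-exp over all class logits.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]

def sgd_step(weights, grads, lr=0.1, weight_decay=5e-4):
    """One plain SGD step with L2-style decay: w <- w - lr * (g + wd * w)."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]

# Untrained model: every cosine is roughly 0 (hypothetical 1000 classes).
cosines = [0.0] * 1000
loss_s16 = arcface_loss(cosines, target=0, scale=16.0)
loss_s64 = arcface_loss(cosines, target=0, scale=64.0)
print(f"initial loss, scale=16: {loss_s16:.2f}")
print(f"initial loss, scale=64: {loss_s64:.2f}")
```

The sgd_step helper folds the decay into the gradient, which is the L2-regularizer form. Decoupled optimizers such as SGDW apply the decay to the weights separately, so the two only roughly coincide for plain SGD without momentum.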

I may try some tests myself if I get some time, but not sure...

whysetiawan commented 1 year ago

Thanks for your response. I'll make some changes and will be back if I face another problem. Just a question: how many epochs does it usually take to train from scratch? I read Training Scripts, and your results are always high in the first epochs.

leondgarse commented 1 year ago

Ya, for the MS1MV3 dataset, it should give some reasonable result after the first epoch. Not sure for the CASIA dataset, as it's much smaller, but it shouldn't take more than 3 epochs.