LHH20000923 opened 2 years ago
I use CASIA-WebFace, the optimizer is SGD with lr=0.1, batch size 256/512, and I didn't use knowledge distillation. The loss curve is shown below. Is it a problem with the dataset?
For the CASIA dataset, you may need to train the model for at least 50 epochs with lr steps at epochs 20, 30, and 40. The FR result reported in the paper is based on MS1Mv2.
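As a rough illustration (not the repo's actual config, which I don't have), the suggested schedule — base lr 0.1, decayed at epochs 20, 30, and 40 over 50 epochs — could be expressed like this; the 10x decay factor is an assumption, as step schedules commonly use gamma=0.1:

```python
def lr_at_epoch(epoch, base_lr=0.1, milestones=(20, 30, 40), gamma=0.1):
    """Return the learning rate in effect at a given epoch
    under a step schedule: multiply by gamma at each milestone."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

# Epochs 0-19 train at 0.1, 20-29 at 0.01, 30-39 at 0.001, 40-49 at 0.0001.
```

In PyTorch the same schedule would typically be handled by `torch.optim.lr_scheduler.MultiStepLR` with `milestones=[20, 30, 40]`.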
Which database do you use for training? What loss function, batch size, etc.?