zeakey opened this issue 6 years ago
It is exactly the uncertain factor you mentioned. There may be some problems in your downloaded cropped dataset, or there may be mismatches between the cropped dataset and the original LFW dataset. You need to follow exactly the same pipeline as ours.
Besides that, you might also want to retrain the network multiple times to see whether it is due to bad luck.
As for the 99.27% for the pretrained model, I have explained it in #93. We can successfully obtain 99.3%.
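For reference, here is a minimal Python sketch of the kind of 5-point similarity alignment that pipeline relies on (the repo's preprocessing is done in MATLAB; the 112×96 template coordinates and the OpenCV calls below are an illustrative approximation, not the exact code, so verify them against the repo's preprocessing scripts):

```python
import cv2
import numpy as np

# Assumed 112x96 five-point template (left eye, right eye, nose, mouth corners);
# verify these coordinates against the repo's MATLAB alignment code before use.
TEMPLATE_112x96 = np.array([
    [30.2946, 51.6963],
    [65.5318, 51.5014],
    [48.0252, 71.7366],
    [33.5493, 92.3655],
    [62.7299, 92.2041],
], dtype=np.float32)

def align_face(img, landmarks_5pt):
    """Warp a face image so its 5 detected landmarks match the template."""
    src = np.asarray(landmarks_5pt, dtype=np.float32)
    # Similarity transform (rotation + uniform scale + translation),
    # analogous to MATLAB's cp2tform.
    M, _ = cv2.estimateAffinePartial2D(src, TEMPLATE_112x96)
    return cv2.warpAffine(img, M, (96, 112))
```

If the detection/alignment step differs even slightly from ours, the cropped faces will not match and the final accuracy can drop.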
@wy1iu Thanks for your reply. So how many GPUs do you use in your training? This is related to the effective batch-size during training.
Following your guidance (`./code/sphereface_train.sh 0,1`) and the default hyper-parameters, I use two GPUs with a batch size of 256 each, so the effective batch size is 256×2.
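Just to make the arithmetic explicit (this assumes Caffe-style data parallelism, where the `batch_size` in the train prototxt is consumed per GPU):

```python
# Effective batch size under Caffe-style data parallelism: the batch_size in the
# train prototxt is per GPU, so two GPUs double it.
per_gpu_batch_size = 256      # batch size stated above for each GPU
num_gpus = 2                  # ./code/sphereface_train.sh 0,1
effective_batch_size = per_gpu_batch_size * num_gpus
print(effective_batch_size)   # 512
```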
I've posted my system environment in #93, which may help us figure out the performance gap.
The detailed settings are available in the training log we released. The provided models are trained using exactly the same settings as in the repository. As for other unpredictable issues, such as the versions of Caffe, CUDA, or cuDNN, I am not sure how they affect the training.
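If it helps to compare environments, here is a small Python sketch for dumping the toolchain versions mentioned above so the two setups can be diffed (the path to the `caffe` binary is an assumption; adjust it to your build):

```python
import subprocess

# Record the toolchain versions discussed above so two environments can be diffed.
commands = {
    "nvcc":  ["nvcc", "--version"],
    "caffe": ["./tools/caffe", "--version"],   # assumed path to the caffe binary
    "gcc":   ["gcc", "--version"],
}
for name, cmd in commands.items():
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(f"== {name} ==\n{result.stdout.strip() or result.stderr.strip()}\n")
    except FileNotFoundError:
        print(f"== {name} == not found\n")
```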
I retrained the SpherefaceNet-20 from scratch, but the performance on LFW cannot reach 99.30%. The evaluation log is as below:
My training environment is:
All other settings are kept the same as your default prototxt configurations; the only thing that may be unclear is the number of GPUs.
There are two suspicious factors that may cause the failure:
Here is my training log: http://data.kaiz.xyz/log/retrain_sphereface_June2-2018.log
Additionally, with the released pretrained model `sphereface_model.caffemodel`, I only obtain an average accuracy of 99.27%. This may be a minor problem, and it has been mentioned in https://github.com/wy1iu/sphereface/issues/93.
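For context, the reported number is the mean over the standard 10-fold LFW verification protocol; a tiny sketch of that final averaging step is below (the per-fold values are illustrative placeholders, not my actual log values):

```python
import numpy as np

# Illustrative per-fold accuracies (placeholders, NOT real results); the reported
# LFW figure is the mean over the standard 10 folds.
fold_accuracies = np.array([0.9930, 0.9927, 0.9923, 0.9933, 0.9927,
                            0.9930, 0.9923, 0.9927, 0.9930, 0.9920])
print(f"mean = {fold_accuracies.mean() * 100:.2f}%  "
      f"std = {fold_accuracies.std() * 100:.2f}%")   # mean = 99.27% on these placeholders
```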