Open SaadShakeel414 opened 9 months ago
We use the SGD optimizer with momentum 0.9 and train iresnet100 for 20 epochs with a 1-epoch warm-up. The maximum learning rate is 0.025 and the weight decay is 5e-4. The learning rate is reduced by a factor of 10 at epochs 8, 12, and 16.
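For anyone wiring this schedule up by hand, here is a rough sketch of the learning rate as a function of training progress. The values 0.025, the 1-epoch warm-up, and the milestones 8, 12, 16 come from the settings above. Everything else is an assumption for illustration: the function name `learning_rate`, 0-based epoch counting, a linear per-step warm-up starting from 0, and the decay taking effect at the start of each milestone epoch.

```python
def learning_rate(epoch, step_in_epoch, steps_per_epoch,
                  max_lr=0.025, warmup_epochs=1,
                  milestones=(8, 12, 16), gamma=0.1):
    """Return the learning rate for a given (0-based) epoch and step.

    Assumed behaviour, not confirmed by the authors:
    - linear warm-up from 0 to max_lr over the first `warmup_epochs` epochs
    - lr multiplied by `gamma` at the start of each milestone epoch
    """
    # Fractional number of epochs completed so far.
    progress = epoch + step_in_epoch / steps_per_epoch
    if progress < warmup_epochs:
        return max_lr * progress / warmup_epochs
    # Count how many milestones have already been reached.
    n_decays = sum(1 for m in milestones if epoch >= m)
    return max_lr * gamma ** n_decays


if __name__ == "__main__":
    # Print the learning rate at the start of every epoch (100 steps/epoch assumed).
    for e in range(20):
        print(e, learning_rate(e, 0, 100))
```

The returned value would be assigned to the optimizer's learning rate once per step; the momentum (0.9) and weight decay (5e-4) stay constant throughout.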
The hyperparameters for the face losses, i.e. ArcFace, CosFace, and MagFace, are set to the defaults given in their original papers, e.g. s=64 and m=0.5 for ArcFace.
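To make the roles of s and m concrete, here is a minimal sketch of how ArcFace modifies the target-class logit: the angle between the embedding and its class centre is increased by the margin m, and the resulting cosine is multiplied by the scale s. The function name `arcface_target_logit` is hypothetical, and the sketch leaves out the handling that real implementations add for the case where the angle plus margin exceeds pi.

```python
import math


def arcface_target_logit(cos_theta, s=64.0, m=0.5):
    """Target-class logit s * cos(theta + m) for a given cosine similarity.

    Simplified illustration: no special handling when theta + m > pi.
    """
    # Clamp to the valid domain of acos to guard against rounding error.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    return s * math.cos(theta + m)


if __name__ == "__main__":
    # Compare the plain scaled logit with the margin-penalised one.
    for c in (0.9, 0.5, 0.1):
        print(c, 64.0 * c, arcface_target_logit(c))
```

With the margin applied, the target logit is lower than the plain scaled cosine for the same embedding, so the network has to pull embeddings closer to their class centre to reach the same loss.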
Hi! Could you please share the training parameters for training iresnet100 models on the MS1MV2 dataset (with only 10% of the faces masked)? I tried training with CosFace loss and ArcFace loss under different parameter settings, but I am running into convergence issues.
Thanks in advance.