DanJun6737 / TransFace

[ICCV 2023] TransFace: Calibrating Transformer Training for Face Recognition from a Data-Centric Perspective

The Train log #10

Closed 1162141320 closed 3 months ago

1162141320 commented 4 months ago

Can you provide your training log? The loss in the early stages of training is much higher than for a normal ViT.

DanJun6737 commented 3 months ago

Hi @1162141320 ~ Sorry, some training logs have been lost. Both the DPAP strategy and the EHSM strategy can cause the model's loss to be higher in the early stages of training than that of a normal ViT, but it decreases quickly as training progresses. Additionally, our configuration files assume 8 V100 GPUs, so if a different number of GPUs is used, the corresponding configuration parameters need to be adjusted accordingly.
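For illustration, here is a minimal sketch of how per-GPU settings could be rescaled when training on a different number of GPUs, using linear learning-rate scaling with the global batch size. The field names (`per_gpu_batch_size`, `base_lr`) and the reference batch size are assumptions for the example, not values taken from the TransFace configuration files.

```python
# Hypothetical scaling helper: keep the effective training schedule roughly
# comparable to the reference 8 x V100 setup by scaling the learning rate
# linearly with the global batch size. Numbers below are illustrative only.

REFERENCE_GPUS = 8            # the repo's configs assume 8 V100 GPUs
PER_GPU_BATCH_SIZE = 128      # assumed per-GPU batch size for this example


def scale_config(num_gpus: int, base_lr: float = 0.001) -> dict:
    """Return adjusted global batch size and learning rate for `num_gpus`."""
    global_batch = num_gpus * PER_GPU_BATCH_SIZE
    reference_batch = REFERENCE_GPUS * PER_GPU_BATCH_SIZE
    scaled_lr = base_lr * global_batch / reference_batch
    return {
        "num_gpus": num_gpus,
        "global_batch_size": global_batch,
        "lr": scaled_lr,
    }


if __name__ == "__main__":
    # Training on 4 GPUs halves the global batch size, so the learning
    # rate is halved as well under this linear-scaling assumption.
    print(scale_config(4))
```

Depending on your setup, you may also need to adjust warmup length or gradient accumulation; the linear rule above is only one common heuristic, not the authors' prescribed procedure.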