HuangYG123 / CurricularFace

CurricularFace(CVPR2020)
MIT License

batch_norm parameters issue #15

Open konioyxgq opened 4 years ago

konioyxgq commented 4 years ago

On line 104 of train.py, you write "separate batch_norm parameters from others; do not do weight decay for batch_norm parameters to improve the generalizability": https://github.com/HuangYG123/CurricularFace/blob/8b2f47318117995aa05490c05b455b113489917e/train.py#L104 Is this a conclusion from your own experiments, or does it come from a particular paper? Can you explain the reasoning behind it?
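For context, the pattern being asked about is putting batch-norm parameters into a separate optimizer parameter group with `weight_decay=0`. The sketch below is a hypothetical illustration of that idea in plain PyTorch (the helper name `separate_bn_params` and the toy model are mine, not from train.py):

```python
import torch
import torch.nn as nn

def separate_bn_params(model):
    # Hypothetical helper: split parameters into batch-norm
    # parameters and everything else, so weight decay can be
    # disabled for the batch-norm group only.
    bn_params, other_params = [], []
    for module in model.modules():
        params = [p for p in module.parameters(recurse=False)]
        if isinstance(module, nn.modules.batchnorm._BatchNorm):
            bn_params.extend(params)
        else:
            other_params.extend(params)
    return bn_params, other_params

# Toy model standing in for the face-recognition backbone.
model = nn.Sequential(nn.Linear(8, 8), nn.BatchNorm1d(8), nn.ReLU())
bn_params, other_params = separate_bn_params(model)

# Two parameter groups: weight decay applies only to non-BN weights.
optimizer = torch.optim.SGD(
    [
        {"params": other_params, "weight_decay": 5e-4},
        {"params": bn_params, "weight_decay": 0.0},  # no decay for BN scale/shift
    ],
    lr=0.1,
    momentum=0.9,
)
```

The usual rationale offered for this setting is that the batch-norm scale and shift parameters do not contribute to model capacity the way weight matrices do, so shrinking them toward zero with weight decay mostly hurts normalization statistics rather than regularizing the model.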

HuangYG123 commented 4 years ago

We follow the setting used in https://github.com/ZhaoJ9014/face.evoLVe.PyTorch.

konioyxgq commented 4 years ago

Okay, thank you. What do you think the reason for this is?