Hi,
After each epoch, I print the kernel variable of the ArcFace module, but its weights are always the same, even though the gradients are not zero.
Shouldn't the model head be set to train() during training?
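For anyone debugging the same symptom, here is a minimal sketch of one way to check whether a head's weights actually move after an optimizer step. It is not the repository's code: `backbone`, `head`, and `changed_after_step` are hypothetical stand-ins, and a plain `nn.Linear` plays the role of the ArcFace kernel. It illustrates one possible cause (the head's parameters never being passed to the optimizer), which may or may not be what is happening here. Note that train() only switches the behaviour of layers such as dropout and batch norm; on its own it does not decide whether a parameter gets updated.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins, NOT the repository's models.
backbone = nn.Linear(8, 4)
head = nn.Linear(4, 3, bias=False)  # plays the role of the ArcFace kernel


def changed_after_step(optimizer):
    """Run one training step; return (head grad norm, whether head.weight changed)."""
    backbone.train()
    head.train()
    # clone() is needed: detach() alone shares memory with the live parameter.
    before = head.weight.detach().clone()

    x = torch.randn(16, 8)
    y = torch.randint(0, 3, (16,))

    optimizer.zero_grad()
    head.zero_grad()  # the optimizer may not own the head's parameters
    loss = nn.functional.cross_entropy(head(backbone(x)), y)
    loss.backward()

    grad_norm = head.weight.grad.norm().item()
    optimizer.step()
    changed = not torch.equal(before, head.weight.detach())
    return grad_norm, changed


# Case 1: the head's parameters were never given to the optimizer.
opt_missing = torch.optim.SGD(backbone.parameters(), lr=0.1)
grad_norm_missing, changed_missing = changed_after_step(opt_missing)

# Case 2: both backbone and head parameters are registered.
opt_full = torch.optim.SGD(
    list(backbone.parameters()) + list(head.parameters()), lr=0.1
)
grad_norm_full, changed_full = changed_after_step(opt_full)

print("head missing from optimizer:", grad_norm_missing, changed_missing)
print("head registered in optimizer:", grad_norm_full, changed_full)
```

In the first case the head has a non-zero gradient but its weights stay identical, which matches the symptom described above. Comparing the parameters in `optimizer.param_groups` against `head.parameters()` in the real training script would be a quick way to rule this cause in or out.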
I have run into the same problem. Did you solve it?
My training loss only decreases in the first step; after that, it always stays at the same value, like this:
tensor(34.6456, device='cuda:0', grad_fn=)
tensor(3.6976, device='cuda:0', grad_fn=)
tensor(3.6976, device='cuda:0', grad_fn=)
tensor(3.6976, device='cuda:0', grad_fn=)
tensor(3.6976, device='cuda:0', grad_fn=)
tensor(3.6976, device='cuda:0', grad_fn=)
tensor(3.6976, device='cuda:0', grad_fn=)
tensor(3.6976, device='cuda:0', grad_fn=)
tensor(3.6976, device='cuda:0', grad_fn=)
tensor(3.6976, device='cuda:0', grad_fn=)
If you have any suggestions, please tell me. I'd be grateful.