jaideep11061982 opened 5 years ago
ArcFace is implemented in `margin/ArcMarginProduct.py`.
Thanks. What is the thought process behind passing the output of the margin layer to CrossEntropy(), i.e. `output = margin(raw_logits, label)` followed by `total_loss = criterion(output, label)`?
Well, I think you haven't fully understood the principle of margin-based algorithms (SphereFace, CosFace, ArcFace). They all use cross-entropy as the loss function; the difference between them lies in the softmax operation: each algorithm applies a different margin to the final classification logits.
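To make that concrete, here is a minimal sketch of such a margin head in PyTorch (the class and parameter names are assumptions, not the repository's actual code): it computes cos(θ) between L2-normalised embeddings and learnable class-center weights, adds the angular margin `m` to the target-class angle only, scales everything by `s`, and the result is then fed to `CrossEntropyLoss` as usual.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcMarginProduct(nn.Module):
    """Minimal ArcFace-style margin head (illustrative sketch)."""

    def __init__(self, in_features, out_features, s=30.0, m=0.50):
        super().__init__()
        self.s, self.m = s, m
        # Learnable class-center weights, randomly initialised.
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.xavier_uniform_(self.weight)

    def forward(self, features, label):
        # cos(theta) between normalised embeddings and class centers.
        cosine = F.linear(F.normalize(features), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
        # Add the angular margin m on the target-class logit only.
        target = F.one_hot(label, cosine.size(1)).bool()
        logits = torch.where(target, torch.cos(theta + self.m), cosine)
        # Scaled logits; pass these to nn.CrossEntropyLoss.
        return logits * self.s
```

So cross-entropy is still the loss in all three methods; the margin head only reshapes the logits that go into it.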
In the ArcFace margin you do a random init of `self.weight`. How would `self.weight` get updated on each batch run, given that it is not part of your net (ResNet, MobileNet, etc.)? It seems you would only ever use the random weights you got at init time.
```python
for data in trainloader:
    img, label = data[0].to(device), data[1].to(device)
    optimizer_ft.zero_grad()
    raw_logits = net(img)                  # backbone embeddings
    output = margin(raw_logits, label)     # ArcFace margin logits
    total_loss = criterion(output, label)  # cross-entropy
    total_loss.backward()
    optimizer_ft.step()                    # apply the weight update
```
2) Secondly, shouldn't BatchNorm remove the need to normalise the in_features for the last layer?
3) How do you decide on hyperparameters like s, th, m, etc.?
4) Does it take many iterations to converge? I started with a loss of 15 and it is converging very slowly.
If you want the ArcFace weights to be updated after each backpropagation, you have to add them to the initialisation of your optimizer. Something like `optimizer = optim.Adam(list(model.parameters()) + list(arcface.parameters()), lr=your_learning_rate)`.
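A minimal runnable sketch of that fix, using plain `nn.Linear` modules as hypothetical stand-ins for the backbone and the margin head (the real modules take a `label` argument, which the stand-in omits): with both parameter lists passed to the optimizer, one `step()` demonstrably changes the head's weights.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

net = nn.Linear(16, 8)       # stand-in for the backbone (ResNet, etc.)
margin = nn.Linear(8, 4)     # stand-in for the ArcFace margin head
criterion = nn.CrossEntropyLoss()

# Both parameter sets go into the optimizer, so the margin's
# weights receive gradients AND get updated by step().
optimizer = torch.optim.Adam(
    list(net.parameters()) + list(margin.parameters()), lr=1e-3)

before = margin.weight.detach().clone()
img = torch.randn(4, 16)
label = torch.randint(0, 4, (4,))

optimizer.zero_grad()
logits = margin(net(img))    # real code: margin(net(img), label)
criterion(logits, label).backward()
optimizer.step()

# The margin head's weights are no longer stuck at their init values.
assert not torch.equal(before, margin.weight)
```

If the margin's parameters were left out of the optimizer, the final assertion would fail: the weights would stay at their random initialisation, which is exactly the problem described above.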
Hi, how are you using ArcFace as the loss? I could only see a cross-entropy loss in your implementation.