ddxu opened this issue 7 years ago (status: Open)
You may refer to Figure 1 of the SphereFace paper to learn the difference between classification and metric learning. Then you can read the theory part of SphereFace and the metric learning part of NormFace to understand under which conditions classification loss functions can be used for metric learning tasks.
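To make the idea concrete, here is a minimal sketch of a classification loss computed on L2-normalized features and class weights, so that training and cosine-similarity verification operate in the same space. It is written in the spirit of the normalized softmax discussed in NormFace, not taken from the authors' code. The function names, the `scale=20.0` value, and the `threshold=0.5` value are all assumptions chosen for illustration.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]


def cosine_softmax_loss(feature, class_weights, target, scale=20.0):
    """Cross-entropy over scaled cosine similarities for one sample.

    feature:       embedding of one image
    class_weights: one weight vector per identity
    target:        index of the true identity
    scale:         assumed hyperparameter; not a value from the thread
    """
    f = l2_normalize(feature)
    logits = []
    for w in class_weights:
        cos = sum(a * b for a, b in zip(f, l2_normalize(w)))
        logits.append(scale * cos)
    # Numerically stable log-sum-exp.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def verify(feature_a, feature_b, threshold=0.5):
    """Decide whether two embeddings match by cosine similarity.

    The threshold is a placeholder; in practice it is tuned on held-out pairs.
    """
    a = l2_normalize(feature_a)
    b = l2_normalize(feature_b)
    cos = sum(x * y for x, y in zip(a, b))
    return cos >= threshold, cos
```

Because both the feature and the weights are normalized, the loss depends only on the angles between them, which is the same quantity `verify` thresholds at test time.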
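To use triplet loss, you need to implement a hard sampling algorithm to avoid the zero gradient problem: most randomly chosen triplets already satisfy the margin, so they contribute no loss and no gradient. Getting this right is difficult and tricky. The current academic trend is to modify classification loss functions for metric learning tasks.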
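As an illustration of the zero gradient problem, here is a minimal pure-Python sketch of one common mining strategy, batch-hard mining, where each anchor is paired with its farthest positive and its closest negative inside the mini-batch. This is only one possible hard sampling scheme, and the function names and the `margin=0.2` value are assumptions for the example. A real training loop would use a tensor library rather than Python lists.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Mean batch-hard triplet loss over one mini-batch.

    For each anchor, use the farthest same-identity sample (hardest positive)
    and the closest different-identity sample (hardest negative).
    Returns (mean_loss, fraction_of_anchors_with_nonzero_loss).
    """
    losses = []
    for i, anchor in enumerate(embeddings):
        pos, neg = [], []
        for j, other in enumerate(embeddings):
            if j == i:
                continue
            d = l2_distance(anchor, other)
            (pos if labels[j] == labels[i] else neg).append(d)
        if not pos or not neg:
            continue  # anchor has no valid triplet in this batch
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    if not losses:
        return 0.0, 0.0
    active = sum(1 for v in losses if v > 0.0)
    return sum(losses) / len(losses), active / len(losses)


# Well-separated identities: every triplet satisfies the margin,
# so the loss (and therefore the gradient) is zero.
easy = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [0.0, 4.0]]
# Interleaved identities: every anchor violates the margin.
hard = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
labels = [0, 0, 1, 1]

easy_loss, easy_active = batch_hard_triplet_loss(easy, labels)
hard_loss, hard_active = batch_hard_triplet_loss(hard, labels)
```

The second return value is a cheap diagnostic: if the fraction of active anchors drops toward zero during training, the sampler is no longer producing useful triplets.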
@happynear Thank you for your suggestions. I will read these papers carefully and experiment with their code.
But since "each identity has only two images", should we consider few-shot learning?
@happynear I admire your genius! You seem very experienced in face identification/verification problems. My problem is how to fine-tune those excellent models on a card-face dataset, which has many identities (more than 100 thousand) while each identity has only two images (one is a card face and the other is a camera face). Clearly, my task is a verification task.
If I fine-tune on such a card-face dataset using softmax loss (or center loss, or their variants), which is optimized as a classification problem, I guess the last inner-product layer will have a very large number of outputs (equal to the total number of identities). This will be hard for the network to learn, because the number of training samples is only twice the number of identities. Even your latest work, "NormFace", also seems to learn in a classification style.
So far my fine-tuning work is mostly based on triplet loss. It works in some respects but still has some problems. I want to try some new methods, but I don't know which direction to take.
Can you give me any suggestions? Thank you~