Open ffredd opened 2 years ago
In theory, hard sample mining should help improve the model further. As I see it, one could first fine-tune the pre-trained model with an ordinary classification loss and then, once the model reaches a plateau, switch on the triplet loss to learn from hard samples. In practice, though, I train with both losses at the same time.
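To make the two-stage idea concrete, here is a minimal sketch of the schedule in plain Python. The function name `combined_loss` and the parameters `switch_epoch` and `triplet_weight` are hypothetical names chosen for illustration; they do not come from any library, and the default values are arbitrary.

```python
def combined_loss(cls_loss, triplet_loss, epoch, switch_epoch=10, triplet_weight=1.0):
    """Return the training loss for the given epoch.

    Before switch_epoch only the classification loss is used; from
    switch_epoch onward the weighted triplet term is added.
    switch_epoch and triplet_weight are illustrative assumptions.
    """
    if epoch < switch_epoch:
        return cls_loss
    return cls_loss + triplet_weight * triplet_loss
```

Setting `switch_epoch=0` reduces this to joint training, which is what I actually do.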
I apply the triplet loss across two modalities, to reduce the distance between samples of the same class from different modalities and to increase the distance between samples of different classes from different modalities. But when I use the batch_all loss, the validation loss does not change; now, with hard_loss, the validation loss still does not change. What could be the reason? I have read that triplet loss is hard to get to converge. What do you do to make triplet loss converge?
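For reference, this is roughly what I mean by a cross-modal hard triplet loss, written as a minimal sketch in plain Python so that it runs without a framework. It is not my actual training code: the names `euclidean` and `cross_modal_batch_hard`, the margin of 0.3, and the use of unsquared Euclidean distance are all assumptions for illustration. Anchors come from one modality; positives and negatives come from the other. It also returns the fraction of anchors with a non-zero loss, which may help diagnose a loss that does not move.

```python
import math

def euclidean(u, v):
    """Unsquared Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def cross_modal_batch_hard(anchors, anchor_labels, others, other_labels, margin=0.3):
    """Batch-hard triplet loss with anchors from modality A and
    positives/negatives from modality B (hypothetical helper).

    Returns (mean loss, fraction of anchors whose loss is non-zero).
    """
    losses = []
    for a, la in zip(anchors, anchor_labels):
        pos = [euclidean(a, o) for o, lo in zip(others, other_labels) if lo == la]
        neg = [euclidean(a, o) for o, lo in zip(others, other_labels) if lo != la]
        if not pos or not neg:
            continue  # this anchor has no valid triplet in the batch
        # Hardest positive: farthest same-class sample in the other modality.
        # Hardest negative: closest different-class sample in the other modality.
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    if not losses:
        return 0.0, 0.0
    active = sum(1 for value in losses if value > 0.0) / len(losses)
    return sum(losses) / len(losses), active
```

One case worth checking with this sketch: if every embedding is mapped to the same point, all distances are zero and the loss equals the margin for every anchor, so a loss that sits at the margin value would point to collapsed embeddings rather than slow convergence. If instead the active fraction is zero, no triplet in the batch violates the margin and there is no gradient left to learn from.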