tamerthamoqa / facenet-pytorch-glint360k

A PyTorch implementation of the 'FaceNet' paper for training a facial recognition model with Triplet Loss using the glint360k dataset. A pre-trained model using Triplet Loss is available for download.
MIT License

Embeddings are getting clustered together in a small region after training #20

Open Nrohlable opened 1 year ago

Nrohlable commented 1 year ago

Hi @tamerthamoqa,

Thanks a lot for such a fantastic repo, which we could use for our work. I have recently been building a face verification system using a Siamese network. With your pre-trained models (trained on the CASIA-WebFace and VGGFace2 datasets) I was able to achieve close to 90% accuracy on my dataset. I then tried to further fine-tune the network using a hard triplet batch sampling and training strategy, but for some reason, after training, the embeddings for all images are clustered together in a small region. In other words, the distances between the embeddings of two persons become far too close to each other: where the pre-trained models gave a cosine distance of about 0.45 for a given pair, after triplet-loss fine-tuning the same pair gives 0.006, and the distance barely changes between same-person and different-person pairs. A rough sketch of the training setup is below.
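For concreteness, the hard-triplet fine-tuning looks roughly like this (a minimal PyTorch sketch; the function name, margin value, and batch-hard mining scheme are illustrative assumptions, not code from this repo):

```python
import torch
import torch.nn.functional as F

def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    # Assumes the batch sampler puts several images per identity in each batch.
    # Pairwise Euclidean distances between all embeddings in the batch.
    dist = torch.cdist(embeddings, embeddings, p=2)

    same = labels.unsqueeze(0) == labels.unsqueeze(1)   # same-identity mask
    eye = torch.eye(len(labels), dtype=torch.bool, device=labels.device)

    # Hardest positive: farthest same-identity embedding (excluding self).
    hardest_pos = dist.masked_fill(~same | eye, float('-inf')).max(dim=1).values
    # Hardest negative: closest different-identity embedding.
    hardest_neg = dist.masked_fill(same, float('inf')).min(dim=1).values

    return F.relu(hardest_pos - hardest_neg + margin).mean()

# Usage (illustrative names):
# embeddings = F.normalize(model(images), dim=1)   # FaceNet-style unit norm
# loss = batch_hard_triplet_loss(embeddings, identity_labels)
```

One known failure mode of this loss is collapse to a constant embedding: if the network maps every image to the same point, both distances are zero and the loss plateaus at exactly the margin value, which matches the clustering behaviour described above.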

If you could give me any insights on this, that would be helpful. Thanks

tamerthamoqa commented 1 year ago

Hello @Nrohlable

I am assuming you used Triplet Loss, which optimizes the embedding space for Euclidean Distance and not Cosine Distance. Does using Euclidean Distance instead of Cosine Distance give similar results?
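As a side note, if the embeddings are L2-normalized as in the FaceNet paper, squared Euclidean distance and cosine distance are monotonically related, so the two metrics rank pairs identically. A quick check (a sketch, assuming PyTorch):

```python
import torch
import torch.nn.functional as F

a = F.normalize(torch.randn(4, 512), dim=1)  # unit-norm embeddings (illustrative)
b = F.normalize(torch.randn(4, 512), dim=1)

euclidean = (a - b).norm(dim=1)                     # ||a - b||
cosine_dist = 1 - F.cosine_similarity(a, b, dim=1)  # 1 - cos(theta)

# For unit-norm vectors: ||a - b||^2 = 2 - 2*cos(theta) = 2 * (1 - cos(theta))
assert torch.allclose(euclidean ** 2, 2 * cosine_dist, atol=1e-5)
```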

Nrohlable commented 1 year ago

@tamerthamoqa thanks for responding.

Yes, even if we use Euclidean distance the story doesn't change much: earlier, for one pair we were getting a distance of around 1.29, and after training the same pair gives 0.07288.

Also, with the pre-trained models the overlap between the Euclidean distances of same-person pairs and different-person pairs was around 8%, and after this training it went up to around 81%.
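To quantify that, one could measure the shared mass of the two distance histograms (a NumPy sketch; the function and variable names are illustrative, and this is only one possible definition of overlap):

```python
import numpy as np

def histogram_overlap(genuine, impostor, bins=100):
    """Shared mass of two distance histograms (illustrative names).

    genuine  : distances between same-person pairs
    impostor : distances between different-person pairs
    """
    lo = min(genuine.min(), impostor.min())
    hi = max(genuine.max(), impostor.max())
    g, edges = np.histogram(genuine, bins=bins, range=(lo, hi), density=True)
    i, _ = np.histogram(impostor, bins=bins, range=(lo, hi), density=True)
    bin_width = edges[1] - edges[0]
    return np.minimum(g, i).sum() * bin_width  # 0.0 = separable, 1.0 = identical
```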

Is there any specific convention I should follow while training this kind of network? For example, does the network have to be trained for, let's say, 200 epochs in order to get a valid result?