leondgarse / Keras_insightface

Insightface Keras implementation
MIT License

Triplet loss training #118

Open SaadSallam7 opened 1 year ago

SaadSallam7 commented 1 year ago

I was trying to train FaceNet on Kaggle using a TPU, but I ran into a problem, and I noticed that you have trained with TPUs before and got good results, so can you help me, please? I used the batch-hard strategy with the code provided here (I compared it with your implementation and they gave the same results, so there is no problem in the implementation). I'm training on the VGGFace2 dataset, taking 32 images per person with a batch size of 1024, so each batch contains 32 different persons with 32 images each. The problem is that there is no improvement on the test set: accuracy and threshold stay constant at 0.5 and 0 even after 10 epochs.

269/269 [==============================] - ETA: 0s - loss: 1.0424

lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.000000 Improved = 0.500000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_1_0.500000.h5
Epoch 2/50
269/269 [==============================] - ETA: 0s - loss: 1.0030
lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000 Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_2_0.500000.h5
269/269 [==============================] - 191s 712ms/step - loss: 1.0030
Epoch 3/50
213/269 [======================>.......] - ETA: 5s - loss: 1.0015
lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000 Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_3_0.500000.h5
269/269 [==============================] - 190s 710ms/step - loss: 1.0015
Epoch 4/50
269/269 [==============================] - ETA: 0s - loss: 1.0012
lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000 Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_4_0.500000.h5
269/269 [==============================] - 191s 712ms/step - loss: 1.0012
Epoch 5/50
269/269 [==============================] - ETA: 0s - loss: 1.0011
lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000 Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_5_0.500000.h5
269/269 [==============================] - 192s 715ms/step - loss: 1.0011
Epoch 6/50
269/269 [==============================] - ETA: 0s - loss: 1.0011
lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000 Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_6_0.500000.h5
269/269 [==============================] - 193s 718ms/step - loss: 1.0011
Epoch 7/50
269/269 [==============================] - ETA: 0s - loss: 1.0009
lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000 Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_7_0.500000.h5
269/269 [==============================] - 193s 718ms/step - loss: 1.0009
Epoch 8/50
269/269 [==============================] - ETA: 0s - loss: 1.0008
lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000 Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_8_0.500000.h5
269/269 [==============================] - 192s 717ms/step - loss: 1.0008
Epoch 9/50
269/269 [==============================] - ETA: 0s - loss: 1.0008
lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000 Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_9_0.500000.h5
269/269 [==============================] - 193s 718ms/step - loss: 1.0008
Epoch 10/50
269/269 [==============================] - ETA: 0s - loss: 1.0008
lfw evaluation max accuracy: 0.500000, thresh: 0.000000, previous max accuracy: 0.500000 Improved = 0.000000
Saving model to: /kaggle/working/chekpoints_basic_lfw_epoch_10_0.500000.h5

This is the notebook if you can take a look. Thanks in advance.
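For reference, the batch-hard strategy described above can be sketched as follows. This is an illustrative reimplementation, not the repository's exact code; the function name and the margin value are assumptions:

```python
import tensorflow as tf

def batch_hard_triplet_loss(labels, embeddings, margin=0.35):
    """Batch-hard triplet loss: for each anchor, use the farthest same-class
    sample as positive and the closest different-class sample as negative."""
    # L2-normalize so pairwise distances are comparable across batches.
    embeddings = tf.math.l2_normalize(embeddings, axis=-1)
    # Squared Euclidean distance between unit vectors: ||a - b||^2 = 2 - 2 a.b
    dists = 2.0 - 2.0 * tf.matmul(embeddings, embeddings, transpose_b=True)
    labels = tf.reshape(labels, [-1, 1])
    same = tf.cast(tf.equal(labels, tf.transpose(labels)), dists.dtype)
    # Hardest positive: farthest sample sharing the anchor's label.
    hardest_pos = tf.reduce_max(dists * same, axis=-1)
    # Hardest negative: closest sample with a different label
    # (same-class entries are masked out with a large constant).
    hardest_neg = tf.reduce_min(dists + same * 1e6, axis=-1)
    return tf.reduce_mean(tf.maximum(hardest_pos - hardest_neg + margin, 0.0))
```

With well-separated classes the hardest positive distance goes to 0 and the hardest negative exceeds the margin, so the loss reaches 0; a loss stuck near 1.0 as in the log above suggests the embeddings have collapsed to nearly identical points.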

leondgarse commented 1 year ago

I cannot see your notebook; it says "No saved version". Generally, triplet loss is better used after some softmax or ArcFace training, as in the early stage of training the model cannot mine good positive / negative pairs. You may refer to related issues like MobileFacenet SE Train from scratch #9 or the result ResNet101V2 using nadam and finetuning with triplet.
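The ArcFace pretraining suggested here adds an angular margin to the target-class logit on top of a cosine classifier. A minimal sketch of such a head follows (illustrative only; the repository's own implementation and its margin/scale defaults may differ):

```python
import tensorflow as tf

class ArcFaceLayer(tf.keras.layers.Layer):
    """Illustrative ArcFace head: applies cos(theta + m) to the target logit."""
    def __init__(self, num_classes, margin=0.5, scale=64.0, **kwargs):
        super().__init__(**kwargs)
        self.num_classes, self.margin, self.scale = num_classes, margin, scale

    def build(self, input_shape):
        # One class-center vector per identity, compared by cosine similarity.
        self.w = self.add_weight(
            name="w", shape=(int(input_shape[-1]), self.num_classes),
            initializer="glorot_uniform")

    def call(self, embeddings, labels):
        norm_emb = tf.math.l2_normalize(embeddings, axis=-1)
        norm_w = tf.math.l2_normalize(self.w, axis=0)
        cos_t = tf.matmul(norm_emb, norm_w)                      # cos(theta)
        theta = tf.acos(tf.clip_by_value(cos_t, -1.0 + 1e-7, 1.0 - 1e-7))
        one_hot = tf.one_hot(labels, self.num_classes)
        # Add the angular margin only on the ground-truth class, then rescale.
        logits = tf.where(one_hot > 0, tf.cos(theta + self.margin), cos_t)
        return logits * self.scale
```

The scaled logits are then fed to an ordinary softmax cross-entropy loss, which gives the backbone a stable classification signal before switching to triplet mining.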

SaadSallam7 commented 1 year ago

I'm sorry, you can open it now. OK, I will train it with ArcFace and then triplet loss, but to be honest, I don't think that is the cause of the problem: an accuracy of exactly 50% indicates that the model isn't really learning anything; it outputs always true or always false! One last question, please: how are you initializing the dataset for online mining? In my case, I read the dataset sorted, so the first 32 examples belong to one class, the next 32 examples to another class, and so on; the batches are therefore fixed while fitting the model, but I think in the original paper they sampled batches randomly.
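The random batch construction asked about above (P identities × K images per batch, reshuffled every epoch rather than read in a fixed sorted order) can be sketched like this. This is a hypothetical helper, not code from the repository; the `class_to_images` dict layout is an assumption:

```python
import random

def make_pk_batches(class_to_images, classes_per_batch=32,
                    images_per_class=32, seed=None):
    """Yield batches of (image, label) pairs: P random classes x K random images.

    `class_to_images` maps a class id to its list of image paths. Calling this
    again (with a different seed) reshuffles both the class order and the image
    choice, so batch composition changes every epoch, unlike reading a sorted
    dataset where each batch always contains the same fixed identities.
    """
    rng = random.Random(seed)
    classes = [c for c, imgs in class_to_images.items()
               if len(imgs) >= images_per_class]
    rng.shuffle(classes)
    for i in range(0, len(classes) - classes_per_batch + 1, classes_per_batch):
        batch = []
        for c in classes[i:i + classes_per_batch]:
            for img in rng.sample(class_to_images[c], images_per_class):
                batch.append((img, c))
        yield batch
```

With fixed sorted batches, each anchor only ever sees the same 31 other identities as negatives, which limits the pool of hard negatives the online miner can find.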

leondgarse commented 1 year ago