Shahnawazgrewal opened this issue 6 years ago
@Shahnawazgrewal can you please share with us how you trained the checkpoint? And can you also share the checkpoint itself?
I trained with VGGFace2 as well; the model is not as good as the MS-Celeb one. Although the accuracy might be equal to or better than MS-Celeb, the TPR at 0.001 FPR is much lower (98.X compared to 99.X).
I eventually combined these two datasets and reached 99.73% accuracy and 99.63% TPR at 0.001 FPR.
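For anyone reproducing these numbers, here is a minimal sketch of how TPR at a fixed FPR such as 0.001 can be read off an ROC curve. It is my own illustration rather than the repo's validate_on_lfw code, and it assumes `distances` holds embedding distances for verification pairs and `issame` the ground-truth same/different labels.
```python
import numpy as np
from sklearn.metrics import roc_curve

def tpr_at_fpr(distances, issame, target_fpr=1e-3):
    # Smaller distance means "same identity", so negate it to use as a score.
    fpr, tpr, _ = roc_curve(issame, -np.asarray(distances))
    # Interpolate the TPR at the requested false-positive rate.
    return float(np.interp(target_fpr, fpr, tpr))
```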
@Shahnawazgrewal could you please kindly share with us how you train the VGGFace2 model? When we try to train on the data, we always get the error below. Thanks!
OutOfRangeError (see above for traceback): FIFOQueue '_1_batch_join/fifo_queue' is closed and has insufficient elements (requested 9, current size 0) [[Node: batch_join = QueueDequeueUpToV2[component_types=[DT_FLOAT, DT_INT64], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch_join/fifo_queue, _arg_batch_size_0_0)]]
@Shahnawazgrewal For sure, if you can share your model via Google Drive etc., that would also be much appreciated. Thanks!
@Shahnawazgrewal I added center loss to Caffe, and I get the error "math_functions.cu:155] Check failed: error == cudaSuccess (11 vs. 0) invalid argument" when I train. Have you met this problem, or do you know how to fix it? Thanks.
Can you please decrease the maximum number of epochs? Please read issue #105. I had a similar error for the MS-Celeb-1M dataset. @syy6
For sure, I will share the model with you guys. @syy6
Did you train on a subset of MS-Celeb-1M? @JianbangZ
@Shahnawazgrewal my subset of MS-Celeb-1M is 70k identities, 4.5 million images. I can achieve 99.5% accuracy and 99.3% TPR with it
@Shahnawazgrewal , actually I even tried to reduce the number of epochs, but the issue still exists.
@Shahnawazgrewal it would be very nice if you could share the hyperparameters you used in training. I've recently tried to use the VGGFace2 to train by triplet loss but with no luck. The LFW accuracy and validation rate just levelled off at around 0.96 and 0.7.
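For reference, the loss being trained there is the FaceNet triplet loss; below is a minimal TF1-style sketch of it (my own illustration with `alpha` as the margin, not the exact train_tripletloss.py code).
```python
import tensorflow as tf

def triplet_loss(anchor, positive, negative, alpha=0.2):
    # Squared L2 distances between the anchor and the positive/negative embeddings.
    pos_dist = tf.reduce_sum(tf.square(anchor - positive), axis=1)
    neg_dist = tf.reduce_sum(tf.square(anchor - negative), axis=1)
    # Hinge: only triplets that violate the margin contribute to the loss.
    return tf.reduce_mean(tf.maximum(pos_dist - neg_dist + alpha, 0.0))
```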
@syy6 Did you check this out before? #600
@yipsang @Shahnawazgrewal , I just found the issue: one of the input PNGs was broken on my computer, so I got this error. After removing the PNG, it seems to be fine now.
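If anyone else hits the same FIFOQueue error, here is a quick sketch for finding broken images before training (the dataset path is a placeholder; adjust it to your aligned dataset directory).
```python
import os
from PIL import Image

dataset_dir = '/data/vggface2_aligned'  # placeholder path
for root, _, files in os.walk(dataset_dir):
    for name in files:
        if name.lower().endswith(('.png', '.jpg', '.jpeg')):
            path = os.path.join(root, name)
            try:
                with Image.open(path) as img:
                    img.verify()  # raises an exception if the file is truncated or corrupt
            except Exception as e:
                print('Broken image: %s (%s)' % (path, e))
```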
@JianbangZ, could you please share with us how you removed the duplicates between the MS-Celeb and VGGFace2 datasets? If you look at the name lists of the two datasets, certain names appear in both.
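One simple way to spot that overlap, purely as a sketch: export an identity name list for each dataset and intersect them. The file names below are hypothetical, not part of either dataset's distribution.
```python
# Hypothetical name-list files, one identity name per line.
with open('ms_celeb_names.txt') as f:
    ms_names = {line.strip().lower() for line in f if line.strip()}
with open('vggface2_names.txt') as f:
    vgg_names = {line.strip().lower() for line in f if line.strip()}

overlap = ms_names & vgg_names
print('%d identity names appear in both datasets' % len(overlap))
```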
@Shahnawazgrewal Dude... where have you uploaded your model?
Here is the link to download a pre-trained model, trained with Inception-ResNet-v1 and the center loss function on the VGGFace2 dataset. Please give your general feedback.
@Shahnawazgrewal , Did you perform the pre-training on MS-Celeb-1M and then fine-tune on VGGFace2 dataset?
No, I didn't. I trained Inception-ResNet-v1 with the center loss function on the VGGFace2 dataset from scratch. More specifically, I downloaded the loosely cropped faces dataset from VGGFace2 (http://www.robots.ox.ac.uk/~vgg/data/vgg_face2/). The dataset was aligned to 160×160 images with a 32-pixel margin using the multi-task CNN (MTCNN). I trained the model on the aligned dataset for 100 epochs with the RMSProp optimizer. @Yeongjae
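To make those alignment settings concrete, here is a rough sketch of how a 32-pixel margin and a 160×160 output size are applied around a detected face box. It is my own illustration of the idea rather than the repo's align_dataset_mtcnn.py verbatim; `img` is assumed to be an HxWx3 uint8 image array and `det` a bounding box from the detector.
```python
import numpy as np
from PIL import Image

def crop_and_resize(img, det, margin=32, image_size=160):
    # det = [x1, y1, x2, y2]; extend the box by half the margin on each side,
    # clipped to the image borders, then resize the crop to image_size x image_size.
    x1 = int(np.maximum(det[0] - margin / 2, 0))
    y1 = int(np.maximum(det[1] - margin / 2, 0))
    x2 = int(np.minimum(det[2] + margin / 2, img.shape[1]))
    y2 = int(np.minimum(det[3] + margin / 2, img.shape[0]))
    face = img[y1:y2, x1:x2, :]
    return np.array(Image.fromarray(face).resize((image_size, image_size), Image.BILINEAR))
```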
@Shahnawazgrewal could you add proper comments for the checkpoint files that have been uploaded to Dropbox!
@Shahnawazgrewal based on our evaluation, your model is more powerful than both of the provided pretrained models. I wonder what the reason behind this wonderful improvement is? Is the dataset used for training the root cause?
P.S. Thank you for uploading this wonderful pretrained checkpoint.
@tenggyut , VGGFace2 is considered a deep dataset (a higher number of images per identity). In my opinion, this could be the reason. In addition, I observed that the model trained on VGGFace2 produced better representations of previously unseen faces.
@Shahnawazgrewal did you train the model as a classifier or using triplet loss?
I trained the model based on center loss. @tenggyut
@Shahnawazgrewal
I have a few questions about your implementation details:
Did you train the model with softmax loss combined with center loss, or just with center loss?
combined.
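For anyone else wondering what "combined" looks like in practice, here is a minimal sketch of softmax cross-entropy plus a weighted center-loss term, in the spirit of Wen et al.'s center loss as used with train_softmax.py. It is my own simplification: `prelogits` stands for the embedding layer, and `center_loss_factor` / `center_loss_alfa` are the hyperparameters discussed further down the thread.
```python
import tensorflow as tf

def center_loss(features, labels, alfa, nrof_classes):
    # One non-trainable center per class; each batch nudges the centers of the
    # classes it contains toward the batch features by a factor of (1 - alfa).
    nrof_features = features.get_shape()[1]
    centers = tf.get_variable('centers', [nrof_classes, nrof_features],
                              initializer=tf.constant_initializer(0.0), trainable=False)
    labels = tf.reshape(labels, [-1])
    centers_batch = tf.gather(centers, labels)
    diff = (1 - alfa) * (centers_batch - features)
    centers = tf.scatter_sub(centers, labels, diff)
    with tf.control_dependencies([centers]):
        return tf.reduce_mean(tf.square(features - centers_batch))

# cross_entropy = tf.reduce_mean(
#     tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
# total_loss = cross_entropy + center_loss_factor * center_loss(prelogits, labels, center_loss_alfa, nrof_classes)
```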
Validation on LFW dataset with the model trained on VGGFace2:
Model directory: /home/super/datasets/lfw/vggface2-cl
Metagraph file: model-20171216-232945.meta
Checkpoint file: model-20171216-232945.ckpt-100000
Runnning forward pass on LFW images
Accuracy: 0.992+-0.004
Validation rate: 0.96000+-0.01880 @ FAR=0.00067
Area Under Curve (AUC): 0.999
Equal Error Rate (EER): 0.008
I trained with a cosine face algorithm. Accuracy is 0.995, validation rate = 0.985.
@JianbangZ what do you mean by cosine face algorithms? Did you replace the L2 norm in center loss with cosine similarity? Or do you mean the paper the author released last year, SphereFace?
@JianbangZ Have you tried ArcFace? Can you share your model?
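Since SphereFace/CosFace/ArcFace keep coming up, here is a rough sketch of the large-margin-cosine idea: normalize embeddings and class weights, subtract a margin m from the target-class cosine, scale by s, and feed the result into softmax cross-entropy. This is my own illustration, not JianbangZ's code, and the parameter values are typical defaults rather than anything confirmed in this thread.
```python
import tensorflow as tf

def cosine_margin_logits(embeddings, weights, labels, num_classes, s=30.0, m=0.35):
    # Cosine similarity between L2-normalized embeddings and class weight vectors.
    emb = tf.nn.l2_normalize(embeddings, axis=1)
    w = tf.nn.l2_normalize(weights, axis=0)   # weights shape: [emb_dim, num_classes]
    cos_theta = tf.matmul(emb, w)             # shape: [batch, num_classes]
    # Subtract the margin only from the ground-truth class, then rescale.
    one_hot = tf.one_hot(labels, num_classes)
    return s * (cos_theta - m * one_hot)      # pass these logits to softmax cross-entropy
```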
@Shahnawazgrewal Q1: Did you modify the learning rate? If yes, can you share the value you used? Q2: Did you modify any parameters before training, e.g. weight decay, center loss factor, center loss alpha, etc.? If yes, can you share them? Thanks for your help.
@Shahnawazgrewal How many epochs does the model take to converge? What are the Loss/RegLoss values after the model converges on the VGG dataset? Hoping to get your advice. Thank you!
I use the default settings from the facenet implementation. @akimo12345
I trained the model for 100 epochs. @yemenr
@Shahnawazgrewal I am facing this issue while using this pre-trained model of VGGFace2:
/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Loading model...
2018-05-21 12:46:43.698367: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-05-21 12:46:43.698396: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-05-21 12:46:43.698400: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-05-21 12:46:43.698404: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-05-21 12:46:43.698422: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Model loaded
Loading MTCNN Face detection model
MTCNN Model loaded
[INFO] camera sensor warming up...
Traceback (most recent call last):
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1039, in _do_call
return fn(*args)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1021, in _run_fn
status, run_metadata)
File "/usr/lib/python3.5/contextlib.py", line 66, in exit
next(self.gen)
File "/home/anju/.virtualenvs/dl4cv2/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: Attempting to use uninitialized value InceptionResnetV1/Conv2d_1a_3x3/weights
[[Node: InceptionResnetV1/Conv2d_1a_3x3/weights/read = Identity[T=DT_FLOAT, _class=["loc:@InceptionResnetV1/Conv2d_1a_3x3/weights"], _device="/job:localhost/replica:0/task:0/cpu:0"]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 153, in
Caused by op 'InceptionResnetV1/Conv2d_1a_3x3/weights/read', defined at:
File "main.py", line 151, in
FailedPreconditionError (see above for traceback): Attempting to use uninitialized value InceptionResnetV1/Conv2d_1a_3x3/weights [[Node: InceptionResnetV1/Conv2d_1a_3x3/weights/read = Identity[T=DT_FLOAT, _class=["loc:@InceptionResnetV1/Conv2d_1a_3x3/weights"], _device="/job:localhost/replica:0/task:0/cpu:0"]]
Please help me solve this.
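That FailedPreconditionError usually means the graph was built but the variables were never restored from the checkpoint. Below is a minimal sketch of loading the shared model, reusing the checkpoint file names from the LFW validation comment above; the tensor names follow the usual facenet convention (input, embeddings, phase_train), so adjust them if your graph differs.
```python
import tensorflow as tf

with tf.Graph().as_default():
    with tf.Session() as sess:
        # Rebuild the graph from the metagraph, then restore the trained weights
        # so that no variable is left uninitialized.
        saver = tf.train.import_meta_graph('model-20171216-232945.meta')
        saver.restore(sess, 'model-20171216-232945.ckpt-100000')

        graph = tf.get_default_graph()
        images_placeholder = graph.get_tensor_by_name('input:0')
        embeddings = graph.get_tensor_by_name('embeddings:0')
        phase_train = graph.get_tensor_by_name('phase_train:0')
        # emb = sess.run(embeddings, feed_dict={images_placeholder: batch, phase_train: False})
```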
@Shahnawazgrewal Your model works well for my problem. Can you share the code you used for training the model on VGGFace2? I want to fine-tune your model. Thank you very much!
I used the same code, train_softmax.py, with default parameters.
@Shahnawazgrewal Thank you very much !
@Shahnawazgrewal When you trained your model on VGGFace2, did you prefilter the dataset?
No. It is a pretty clean dataset.
@yipsang I tried to run train_tripletloss.py to train on the VGG dataset, but the program crashed while it was saving a checkpoint model. How did you train on the VGG dataset using triplet loss? Are there any changes I should make?
@thuoctran I met the same problem. You should modify train_tripletloss.py line 175, which is saver.restore(sess, os.path.expanduser(args.pretrained_model)), to:
```
ckpt = tf.train.get_checkpoint_state(os.path.expanduser(args.pretrained_model))
if ckpt and ckpt.model_checkpoint_path:
    saver.restore(sess, ckpt.model_checkpoint_path)
```
(tf.train.get_checkpoint_state resolves the latest checkpoint inside a model directory, so you can pass the pretrained model directory directly.)
@Shahnawazgrewal Didn't you modify the parameter center_loss_factor when you trained your model? I find that the default value is 0.0. Did you use the value 0.0 to train your model?
I used 1e-2. @Laviyy
Thank you.
@JianbangZ You trained a model with the combined VGGFace2 and MS-Celeb datasets. I want to know how you combined the two datasets. Which algorithm did you use when you trained your model: triplet loss, softmax loss, center loss, or another? Can you share your model with us? Thanks anyway!
@Shahnawazgrewal First of all, thank you for your help. I trained my model following your directions, but it is not good. I want to know the value of the margin you used when cropping images with MTCNN. I find some faces are still slanted. Did you use an affine transformation to rotate the faces upright?
@Laviyy a margin of 30 is good.
@rain2008204 ok, thank you!
I trained a model on VGGFace2 using center loss. The embedding is more powerful than the one trained on the subset of MS-Celeb. I can make the model public alongside the two already available models. @davidsandberg