davidsandberg / facenet

Face recognition using Tensorflow
MIT License

poor performance on Asian faces #591

Open tenggyut opened 6 years ago

tenggyut commented 6 years ago

I'm trying to compute the cosine similarity between pairs of Asian faces, but facenet seems to get confused by them: two completely different faces can have a cosine similarity of 0.7 or higher. Is this normal, or am I missing something?

If I train facenet on an Asian face dataset, will it help with this problem?

Currently I only have a GTX 1050 at hand; will it be sufficient to train facenet?

Thanks
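
For reference, cosine similarity between two facenet embeddings can be computed as below. This is a minimal sketch assuming you already have the embedding vectors from the model; the helper name is illustrative, not part of this repo.

```python
import numpy as np

def cosine_similarity(emb1, emb2):
    """Cosine similarity between two embedding vectors (1.0 = same direction)."""
    emb1 = np.asarray(emb1, dtype=np.float32)
    emb2 = np.asarray(emb2, dtype=np.float32)
    denom = np.linalg.norm(emb1) * np.linalg.norm(emb2) + 1e-10
    return float(np.dot(emb1, emb2) / denom)
```

Since the embeddings produced by this repo are already L2-normalized, the denominator is close to 1 and this effectively reduces to a dot product.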

angushh commented 6 years ago

I think you can use an Asian face dataset to fine-tune the models; the results should be better.

lingorX commented 6 years ago

I'm also working on Asian face recognition recently. Have you solved the problem by training on an Asian dataset, or solved it some other way? Thank you very much; the currently poor accuracy worries me.

angushh commented 6 years ago

@0liliulei I didn't use an Asian dataset to train the model, for lack of Asian data; I just used an Asian dataset of about 400 classes to fine-tune the model provided by davidsandberg. I haven't done many tests yet, but it seems the new model performs better on Asian faces than the old one. If you have enough Asian data you can train the model on your own dataset, or just fine-tune the pretrained model. The dataset really has a great influence on the model. If you get good results, please let me know.

lingorX commented 6 years ago

@angushh Thank you for your prompt reply. I am now applying for access to an Asian face dataset. If I make substantial progress with it, I will @ you.

yanmenglu commented 6 years ago

@0liliulei How do you fine-tune FaceNet? Could you share your approach? Thank you very much.

tenggyut commented 6 years ago

We are also trying to fine-tune the pretrained model on an Asian face dataset. Maybe we could share datasets to make more progress on this? Sounds good?

angushh commented 6 years ago

How about a QQ group or a WeChat group?

lingorX commented 6 years ago

@tenggyut How can we contact you?

lingorX commented 6 years ago

@yanmenglu Just continue training from the model davidsandberg provided, using an Asian dataset.

Zumbalamambo commented 6 years ago

How do we retrain with Asian faces?

getengqing commented 6 years ago

@0liliulei I also want to train on an Asian dataset, but I have no idea how. Can you share your approach? Thanks!

xrf116 commented 6 years ago

@ALL Can someone share a model pretrained on an Asian face dataset?

tenggyut commented 6 years ago

I think the real issue is the quality and quantity of the dataset: it should contain enough images (millions) and enough variation (age, pose, lighting, etc.). Performance on LFW is just a start, and far from a real application. This paper reveals some interesting issues with the LFW benchmark.

Linzaer commented 6 years ago

@angushh @0liliulei Hi, I have also used an Asian dataset of about 7000 classes (23,500 images) to fine-tune the pretrained model (trained on MS-Celeb-1M) provided by davidsandberg, to improve performance on Asian faces. What LFW accuracy did you get? I got a low LFW accuracy of about 93.7% after 15 epochs, lower than David's pretrained model, and the trend is not encouraging. Maybe something is wrong? Here are some results:

  1. command line arguments: --logs_base_dir ~/logs/pretrained_facenet/ --models_base_dir ~/models/pretrained/ --data_dir /home/linzai/PycharmProjects/facenet/data/mydataset160 --image_size 160 --model_def models.inception_resnet_v1 --lfw_dir /home/linzai/PycharmProjects/facenet/data/lfw_mtcnnpy_160 --optimizer RMSPROP --learning_rate -1 --max_nrof_epochs 160 --keep_probability 0.8 --random_crop --random_flip --learning_rate_schedule_file data/learning_rate_schedule_classifier_mydataset.txt --weight_decay 5e-5 --center_loss_factor 1e-2 --center_loss_alfa 0.9 --learning_rate_schedule_file /home/linzai/PycharmProjects/facenet/data/learning_rate_schedule_classifier_mydataset.txt --pretrained_model /home/linzai/PycharmProjects/facenet/src/20170512-110547/model-20170512-110547.ckpt-250000

2. Learning-rate schedule used for fine-tuning, written in learning_rate_schedule_classifier_mydataset.txt:

0: 0.1
30: 0.01
50: 0.001
110: 0.0001

3. lfw_result.txt:

0      0.95467  0.77300
1000   0.94717  0.65133
2000   0.94700  0.70333
3000   0.94700  0.63900
4000   0.94317  0.64167
5000   0.94133  0.61067
6000   0.93900  0.58033
7000   0.93633  0.59700
8000   0.94183  0.62500
9000   0.93867  0.59000
10000  0.93683  0.59267
11000  0.93533  0.58333
12000  0.92650  0.49200
13000  0.91650  0.50700
14000  0.93567  0.61233
15000  0.93767  0.60233

Can someone give me some advice? Thanks!

lingorX commented 6 years ago

@Linzai1994 I think, first, although your dataset and the MS dataset are both face datasets, yours probably has very different characteristics; to be precise, your faces' styles differ from those in MS. Although MS contains some Asian faces, most of its faces are Caucasian, which is why it works well on LFW. Second, since you fine-tuned the model with an Asian dataset, the model will now favor Asian faces and lose some of its original sharpness on Caucasian faces, because the styles differ. So I think you need to test or validate on an Asian dataset. Third, if it still doesn't perform well after you test on an Asian dataset, I'd say that's normal: to get a model with 99% accuracy on an arbitrary Asian dataset, we would need billions of samples. Otherwise, we should make sure the training set and test set are as similar as possible; what I mean is that we are really training a model that can only be used at a small scale. Finally, LFW accuracy proves nothing if you want to build a model that can be put into practice.

Linzaer commented 6 years ago

@0liliulei Thanks for your reply! I see what you mean: the dataset determines what the model outputs. I used the model fine-tuned on my dataset with a real-world camera and the results are still not satisfactory. I think the comparison results also depend heavily on training-data bias and confounding factors, e.g. angle, lighting, age difference, and image quality.

harryxu-yscz commented 6 years ago

@Linzai1994 Would you mind sharing your dataset or where you got it from? Thanks!

Linzaer commented 6 years ago

@harryxu-yscz Sorry, it's a private dataset from my company. I'm afraid I can't share it with you.

look4pritam commented 6 years ago

There is one more option: add images of people from your organization to a large dataset like the MS dataset and train on the combined set. That way you get good accuracy on your own dataset as well as on the LFW dataset.

Joker316701882 commented 6 years ago

@tenggyut @angushh @0liliulei @yanmenglu @Zumbalamambo @Linzai1994 Hello everyone! I'm working on the same problem as you: training a model on Asian faces. The validation set comes from a customer (Asian faces). Using the model provided by sandberg, validation accuracy is about 75% at the beginning. After fine-tuning on Asian faces, validation accuracy on Asian faces reaches 92%. And of course the performance on LFW drops a little (not sure how much, though).

Training set: 10,322 IDs, around 350,000 images, collected from the web.
Validation set: from the customer.

On how to keep performance on LFW while generalizing better to another private set (like Asian faces), this paper may help: https://arxiv.org/abs/1606.09282

I'm currently working on implementing the paper "Additive Margin Softmax (AM-softmax)": https://arxiv.org/abs/1801.05599. With all configuration the same as the author's, I only achieved 98.3% on LFW; the only difference is that the author uses Caffe while I use TensorFlow. In the paper the author says momentum (Mom) is better than Adam, but for me momentum only reaches 94.3% while Adam can easily reach 98%. My hypothesis is that this comes from implementation details of the frameworks (not sure, but the other parts are exactly the same as the author's). I would be glad if someone could discuss this with me.
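
For anyone trying to reproduce this, here is a minimal TensorFlow sketch of the additive-margin softmax idea from the paper: L2-normalize both the embeddings and the class-weight matrix, subtract the margin m from the target-class cosine, and scale by s before the usual cross-entropy. The function name and the m=0.35, s=30 defaults follow the paper; this is not the code from this repo or from @Joker316701882's implementation.

```python
import tensorflow as tf

def am_softmax_loss(embeddings, labels, class_weights, m=0.35, s=30.0):
    """Additive-margin softmax (AM-softmax / CosFace-style) loss.

    embeddings:    [batch, emb_dim] float tensor.
    labels:        [batch] int tensor of class ids.
    class_weights: [emb_dim, n_classes] trainable weight matrix (no bias).
    """
    emb = tf.nn.l2_normalize(embeddings, axis=1)    # unit-length embeddings
    w = tf.nn.l2_normalize(class_weights, axis=0)   # unit-length class centers
    cos_theta = tf.matmul(emb, w)                   # [batch, n_classes], values in [-1, 1]

    # Subtract the margin only from the target-class cosine, then rescale.
    target_mask = tf.one_hot(labels, depth=tf.shape(class_weights)[1])
    logits = s * (cos_theta - m * target_mask)

    return tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
```

In principle these logits can replace the plain softmax head in train_softmax.py while keeping the rest of the training loop unchanged.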

taewookim commented 6 years ago

@Joker316701882

Are you releasing this model with source ?

look4pritam commented 6 years ago

I have just started on it. I have one model trained on the Microsoft 20K identities, and I am working on the Microsoft 100K. The Microsoft 100K dataset is divided into 20 equal parts; I train on each part starting from the Microsoft 20K pretrained model. Currently only 2 parts are done. Once all parts are done, I will make the model available.

taewookim commented 6 years ago

Is there any public dataset organized specifically by race? What are you all using? (Yes, I know I can scrape, but I'm curious whether anyone has already built one.)

@look4pritam Looking forward to seeing it. BTW, which dataset is that? What's the URL for it?

look4pritam commented 6 years ago

Check the Microsoft 1-million challenge dataset (MS-Celeb-1M). You will find 5 download links: four 20 GB files and one 4 GB file, released last year by Microsoft as two challenges.

taewookim commented 6 years ago

Thanks @look4pritam

Any idea when your model will get released?

nellycui commented 6 years ago

Hi @Joker316701882, did you train the model with softmax loss or triplet loss? If possible, could you share the training parameters you used?

Thank you in advance!

Joker316701882 commented 6 years ago

@taewookim @nellycui Sorry, I can't release the model. But I did implement the training logic of Additive Margin Softmax (also known as CosFace, by Tencent). It works far better than any loss function in this repo. Here is the source code: https://github.com/Joker316701882/Additive-Margin-Softmax

speculaas commented 6 years ago

Dear all, regarding fine-tuning, is there any particular strategy?

Do you update all trainable variables? My rough idea is that fine-tuning only some of the higher-level weights might be good enough, maybe like tensorflow-for-poets:

https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#0

My reasoning is that the first few layers of David's pretrained model are probably already good at recognizing basic visual elements such as oriented edges and opposing colors.

Maybe we only need to fine-tune the upper layers' weights?
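
One way to try this with the training script is to restrict the optimizer to the variables you want to update. A minimal TF1-style sketch, assuming the InceptionResnetV1/Logits scope names used by this repo; which blocks to leave trainable is a guess, not a tested recipe:

```python
import tensorflow as tf

def build_finetune_op(total_loss, learning_rate=0.001,
                      trainable_scopes=('InceptionResnetV1/Block8',
                                        'InceptionResnetV1/Bottleneck',
                                        'Logits')):
    """Return a train op that only updates the upper layers of the network.

    total_loss is the loss tensor from the existing training graph; variables
    outside the listed scopes stay frozen at their pretrained values.
    """
    train_vars = [v for v in tf.trainable_variables()
                  if v.name.startswith(trainable_scopes)]
    optimizer = tf.train.RMSPropOptimizer(learning_rate)
    return optimizer.minimize(total_loss, var_list=train_vars)
```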

taewookim commented 6 years ago

Does anyone have any updates? This seems like a pretty hot topic.

@look4pritam Any update on the release?

cftang0827 commented 6 years ago

@angushh Thank you for the information. Did you use train_softmax.py with the checkpoint file and keep training to fine-tune the pretrained model?

If so, did you run into the problem that the number of output categories of the classifier did not match David's pretrained model?

Here's what I did: I just changed the size of "train_set" to match the original number of categories of the pretrained model (2017*****).

For example, I used 500 different people with 1000 images, so the output size of the softmax FC layer is 500. However, because the original FC output size is 4xxxx, I had to forcibly change my train_set size to 4xxxx to match the pretrained model.

Is that correct?

Thanks for your info.

cftang0827 commented 6 years ago

Besides, has anyone considered the problem of face alignment? In my tests, the distance between two faces of the same person is smaller when I apply face alignment. Are there any other ideas?

Thanks

tenggyut commented 6 years ago

Based on our experiments, face alignment plays a very crucial role in face recognition. The "traditional" way is a landmark-based affine transformation; a more recent approach is to use a CNN to learn a robust face alignment.

These two papers may be helpful:

We are currently running some experiments based on these two papers and will update with results if we have any luck.
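
As an illustration of the landmark-based route, here is a minimal OpenCV sketch that warps a face onto a fixed landmark template. The template coordinates and the three-point choice (eyes plus nose tip) are just an example for a 160x160 crop, not the alignment used by MTCNN or by this repo.

```python
import cv2
import numpy as np

# Target positions of (left eye, right eye, nose tip) in a 160x160 aligned crop.
# These values are illustrative only.
TEMPLATE = np.float32([[54.0, 58.0], [106.0, 58.0], [80.0, 96.0]])

def align_face(image, landmarks, size=160):
    """Affine-align a face given three detected landmarks.

    image:     BGR image as a numpy array.
    landmarks: 3x2 array of (x, y) points: left eye, right eye, nose tip.
    """
    M = cv2.getAffineTransform(np.float32(landmarks), TEMPLATE)
    return cv2.warpAffine(image, M, (size, size), flags=cv2.INTER_LINEAR,
                          borderMode=cv2.BORDER_REPLICATE)
```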

cftang0827 commented 6 years ago

Thank you @tenggyut, I will take a look. Besides, are there any issues with the image quality of the face? I mean, can we use some filter to "correct" the image and denoise it?

Or is there any way to keep only the face in the image, to avoid interference and noise from the background?

Thanks.

hungnv21292 commented 6 years ago

Dear @Joker316701882, could you please share the name of the Asian face dataset you used for fine-tuning? Thank you so much.

Joker316701882 commented 6 years ago

@viethungtsdv Sorry, it's private. But there is another open Asian dataset that is large enough for fine-tuning. Check here: http://trillionpairs.deepglint.com/overview

hungnv21292 commented 6 years ago

Dear @Joker316701882,

Thanks for your support. I think the Asian-Celeb dataset is large enough for fine-tuning. I will try it and share the results with you later. Thanks.

hungnv21292 commented 6 years ago

Hi @Joker316701882, in this repo @davidsandberg evaluates on LFW using the LFW dataset and a pairs.txt file. How do you evaluate on your customer data (do you also use a pairs.txt file, ...) when you fine-tune on an Asian face dataset? I don't know how to set up evaluation for a new dataset. Thank you.
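
If it helps, a verification set in the same spirit as LFW's pairs can be built from any folder-per-identity dataset: sample pairs of images of the same person as positives and pairs from different people as negatives, then sweep a distance threshold over the resulting pairs. A minimal sketch (the directory layout and pair counts are assumptions, not this repo's evaluation code):

```python
import os
import random

def make_pairs(data_dir, n_pairs=3000, seed=0):
    """Generate (path_a, path_b, is_same) verification pairs.

    data_dir is expected to contain one sub-folder per identity,
    each holding that person's aligned face images.
    """
    rng = random.Random(seed)
    people = {p: os.listdir(os.path.join(data_dir, p))
              for p in os.listdir(data_dir)
              if os.path.isdir(os.path.join(data_dir, p))}
    multi = [p for p, imgs in people.items() if len(imgs) >= 2]

    pairs = []
    for _ in range(n_pairs):
        # Positive pair: two different images of the same person.
        p = rng.choice(multi)
        a, b = rng.sample(people[p], 2)
        pairs.append((os.path.join(data_dir, p, a), os.path.join(data_dir, p, b), True))
        # Negative pair: one image each from two different people.
        p1, p2 = rng.sample(list(people), 2)
        pairs.append((os.path.join(data_dir, p1, rng.choice(people[p1])),
                      os.path.join(data_dir, p2, rng.choice(people[p2])), False))
    return pairs
```

The same threshold sweep used for LFW accuracy can then be run over these pairs instead of the official pairs.txt.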

hungnv21292 commented 6 years ago

Hi @Joker316701882, I have a concern about the Asian-Celeb dataset: is it clean? Does it overlap or share noisy identities with other datasets (LFW, VGGFace, ...)? In your fine-tuning, did you do any pre-processing (cleaning noise, removing overlap) against other datasets?

Thanks in advance.

test4fest commented 5 years ago

@Joker316701882

Learning without forgetting seems very interesting; thanks for letting us know. I'm curious: did you get good results by applying it to the Asian dataset from DeepGlint?

zfs1993 commented 5 years ago

(Quoting @Linzaer's fine-tuning command, learning-rate schedule, and LFW results from above.)

Hello, I want to know whether you changed any other code in train_softmax.py. I followed these steps but still get errors; can you show me your steps?

saver = tf.train.Saver(tf.trainable_variables(), max_to_keep=3)
ckpt = tf.train.get_checkpoint_state(args.pretrained_model)
print('Restoring pretrained model: %s' % args.pretrained_model)
saver.restore(sess, ckpt.model_checkpoint_path)

The error is the following:

Traceback (most recent call last):
  File "/opt/app/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1361, in _do_call
    return fn(*args)
  File "/opt/app/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
    target_list, status, run_metadata)
  File "/opt/app/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [76] rhs shape= [78445]
  [[Node: save/Assign_900 = Assign[T=DT_FLOAT, _class=["loc:@Logits/biases"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Logits/biases, save/RestoreV2/_835)]]
  [[Node: save/RestoreV2/_1706 = _Send[T=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_878_save/RestoreV2", _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

amoderate commented 5 years ago

Hi - I completed a pipeline for retraining the model using a subset of the MS-Celeb-1M dataset. My use case is the Asian market, so my intention is to overfit to Asian faces and not worry about the LFW score. The new model does seem to be better, with accuracy improving from 63% to 92% on my company's internal dataset. If it would be helpful, I will push a repository with the model training pipeline and our pretrained model.

xlphs commented 5 years ago

@amoderate Hi, could you share some details of your pipeline? Specifically:

I plan to train from the last checkpoint with a manually cleaned subset of the Asian-Celeb dataset.

bing1zhi2 commented 5 years ago

@zfs1993 You can follow #139: print all the variables, then exclude the last few layers (the ones whose shapes depend on the number of classes) before restoring the checkpoint:

t_variables = tf.trainable_variables()
print("t_variables", t_variables)
# Restore everything except the layers whose shapes depend on the number of classes.
var_to_restore = [v for v in t_variables
                  if not v.name.startswith('InceptionResnetV1/Bottleneck')
                  and not v.name.startswith('Logits')]
saver = tf.train.Saver(var_to_restore, max_to_keep=3)

zfs1993 commented 5 years ago

(Quoting the variable-restore snippet above.)

Yes, I solved it several days ago. Anyway, thanks.

zfs1993 commented 5 years ago

(Quoting the original question about cosine similarity of Asian faces from the top of the thread.)

Why don't you use Euclidean distance? Does cosine similarity perform better?

look4pritam commented 5 years ago

Yes, cosine similarity performs better.

arechapala commented 5 years ago

Cosine distance and Euclidean distance are related once you have normalized the embedding, so they perform equally (just with different thresholds): De^2 = 2 * Dc, where Dc = 1 - cosine similarity.
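
A quick numerical check of that identity, for anyone who wants to convince themselves (plain NumPy, not code from this repo):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=512)
b = rng.normal(size=512)

# L2-normalize, as facenet does with its embeddings.
a /= np.linalg.norm(a)
b /= np.linalg.norm(b)

cosine_distance = 1.0 - np.dot(a, b)          # Dc
euclidean_sq = np.sum((a - b) ** 2)           # De^2

print(euclidean_sq, 2.0 * cosine_distance)    # the two values match
```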

chen-bh commented 5 years ago

(Quoting @Linzaer's earlier reply about real-scene camera results above.)

Could you explain how you did the fine-tuning, or share some of the material you've read? My accuracy is also pretty poor.