alessiosavi / PyRecognizer

"A neural network to rule them all, a neural network to find them, a neural network to bring them all and verify if is you !!" (Face recognition tool)
MIT License

Appending user training images when needed #24

Open SaddamBInSyed opened 4 years ago

SaddamBInSyed commented 4 years ago

Hi @alessiosavi Thanks for your work.

I have 10 users that need to be enrolled so they can be recognized later, but I only have face images for 5 of them, so I zipped those and trained on them.

Is it possible to add the remaining 5 users to the existing, already trained file (pickle/dataset)?

please advise.

ClashLuke commented 4 years ago

In your particular case the model is prone to overfit, so it might not be a good idea to add new cases on the fly (like you'd do in online learning). Is it possible for you to first acquire one image of each class (in this case user) and then add more as you get more?

SaddamBInSyed commented 4 years ago

Thanks for your reply.

"Is it possible for you to first acquire one image of each class (in this case user) and then add more as you get more?"

No. Some employees are on leave/absent while we are collecting/enrolling all the employee/user faces. Once an employee is back in the office, we will go and collect their images then.

So for now we are waiting to collect all of the images and train in one go, but I want to know how to add/remove and train new faces on the fly.

One more scenario:

If tomorrow a trained employee/user is terminated or resigns from the company, then we have to remove their face from the dataset/pickle file as well.

please advise

ClashLuke commented 4 years ago

Those are two very interesting scenarios. The first one you mentioned, adding new training data, should be possible. If @alessiosavi can't add the feature right now, I'll open a pull request by Sunday.

However, it comes with another challenge. If you have more people you want to detect, you also have more classes. Unfortunately, just adding more classes isn't really possible unless we choose to use PyTorch (#2). Back to your problem: for now you have to decide up front how many different people you want to be able to predict at a given time.

The second scenario, removing someone from the dataset and from the predictions, is split into two parts as well. Removing them from the dataset can be implemented right now; removing them from the predictions is blocked by issue #2 as well.

I hope that clears things up a bit and potentially solves your issue.

SaddamBInSyed commented 4 years ago

"However, it comes with another challenge as well. If you have more people you want to detect, you also have more different classes. Unfortunately just adding more classes isn't really possible, unless we choose to use PyTorch (#2)."

I am trying to understand the above statement but no luck.

Are you trying to say, we can not add more userimages to dataset (e;g adding 10K employee images and train them)

also why we need PyTorch here please clarify?

ClashLuke commented 4 years ago

If you tell the neural network to classify between ten people, that works fine, even if you only give it two people on the first day and add new employees over time. However, since it puts all the faces into "buckets" and you only have ten buckets, you can't really tell it that someone goes into the 11th bucket; there is no such thing.

That's where PyTorch comes into play. With PyTorch you can just create the 11th bucket or remove the 3rd. Got a new employee? Add them. One got sick? Remove them.
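For illustration, here is a minimal sketch of that limitation, assuming a scikit-learn MLP classifier trained on 128-d dlib face encodings (the data, labels, and classifier settings are made up for the example, not taken from PyRecognizer):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Hypothetical data: 10 enrolled users, 10 images each, 128-d dlib encodings.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 128))
y = np.repeat([f"user_{i}" for i in range(10)], 10)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
clf.fit(X, y)

# The "buckets" are fixed at fit time:
print(clf.classes_)           # exactly the 10 labels seen during fit

# A brand-new 11th person still gets mapped to one of the existing 10 buckets.
new_face = rng.normal(size=(1, 128))
print(clf.predict(new_face))  # always one of user_0 .. user_9

# Adding an 11th bucket currently means re-fitting on the combined encodings.
```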

SaddamBInSyed commented 4 years ago

Thanks for your reply.

I don't think I understood. Here is my view:

Total users to enroll = 100, and each folder (named username_id) has at least 1 image. On day 1 I train the NN with the available data and get the pickle file.

On day 2 I train again with the new users' data, and now I have a second pickle file.

Now, during the KNN/MLP.train() process, if I combine the two pickle files above (which contain the dlib face encodings of all user images), then my NN has knowledge of the new users as well.

Now I can call the predict function on a new, unknown image.

Am I right? If not, where is the gap in my understanding?

please advise.

ClashLuke commented 4 years ago

You're right, if you first create enough user_ids and then assign images to each, that should work!
However, you still can't predict more than 100 users, and there might be a slight chance that a terminated employee is still falsely recognized. This should be fixed with #25.

I think I don't quite understand your pipeline, though. Why do you want to combine the two pickle files? Doesn't the second already contain the images of the first?

SaddamBInSyed commented 4 years ago

"Doesn't the second already contain the images of the first?" No, the two files are different (1st file = user images 1 to 50, 2nd file = user images 51 to 100).

ClashLuke commented 4 years ago

Then what stops you from generating a 2nd file with 1 to 100?

Regarding your point, pickles are binary files, so appending them may be difficult. However, we could create a new issue regarding the concatenation of datasets.
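If such an issue is opened, the concatenation itself could look roughly like this. A minimal sketch, assuming each pickle simply stores an (encodings, labels) pair of lists; PyRecognizer's actual dataset layout may differ:

```python
import pickle

def merge_datasets(path_a, path_b, path_out):
    """Concatenate two pickled (encodings, labels) datasets into a single file."""
    with open(path_a, "rb") as f:
        enc_a, lab_a = pickle.load(f)   # e.g. users 1..50
    with open(path_b, "rb") as f:
        enc_b, lab_b = pickle.load(f)   # e.g. users 51..100

    with open(path_out, "wb") as f:
        pickle.dump((enc_a + enc_b, lab_a + lab_b), f)

# merge_datasets("users_1_50.pkl", "users_51_100.pkl", "users_1_100.pkl")
# The classifier is then retrained once on the merged dataset.
```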

SaddamBInSyed commented 4 years ago

Yes, it would be nice if we had a provision to concatenate multiple pickle files.

thank you.

SaddamBInSyed commented 4 years ago

Hi @alessiosavi @ClashLuke, is there any update regarding "adding/removing new user classes on the fly"?

please update.

alessiosavi commented 3 years ago

Hi @SaddamBInSyed , @ClashLuke

Sorry for the long pause...

What you are trying to do is not architecturally possible.

During training, the network computes weights and biases for every layer of neurons. The last layer contains N neurons, where N is the number of employees you have to deal with. This last layer is a softmax layer, which means it has a predefined number of neurons, each one dedicated to one specific class. So, once you have trained the network to recognize N people, the network has to be retrained from scratch in order to recognize N+1 people, so that the weights and biases of each neuron are updated.
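In other words (a minimal Keras sketch of the architecture being described, not the project's actual model; the layer sizes are illustrative):

```python
import tensorflow as tf

N_PEOPLE = 100  # number of enrolled employees at training time

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128,)),                    # dlib face embedding
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(N_PEOPLE, activation="softmax"),  # one neuron per person
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# The last layer has exactly N_PEOPLE outputs, so enrolling person N_PEOPLE + 1
# means rebuilding this layer (and, as explained above, retraining).
```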

One thing that we could try is to:

To my knowledge, unfortunately this can't be done.

https://github.com/alessiosavi/PyRecognizer/issues/2#issuecomment-829194731

ClashLuke commented 3 years ago

It's most certainly possible to do this. You're right that the softmax distributions will be somewhat broken, and you will need to retrain the model, but you can keep all of the model's weights.
Simply initializing the new class with sufficiently small weights, so that it doesn't screw up the distribution for known cases, should suffice given more training.
Another approach would be to simply attach a new classification head, as you mentioned. Lastly, perhaps the best way would be to either train a BYOL or SimCLR model (depending on your resource constraints) on your own data or take a pretrained one and attach a new head.
There are plenty of SimCLR models pretrained on ImageNet-22k, which should be more than strong enough for classifying faces.
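A rough sketch of the first suggestion, keeping the trained weights and appending a near-zero-initialized column for the new class (a Keras/NumPy illustration, not project code; the helper name add_class is made up):

```python
import numpy as np
import tensorflow as tf

def add_class(model, init_scale=1e-3):
    """Return a model whose softmax head has one extra class, reusing old weights."""
    old_head = model.layers[-1]
    kernel, bias = old_head.get_weights()        # shapes: (in_dim, N), (N,)
    in_dim, n_classes = kernel.shape

    # New head with N + 1 outputs; the extra column starts near zero so it
    # barely disturbs the distribution over the known classes.
    new_head = tf.keras.layers.Dense(n_classes + 1, activation="softmax")
    new_head.build((None, in_dim))
    new_kernel = np.concatenate(
        [kernel, np.random.randn(in_dim, 1) * init_scale], axis=1)
    new_bias = np.concatenate([bias, [0.0]])
    new_head.set_weights([new_kernel, new_bias])

    # Reuse the already-trained hidden layers as-is, then attach the new head.
    return tf.keras.Sequential(
        [tf.keras.layers.Input(shape=model.input_shape[1:])]
        + model.layers[:-1]
        + [new_head]
    )

# The expanded model still needs some fine-tuning, but the old weights are kept.
```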

alessiosavi commented 3 years ago

Another approach would be to simply attach a new classification head, as you mentioned

I think that I'll move in this direction first, let's see what happens!

Another approach would be to simply attach a new classification head, as you mentioned. Lastly, perhaps the best way would be to either train a BYOL or SimCLR model

Maybe that is not the right path for this project. As mentioned here https://github.com/alessiosavi/PyRecognizer/issues/2#issuecomment-829194731, the tool will use an encoder in order to retrieve the face embedding. This way, I can avoid using a complicated DCNN that works on a full image. Instead, the NN will work with a one-dimensional array that contains 128 values (the face descriptor computed by dlib).

Using this method, I was able to get ~97% accuracy without hyperparameter tuning.


I've just uploaded an example of the future core prediction functionality (~98% accuracy):
https://github.com/alessiosavi/tensorflow-face-recognition/blob/main/dense_embedding.py
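For anyone following the thread, the flow being described looks roughly like this. This is only a sketch using the face_recognition wrapper around dlib for the 128-d embeddings; the linked dense_embedding.py is the authoritative version:

```python
import face_recognition
import tensorflow as tf

def embed(image_path):
    """Return the 128-d dlib face embedding of the first face found in the image."""
    image = face_recognition.load_image_file(image_path)
    encodings = face_recognition.face_encodings(image)
    return encodings[0] if encodings else None

def build_classifier(n_people):
    """Two sequential dense layers on top of the 128-d embeddings."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(128,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(n_people, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# X: (n_samples, 128) embeddings, y: integer person IDs in [0, n_people)
# model = build_classifier(n_people=142)
# model.fit(X, y, validation_split=0.2, epochs=50)
```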

ClashLuke commented 3 years ago

Congratulations!

I'm quite surprised that a two-layer dense model performs this well. How large is your dataset, and how well did a CNN perform? (I could imagine a basic ResNet-18 being significantly faster on GPU than a feature-extractor.)

alessiosavi commented 3 years ago

The dataset is the following one: http://vis-www.cs.umass.edu/lfw/#download

I've removed the people that don't meet the following constraint:

The total number of people is 142.

Celebrities list
George_W_Bush 430
Colin_Powell 205
Tony_Blair 126
Donald_Rumsfeld 107
Gerhard_Schroeder 92
Ariel_Sharon 69
Junichiro_Koizumi 53
Jean_Chretien 50
Hugo_Chavez 50
Jacques_Chirac 48
Serena_Williams 46
John_Ashcroft 42
Jennifer_Capriati 42
Vladimir_Putin 41
Lleyton_Hewitt 40
Gloria_Macapagal_Arroyo 39
Luiz_Inacio_Lula_da_Silva 37
Arnold_Schwarzenegger 33
Tom_Ridge 32
Laura_Bush 32
Andre_Agassi 32
Alejandro_Toledo 30
Guillermo_Coria 29
David_Beckham 29
Nestor_Kirchner 28
Kofi_Annan 28
Vicente_Fox 27
Silvio_Berlusconi 27
Roh_Moo-hyun 27
Ricardo_Lagos 27
Jack_Straw 27
Alvaro_Uribe 27
Hans_Blix 26
Megawati_Sukarnoputri 25
Mahmoud_Abbas 25
John_Negroponte 25
Bill_Clinton 25
Tom_Daschle 24
Juan_Carlos_Ferrero 24
Saddam_Hussein 23
Tiger_Woods 22
Recep_Tayyip_Erdogan 21
Lindsay_Davenport 21
Jose_Maria_Aznar 21
Jennifer_Aniston 21
Atal_Bihari_Vajpayee 21
Naomi_Watts 20
Jennifer_Lopez 20
Gray_Davis 20
Amelie_Mauresmo 20
Pete_Sampras 19
Jeremy_Greenstock 19
George_Robertson 19
Rudolph_Giuliani 18
Paul_Bremer 18
Jiang_Zemin 18
Hamid_Karzai 18
Carlos_Moya 18
Angelina_Jolie 18
Venus_Williams 17
Michael_Schumacher 17
Lance_Armstrong 17
John_Kerry 17
John_Howard 17
Carlos_Menem 17
Winona_Ryder 16
Tim_Henman 16
Spencer_Abraham 16
Richard_Myers 16
Nicole_Kidman 16
John_Bolton 16
Bill_Gates 16
Michael_Bloomberg 15
Julianne_Moore 15
Joschka_Fischer 15
John_Snow 15
Igor_Ivanov 15
Roger_Federer 14
Renee_Zellweger 14
Pervez_Musharraf 14
Norah_Jones 14
Meryl_Streep 14
Kim_Clijsters 14
Julie_Gerberding 14
Hu_Jintao 14
Fidel_Castro 14
Dominique_de_Villepin 14
Dick_Cheney 14
Britney_Spears 14
Andy_Roddick 14
Yoriko_Kawaguchi 13
Tommy_Franks 13
Salma_Hayek 13
Pierce_Brosnan 13
Jean_Charest 13
Halle_Berry 13
George_HW_Bush 13
Bill_Simon 13
Ari_Fleischer 13
Abdullah_Gul 13
Wen_Jiabao 12
Rubens_Barrichello 12
Queen_Elizabeth_II 12
Nancy_Pelosi 12
Mohammed_Al-Douri 12
Michael_Jackson 12
Joe_Lieberman 12
James_Blake 12
Hillary_Clinton 12
Edmund_Stoiber 12
David_Nalbandian 12
Anna_Kournikova 12
Trent_Lott 11
Sergio_Vieira_De_Mello 11
Sergey_Lavrov 11
Mahathir_Mohamad 11
Jiri_Novak 11
Jeb_Bush 11
James_Kelly 11
Jackie_Chan 11
Howard_Dean 11
Gordon_Brown 11
Eduardo_Duhalde 11
Tommy_Thompson 10
Tom_Hanks 10
Tom_Cruise 10
Tang_Jiaxuan 10
Richard_Gere 10
Richard_Gephardt 10
Paradorn_Srichaphan 10
Mike_Weir 10
Mark_Philippoussis 10
Lucio_Gutierrez 10
John_Paul_II 10
Ian_Thorpe 10
Harrison_Ford 10
Catherine_Zeta-Jones 10
Bill_McBride 10

I don't know about the performance/accuracy of a ResNet-18 for extracting the embeddings. Searching online, I've seen that the dlib feature extractor produces one of the best feature vectors.

The performance of the two sequential dense layers is incredible! As you can imagine, working with a 1-D array of 128 values is different from working with a (250px x 250px x 3) matrix. In my opinion, for object detection a CNN is a must. For face recognition, the only way to have a "production-ready" face-recognition tool is to work with the embeddings of the faces.

I've pushed a first version of the FE module, delegated to talk with the backend daemon of the new PyRecognizer: https://github.com/alessiosavi/go-pyrecognizer-fe As of now, the new version of PyRecognizer still has to be developed, so no interaction is made between the FE and BE modules yet.

Do you know someone who can create the HTML/CSS frontend webpage for the following functionalities?

It's not necessary to implement any particular logic, because all of the logic is handled by the following function: https://github.com/alessiosavi/go-pyrecognizer-fe/blob/0b5be909ee3cfae501f3f0751380f770c86c6857/main.go#L52

ClashLuke commented 3 years ago

Unfortunately, I don't know any frontend developers, no.
If I can help you with anything TensorFlow-related, please let me know.