Closed argenos closed 4 years ago
Is this for recognizing e.g. members of the team and saying their names, or following them?
Yeah, the action after recognizing the person will vary from task to task, but that is indeed the first part of the problem. The second thing to consider is that we might want to recognize people without associating them with a name, i.e. tracking persons inside the apartment and being able to distinguish between them.
I was thinking a bag of features might be simpler.
Yeah, it's definitely simpler; I don't know how the accuracies compare though (as the set of people grows).
I'd like to add a few comments here that we might need to consider: It's important to be able to recognize the role as well, e.g. post-man, delivery man, police, etc. so we might want to do something that not only relies on the face of the person but also on their clothing.
How's the progress on this? With Single Shot Multibox, detecting people seems to be quite reliable. As for classifying roles, genders, and emotions, I think we can use the same ImageRecognition state currently used in perceive_plane_action. The image recognition service that the action interacts with takes a model name in its request, so we can specify which model to use here.
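To illustrate the dispatch-by-model-name idea, here is a minimal sketch. Note that the class, method, and model names below are purely illustrative stand-ins, not the actual service definition used by ImageRecognitionServer:

```python
# Hypothetical sketch: a registry of classification models keyed by name,
# mimicking how a recognition service could dispatch on the requested model.
# All names and interfaces here are illustrative, not the real service API.

class RecognitionService:
    def __init__(self):
        self._models = {}

    def register(self, name, model_fn):
        # model_fn maps an image to a label
        self._models[name] = model_fn

    def recognize(self, model_name, image):
        if model_name not in self._models:
            raise KeyError(f"unknown model: {model_name}")
        return self._models[model_name](image)

service = RecognitionService()
# dummy stand-ins for trained gender/emotion classifiers
service.register("gender", lambda img: "female" if sum(img) % 2 else "male")
service.register("emotion", lambda img: "neutral")

print(service.recognize("emotion", [1, 2, 3]))  # -> neutral
```

The point is only that the same action/state can serve several classification tasks by swapping the model name in the request.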
Detection is not the only thing here though; it's important to recognise the actual person as well (so learning from one/only a few examples is needed).
I think it was probably abandoned. Feel free to open a pull request if you’re interested!
@alex-mitrevski couldn't we train a binary classifier on a single picture of one person? @argenos yeah, I'll start putting something together for this.
I suppose we could use something like the exemplar SVMs described here (in this case, they're used for detection, but we could use the same idea for recognition) - or the Siamese networks I suggested above.
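To make the Siamese idea concrete, here is a toy sketch using only numpy. In practice the shared embedding function would be a trained CNN; a fixed random linear map stands in for it here, so this only illustrates the structure, not a working recognizer:

```python
import numpy as np

# Toy illustration of the Siamese-network idea: the *same* embedding
# function is applied to both inputs, and a distance on the embeddings
# decides whether they show the same person. A fixed random linear map
# stands in for the trained network.

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 64))  # shared "network" weights

def embed(x):
    v = W @ x
    return v / np.linalg.norm(v)

def same_person(img_a, img_b, threshold=0.5):
    # small embedding distance -> same identity
    return np.linalg.norm(embed(img_a) - embed(img_b)) < threshold

face = rng.standard_normal(64)
noisy_copy = face + 0.01 * rng.standard_normal(64)  # tiny perturbation
other = rng.standard_normal(64)

print(same_person(face, noisy_copy))  # -> True
print(same_person(face, other))
```

Because both branches share weights, adding a new person requires no retraining - only storing their reference embedding, which is what makes this attractive for one/few-shot recognition.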
I believe there was a person at HBRS computer science working on this subject. As far as I can remember, he simply used a pre-trained network (e.g. VGG16), took its output for a resized image of the face, and compared it against each sample of a set of saved face features using cosine similarity. You could extend this by applying data augmentation to the sampled image to obtain a more robust description.
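The matching step described above can be sketched as follows (numpy only; random vectors stand in for the actual VGG16 face embeddings, and the gallery names are made up):

```python
import numpy as np

# Sketch of the described approach: compare a query face embedding against
# a gallery of saved embeddings via cosine similarity and pick the best
# match. Real embeddings would come from a pre-trained network (e.g. VGG16);
# random vectors stand in for them here.

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def identify(query, gallery, min_similarity=0.8):
    # gallery: {name: embedding}; returns best match, or None if the best
    # similarity falls below the threshold (unknown person)
    best_name, best_sim = None, -1.0
    for name, emb in gallery.items():
        sim = cosine_similarity(query, emb)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name if best_sim >= min_similarity else None

rng = np.random.default_rng(1)
gallery = {"alice": rng.standard_normal(128), "bob": rng.standard_normal(128)}
query = gallery["alice"] + 0.05 * rng.standard_normal(128)  # perturbed sample

print(identify(query, gallery))  # -> alice
```

The threshold is what lets the system say "unknown person" instead of forcing a match, which matters when tracking unnamed people in the apartment.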
I recently noticed in the PR list of @oarriaga's face_classification repo a recommendation to use dlib instead of OpenCV for face detection, which sounds interesting: oarriaga/face_classification#25.
Additionally, I believe the classification models trained by Octavio can be loaded into ImageRecognitionServer like with object recognition. Then there will be no extra code needed for gender and emotion classification; all you'll need in the repo are the model files. The action can then simply request the service with the appropriate model name, and a classification result will be returned.
Should this be closed in favour of #152?
A unique discussion here is the mention of Siamese networks for learning new faces, which is a feature for the person identity model. The other issue deals with refactoring and architecture.
A Siamese network implementation was added with https://github.com/b-it-bots/dataset_interface/pull/11
Recognition based on extracted face embeddings was added with https://github.com/b-it-bots/mas_knowledge_base
I'll close the issue since these two PRs basically resolve it.
A good place to start is to take a look at the detect_person and gender_recognition actions; the task would be to create an action just like these. We don't necessarily need to use this in the end, but OpenCV obviously has things for this. Any other interesting approach is also welcome.