behzadshomali / Image-Describe-Pipe

This app outputs the name, coordinates, and sentiment of each extracted face, along with a brief description of the scene's context, for each input image.

Do research on the problematic #1

Closed behroozomidvar closed 3 years ago

behroozomidvar commented 3 years ago
behzadshomali commented 3 years ago

@behroozomidvar first, take a look at the following comparison between the face verification and face recognition problems:

Verification

Recognition

Given the above comparison, we can conclude that the problem we're trying to solve is face recognition. Source: Convolutional Neural Networks, Andrew Ng
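
To make the distinction concrete, here is a minimal sketch of the two problems as they are usually framed: verification is a 1:1 check against a claimed ID, while recognition is a 1:K search over everyone in the database. The embeddings, the threshold of 0.5, and the function names are assumptions made up for illustration; they are not taken from the project or the course.

```python
import numpy as np


def verify(embedding, claimed_id, database, threshold=0.5):
    """1:1 check: does this face belong to the person with claimed_id?"""
    reference = database[claimed_id]
    return float(np.linalg.norm(embedding - reference)) < threshold


def recognize(embedding, database, threshold=0.5):
    """1:K search: return the closest enrolled ID, or None if nobody is close enough."""
    best_id, best_dist = None, float("inf")
    for person_id, reference in database.items():
        dist = float(np.linalg.norm(embedding - reference))
        if dist < best_dist:
            best_id, best_dist = person_id, dist
    return best_id if best_dist < threshold else None


# Toy database: ID -> made-up 3-dimensional "face embedding" (assumption).
database = {
    1: np.array([0.0, 0.0, 1.0]),
    2: np.array([1.0, 0.0, 0.0]),
}
probe = np.array([0.9, 0.1, 0.0])

print(verify(probe, 2, database))   # verification needs a claimed ID as input
print(recognize(probe, database))   # recognition only needs the image
```

Recognition has to return an ID on its own, which is why the question of what the ID represents comes up next.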

behroozomidvar commented 3 years ago

@behzadshomali I totally agree with you. Face recognition it is. Can you please elaborate on the ID? Is it simply an auto-increment column in a DBMS, or is there more to it? What does it represent?

The next step is to investigate different options for face recognition.

behzadshomali commented 3 years ago

@behroozomidvar by ID I meant that we should provide some information that identifies the intended person. In my humble opinion, there is no need to put extra information into the ID, because we will keep extra information as other attributes in our database tables. With this in mind, as you mentioned, one proper option would be an auto-increment ID in the DBMS we use. Do you have any other thoughts?
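
As a rough illustration of the auto-increment idea, the sketch below uses SQLite from the Python standard library. The thread does not say which DBMS is used, so SQLite, the table name `persons`, and the columns `name` and `notes` are all assumptions chosen for the example.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE persons (
        id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- the ID carries no meaning itself
        name  TEXT NOT NULL,                      -- extra information lives in
        notes TEXT                                -- ordinary attribute columns
    )
    """
)


def add_person(conn, name, notes=None):
    """Insert a person and return the ID the database assigned."""
    cur = conn.execute(
        "INSERT INTO persons (name, notes) VALUES (?, ?)", (name, notes)
    )
    conn.commit()
    return cur.lastrowid


first_id = add_person(conn, "Alice")
second_id = add_person(conn, "Bob", notes="wears glasses")
print(first_id, second_id)
```

The recognizer would then only need to return this integer, and everything else about the person can be looked up by ID.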

behroozomidvar commented 3 years ago

Makes perfect sense. Proceed.

behzadshomali commented 3 years ago

@behroozomidvar face recognition can be implemented with various methods, such as:

Based on geometric features
  * The most intuitive approach to face recognition
  * Marker points (position of eyes, ears, ...) are used to build a feature vector
  * Recognition is performed by calculating the Euclidean distance between the feature vectors of a probe and a reference image
  * Robust against changes in illumination by its nature
  * Drawback: the accurate registration of the marker points is complicated

The Eigenfaces method
  * A facial image is treated as a point in a high-dimensional image space, and a lower-dimensional representation is found in which classification becomes easy
  * The lower-dimensional subspace is found with Principal Component Analysis (PCA), which identifies the axes with maximum variance
  * The idea of minimizing the variance within a class while maximizing the variance between classes belongs to the related Fisherfaces method (Linear Discriminant Analysis), since PCA itself does not use class labels

Source: Face Recognition, OpenCV
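
The Eigenfaces idea can be sketched in a few lines of NumPy: project flattened images onto the leading principal components, then pick the gallery image with the smallest Euclidean distance in that subspace. This is a simplified illustration on synthetic data, not OpenCV's implementation; the 64-pixel "images", the two components, and the helper names are assumptions.

```python
import numpy as np


def fit_eigenfaces(images, num_components):
    """images: (n_samples, n_pixels) array of flattened grayscale faces."""
    mean = images.mean(axis=0)
    centered = images - mean
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:num_components]


def project(images, mean, components):
    """Map images into the low-dimensional PCA subspace."""
    return (images - mean) @ components.T


def nearest_label(probe, gallery, labels, mean, components):
    """Return the label of the gallery image closest to the probe."""
    gallery_proj = project(gallery, mean, components)
    probe_proj = project(probe[None, :], mean, components)
    distances = np.linalg.norm(gallery_proj - probe_proj, axis=1)
    return labels[int(np.argmin(distances))]


# Synthetic stand-in data: two "persons", each a base pattern plus small noise.
rng = np.random.default_rng(0)
base_a, base_b = rng.normal(size=64), rng.normal(size=64)
gallery = np.stack(
    [base_a + 0.05 * rng.normal(size=64) for _ in range(3)]
    + [base_b + 0.05 * rng.normal(size=64) for _ in range(3)]
)
labels = [1, 1, 1, 2, 2, 2]

mean, components = fit_eigenfaces(gallery, num_components=2)
probe = base_b + 0.05 * rng.normal(size=64)
predicted = nearest_label(probe, gallery, labels, mean, components)
print(predicted)
```

The geometric approach uses the same nearest-distance step, only with hand-registered marker points as the feature vector instead of PCA coefficients.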

behzadshomali commented 3 years ago

@behroozomidvar I've recently read about an approach called "One Shot Learning", in which the network does not learn to classify an image directly into one of the output classes; rather, it learns a similarity function, which takes two images as input and expresses how similar they are. This will be useful because we may not have access to many pictures of the same person in different situations.

This approach works well with Siamese networks. Of course, that only applies if we decide to implement the model (the solution to our problem) from scratch.
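
A minimal sketch of the similarity-function idea follows. Both images pass through the same embedding function (the "Siamese" part), and the distance between the two embeddings is compared with a threshold. The embedding here is a fixed random linear map standing in for a trained network, and the threshold of 0.5 is arbitrary; both are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for a trained network: a fixed random linear map.
WEIGHTS = rng.normal(size=(16, 64))


def embed(image):
    """Map a flattened 64-pixel image to a unit-length 16-d embedding."""
    vector = WEIGHTS @ image
    return vector / np.linalg.norm(vector)


def distance(image_a, image_b):
    """Both inputs go through the same embed function with shared weights."""
    return float(np.linalg.norm(embed(image_a) - embed(image_b)))


def same_person(image_a, image_b, threshold=0.5):
    """Decide 'same identity' when the embeddings are close enough."""
    return distance(image_a, image_b) < threshold


reference = rng.normal(size=64)
near_copy = reference + 0.01 * rng.normal(size=64)  # slightly perturbed image
print(same_person(reference, near_copy))
```

Because the decision depends only on a pairwise distance, a new person can be enrolled with a single reference image and no retraining of the classifier.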

behroozomidvar commented 3 years ago

Please also provide a comparison between the geometric and eigenfaces methods regarding the following points:

behroozomidvar commented 3 years ago

@behroozomidvar I've recently read about an approach called "One Shot Learning", in which the network does not learn to classify an image directly into one of the output classes; rather, it learns a similarity function, which takes two images as input and expresses how similar they are. This will be useful because we may not have access to many pictures of the same person in different situations.

This approach works well with Siamese networks. Of course, that only applies if we decide to implement the model (the solution to our problem) from scratch.

Nice initiative. But as you know, one ideal of this project is to be fast and commodity-based, so we try to avoid writing from scratch as much as possible.

behzadshomali commented 3 years ago

@behroozomidvar from a quick review, I gathered that the Eigenfaces method doesn't work robustly in practice, especially when the person is wearing glasses or the background is highly textured.

From a quick search on GitHub, I found several repositories implementing face recognition algorithms using Eigenfaces. On the other hand, I couldn't find repos that specifically use geometric methods.

But in my humble opinion, since we want to be fast and commodity-based, we don't need to spend time figuring out the differences between these methods; we can look for available libraries and choose the best one based on its efficiency.

behroozomidvar commented 3 years ago

So the dilemma is that eigenfaces is less efficient and geometric is less available. Right?

But in my humble opinion, since we want to be fast and commodity-based, we don't need to spend time figuring out the differences between these methods; we can look for available libraries and choose the best one based on its efficiency.

I agree with this. But what does it entail? What would be the next step for that?

behzadshomali commented 3 years ago

@behroozomidvar actually there is no dilemma, because we don't have to choose between Eigenfaces and geometric methods only. They are two main groups of algorithms, and we don't really need to care about them directly, since the available libraries use them as their backbone. So the only thing we should care about is how well the various libraries work: find out their pros and cons and then make our final decision.

As our next step, I can search over GitHub and other open-source platforms to:

  1. figure out which method/library people most commonly use
  2. provide some information about well-known libraries
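
One way to compare candidate libraries on equal terms is a small harness that wraps each one as a callable and measures accuracy and time per image on the same labeled samples. The sketch below uses two dummy recognizers and toy "images" as placeholders; no real library is called, and all names are hypothetical.

```python
import time


def evaluate(recognizer, samples):
    """recognizer: callable image -> predicted ID; samples: list of (image, true_id)."""
    start = time.perf_counter()
    correct = sum(1 for image, true_id in samples if recognizer(image) == true_id)
    elapsed = time.perf_counter() - start
    return {
        "accuracy": correct / len(samples),
        "seconds_per_image": elapsed / len(samples),
    }


# Dummy stand-ins for wrappers around real libraries (assumptions).
def always_one(image):
    return 1


def parity(image):
    return 1 if sum(image) % 2 else 2


# Toy labeled samples: (image, true ID). Real runs would use face images.
samples = [([1, 0, 0], 1), ([1, 1, 0], 2), ([1, 1, 1], 1), ([0, 0, 0], 2)]

candidates = {"always_one": always_one, "parity": parity}
results = {name: evaluate(fn, samples) for name, fn in candidates.items()}
for name, metrics in results.items():
    print(name, metrics)
```

Wrapping each library behind the same `image -> ID` interface keeps the comparison independent of which algorithm the library uses internally.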

behroozomidvar commented 3 years ago

Great. Please create appropriate issues for the aforementioned steps.

behzadshomali commented 3 years ago

Sure. Meanwhile, we have covered almost all of the aforementioned steps in #2.