oracle / macest

Model Agnostic Confidence Estimator (MACEST) - A Python library for calibrating Machine Learning models' confidence scores
Universal Permissive License v1.0

How to use MACEst for a face similarity problem? #5

Open Ivan-basis opened 2 years ago

Ivan-basis commented 2 years ago

Hi. I'm working on a face recognition problem. We inspect selfie-with-passport pictures and detect the faces in them. The task is to estimate the probability that two faces detected in one image belong to the same person.

The trivial solution is to encode each face into a vector (I use a feature-extraction network that outputs a vector of 512 values for each face) and then calculate the cosine similarity between these vectors. This metric usually gives values around 0.35-0.45 for faces of the same person. When comparing these faces with faces of other people we get lower similarity, as expected, but not much lower.
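For readers following along, the comparison step described above can be sketched roughly as follows. This is only an illustration: the function name and the random vectors are assumptions standing in for real encoder output, and nothing here comes from the actual face encoder or from MACEst.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors; the result lies in [-1, 1]."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)


# Hypothetical stand-ins for two 512-value face embeddings.
# These are random vectors, NOT output of a real face encoder.
rng = np.random.default_rng(0)
emb_selfie = rng.normal(size=512)
emb_passport = rng.normal(size=512)

score = cosine_similarity(emb_selfie, emb_passport)
print(f"cosine similarity: {score:.3f}")
```

With a real encoder, `emb_selfie` and `emb_passport` would be the two embeddings extracted from the same selfie-with-passport image.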

The point is that I want to move from such obscure values of ~0.35 to some kind of probability: ~0.9+ for faces of the same person and significantly lower values for different persons. Obviously, this is not a regular classification problem, because we have a (potentially) infinite number of "classes" (persons). It is closer to a regression problem, but I don't know how to handle it.

At the moment I have a trained face encoder and a database of ~6k face crops from ~4.5k persons. Some persons have more than one face in this database, but some have just one. And each new picture I get almost always contains a new person.

  1. Is MACEst applicable to this problem?
  2. If so, how should I use it?

Any suggestions would be appreciated! Thanks

P.S. I understand that cosine similarity produces values in the -1..1 range, but a simple linear rescaling to the 0..1 range just shifts 0.35 to 0.675, which is not enough.
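The arithmetic in the P.S. can be checked with a short sketch. The function name is hypothetical; the mapping is the plain linear rescaling (s + 1) / 2 that the P.S. refers to. Because the map is linear, it preserves the ordering of scores and halves every gap between them, so it cannot pull same-person and different-person scores further apart.

```python
def rescale_to_unit_interval(similarity):
    """Linearly map a cosine similarity from [-1, 1] onto [0, 1]."""
    if not -1.0 <= similarity <= 1.0:
        raise ValueError("cosine similarity must lie in [-1, 1]")
    return (similarity + 1.0) / 2.0


# The same-person range quoted in the question, 0.35-0.45.
low = rescale_to_unit_interval(0.35)   # about 0.675
high = rescale_to_unit_interval(0.45)  # about 0.725

# A gap of 0.10 between two raw scores becomes a gap of about 0.05.
gap_after = high - low
print(low, high, gap_after)
```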

mrowebot commented 1 year ago

This sounds like a really sparse problem, which I am not sure MACEst would be well suited to at present. We know, for instance, that it doesn't perform as well in sparse text spaces.