Regarding the identification of the unregistered face

arsfutura / face-recognition

A framework for creating and using a Face Recognition system.

BSD 3-Clause "New" or "Revised" License


Regarding the identification of the unregistered face #26

Closed yong2khoo-lm closed 3 years ago

yong2khoo-lm commented 3 years ago

@ldulcic

Hi~ I have followed the suggestions from my previous posts and have obtained quite decent results :) However, the question of the 'unregistered face' puzzles me.

Currently, with around 30 face samples per person, a test with around 25 persons produces very high accuracy.

But the problem comes when there is a 26th unregistered person. He/She may be mis-recognized as one of the 25 persons.

From my understanding, any classification model would have a similar problem with an unregistered face (correct me if I am wrong). So, after the classification step, I added a check against the registered embeddings: if most of them exceed a Euclidean distance threshold, the face is classified as 'unknown'.
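A minimal sketch of such a distance check, assuming embeddings are plain lists of floats and that `registered` holds the stored embeddings being compared against. The function name and the `threshold` and `min_fraction` values are hypothetical and would need tuning on real data:

```python
import math

def is_unknown(embedding, registered, threshold=1.0, min_fraction=0.5):
    """Return True if more than `min_fraction` of the registered embeddings
    are farther than `threshold` (Euclidean distance) from `embedding`."""
    if not registered:
        return True  # nothing to compare against
    far = sum(1 for ref in registered if math.dist(embedding, ref) > threshold)
    return far / len(registered) > min_fraction

# Toy 2-D example; real face embeddings have many more dimensions.
registered = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
print(is_unknown([0.05, 0.05], registered))  # close to every reference
print(is_unknown([3.0, 4.0], registered))    # far from every reference
```

`math.dist` requires Python 3.8 or newer.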

But the problem is that this kind of 'checking' does not always give desirable results.

Do you have better ideas?

ldulcic commented 3 years ago

Hi @yong2khoo-lm!

Congrats, you just discovered the hardest problem of face recognition! 🙂 I hate to disappoint you, but there is no simple solution. This is actually an open research problem.

This kind of classification is called open-set classification. A few links describing the problem: "Open-set classification" and "Recent Advances in Open Set Recognition: A Survey". There are all kinds of fancy new models for open-set recognition. I didn't really have time to test most of them. I tested SVM and One-Class SVM, and they didn't perform well at all.

What do I do to avoid this problem? I compiled a good test set with a lot of unknown faces, then I fix my threshold at 80% and find the optimal value for the Softmax C hyperparameter. This is not ideal, but it works really well for my face recognition project. This is a really hard problem IMO, and it's impossible to avoid it. I could probably get better performance with some open-set classifier, but I just don't have enough time to test them out. If you happen to try them out at some point, please let me know how it went.
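The tuning procedure above (fix the confidence threshold, then pick C by its score on a test set that includes unknown faces) could look roughly like this. It is only a sketch, not the project's actual code: `open_set_accuracy`, `pick_best_c`, and the toy predictions are hypothetical, and training one model per C value is assumed to have happened elsewhere.

```python
def open_set_accuracy(predictions, threshold=0.8):
    """predictions: list of (predicted_label, confidence, true_label);
    true_label is 'unknown' for unregistered faces."""
    correct = 0
    for label, confidence, truth in predictions:
        # Below the threshold the face is reported as unknown.
        decided = label if confidence >= threshold else "unknown"
        correct += decided == truth
    return correct / len(predictions)

def pick_best_c(predictions_by_c, threshold=0.8):
    """Return the C value whose predictions score best at the fixed threshold."""
    return max(predictions_by_c,
               key=lambda c: open_set_accuracy(predictions_by_c[c], threshold))

# Toy data: test-set predictions from two hypothetical models, one per C value.
predictions_by_c = {
    0.1: [("alice", 0.95, "alice"), ("bob", 0.85, "bob"), ("alice", 0.60, "unknown")],
    10.0: [("alice", 0.99, "alice"), ("bob", 0.97, "bob"), ("alice", 0.92, "unknown")],
}
best_c = pick_best_c(predictions_by_c)
print(best_c)
```

In this toy data the model with the larger C is overconfident on the unknown face, so the smaller C wins at the fixed 80% threshold. Plain accuracy is used here for brevity; a metric that weighs false accepts separately may suit a real deployment better.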

crypt0miester commented 3 years ago

Hey @yong2khoo-lm,

What I did was check whether the confidence is above a certain value, e.g. 75% (0.75).

In arsfutura's system, that value is face.top_prediction.confidence.

I am still early in development and have not tested it with a large number of classes.

Here is my code:

import base64
import time

import joblib
from PIL import Image

# util is a helper module (not shown) that provides data_uri_to_cv2_img;
# newest_model (model path) and data_uri (base64 image) are defined elsewhere

start = time.perf_counter()
# load model
with open(newest_model, 'rb') as f:
    face_recogniser = joblib.load(f)

# load image, in this case I load it in base64 encoded bytes
data_uri = base64.b64decode(data_uri)

# convert from uri to image
data_image = util.data_uri_to_cv2_img(data_uri)
img = Image.fromarray(data_image)
faces = face_recogniser(img)

for face in faces:
    # read the label first so it is also available in the else branch
    person = face.top_prediction.label
    confidence = face.top_prediction.confidence
    if confidence > 0.75:
        text = "%s %.2f%%" % (person, confidence * 100)
        print(f'found someone - {text}. elapsed {time.perf_counter() - start}')
    else:
        # below the threshold: treat the face as unknown
        print(f'bad confidence on this guy {person}')
        print('unknown', f'elapsed {time.perf_counter() - start}')

yong2khoo-lm commented 3 years ago

@ldulcic Thanks for the big picture and for sharing :) @crypt0miester Thanks for your sample code :)

Guess it will take me some time to look into this. I shall close this for now, and will comment here when I am ready~