peteryuX / arcface-tf2

Unofficial ArcFace implementation in TensorFlow 2.0+ (ResNet50, MobileNetV2). "ArcFace: Additive Angular Margin Loss for Deep Face Recognition", published in CVPR 2019. With Colab.
MIT License

Use pretrained ResNet50 model for Face Recognition on my own dataset #17

Closed iamrishab closed 4 years ago

iamrishab commented 4 years ago

I want to build a face recognizer using the pretrained models provided in this repository. I am currently facing an issue with the distance threshold: in most cases, the distances between different faces come out very small, whereas the distances between images of the same face come out large. My questions are:

  1. Am I calculating the embedding correctly?
  2. How should face comparison be done in order to reach LFW-level accuracy?

I have already referred to #6 and #8 but couldn't come up with a concrete solution.

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import cv2
import numpy as np
import tensorflow as tf

from .modules.models import ArcFaceModel
from .modules.utils import set_memory_growth, load_yaml, l2_norm

class ArcFaceResNet50:
    def __init__(self):
        set_memory_growth()
        self.cfg = load_yaml(os.path.join(os.path.dirname(os.path.abspath(__file__)),
                                          './configs/arc_res50.yaml'))

        self.model = ArcFaceModel(size=self.cfg['input_size'],
                                  backbone_type=self.cfg['backbone_type'],
                                  training=False)

        ckpt_path = tf.train.latest_checkpoint(os.path.join(os.path.dirname(os.path.abspath(__file__)),
                                                            './checkpoints/' + self.cfg['sub_name']))
        if ckpt_path is not None:
            print("[*] load ckpt from {}".format(ckpt_path))
            self.model.load_weights(ckpt_path)
        else:
            print("[*] Cannot find ckpt from {}.".format(ckpt_path))
            exit()

    def get_embeddings(self, frame_rgb, bounding_boxes):
        faces = []
        for x1, y1, x2, y2 in bounding_boxes:
            # Crop the face region and resize it to the model's input size.
            face_patch = frame_rgb[y1:y2, x1:x2, :]
            resized = cv2.resize(face_patch,
                                 (self.cfg['input_size'], self.cfg['input_size']),
                                 interpolation=cv2.INTER_AREA)
            # Re-scale pixel values from [0, 255] to [0, 1].
            faces.append(resized.astype(np.float32) / 255.)
        # np.stack yields a batch of shape (N, H, W, 3), ready for the model.
        faces = np.stack(faces)
        # Run prediction and L2-normalize the output embeddings.
        embeddings = l2_norm(self.model(faces))
        return embeddings
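
For reference, here is how I call it (detect_faces is just a placeholder for the face detector I use to get pixel-coordinate (x1, y1, x2, y2) boxes, and the image path is made up):

# Hypothetical usage; detect_faces() stands in for any face detector.
recognizer = ArcFaceResNet50()
frame_bgr = cv2.imread('person.jpg')                    # OpenCV loads BGR
frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)  # model expects RGB
boxes = detect_faces(frame_rgb)                         # [(x1, y1, x2, y2), ...]
embeddings = recognizer.get_embeddings(frame_rgb, boxes)
print(embeddings.shape)  # (num_faces, embedding_dim)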

Thanks in advance!

peteryuX commented 4 years ago

Hi @iamrishab, nice to hear from you~ I'll briefly explain what I know, and I hope it can help you.

  1. The calculation of the embedding vector should look like the following (I think the code you provided doesn't have any obvious mistakes): crop an RGB-format [0~255] face image with an alignment method -> re-scale to [0~1] -> feed it into the pretrained model -> normalize the output embedding vector.
  2. On the LFW dataset, a threshold of 1.32 on the L2 distance between embedding vectors is used to distinguish identities (see the sketch below).
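
A minimal sketch of that comparison step (the names are illustrative, and it assumes both embeddings were already L2-normalized, as get_embeddings above does):

import numpy as np

THRESHOLD = 1.32  # L2-distance threshold for LFW mentioned above

def is_same_identity(emb_a, emb_b, threshold=THRESHOLD):
    # Both embeddings are assumed to be L2-normalized unit vectors.
    dist = np.linalg.norm(np.asarray(emb_a) - np.asarray(emb_b))
    return dist < threshold, dist

# Toy usage with random unit vectors standing in for real embeddings:
rng = np.random.default_rng(0)
emb_a = rng.normal(size=512)
emb_a /= np.linalg.norm(emb_a)
emb_b = rng.normal(size=512)
emb_b /= np.linalg.norm(emb_b)
same, dist = is_same_identity(emb_a, emb_b)
print("distance = {:.3f}, same identity: {}".format(dist, same))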
iamrishab commented 4 years ago

Hi @peteryuX, thank you for your response. I will surely try this out and get back to you in case of any further queries.

lixianwa commented 4 years ago

Hi @iamrishab, I have the same problem. How did you solve it?