serengil / deepface

A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python

Age Prediction being either 0 or 90+ #777

Closed DrPlanecraft closed 1 year ago

DrPlanecraft commented 1 year ago

I am currently working on a school project and have compacted deepface's code down to only what I need to predict Age, Gender and Race.

My current code works fine for Gender and Race, but the predicted age is either 0 or 90+ most of the time. Please advise on what I am doing wrong, and thank you for your time! (Source code is pasted below; for a demonstration video, click here.)

import os
import cv2
import sys
import time
import math
import gdown
import warnings
import numpy as np
import pandas as pd
from numba import jit
import seaborn as sns
from PIL import Image
import tensorflow as tf
from glob import glob, iglob
from tqdm.notebook import tqdm
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix

tf_version = int(tf.__version__.split(".", maxsplit=1)[0])

if tf_version == 1:
    from keras.models import Model, Sequential
    from keras.layers import (
        Convolution2D,
        ZeroPadding2D,
        MaxPooling2D,
        Flatten,
        Dropout,
        Activation,
    )
else:
    from tensorflow.keras.models import Model, Sequential
    from tensorflow.keras.layers import (
        Convolution2D,
        ZeroPadding2D,
        MaxPooling2D,
        Flatten,
        Dropout,
        Activation,
    )

class VGGface:
    def __init__(self):
        # os.path.join avoids the invalid "\d" escape sequence and the Windows-only separator
        self.home = os.path.join(os.getcwd(), "deepface")

        print("Loading VGG Face Model . . .")
        self.VGGfaceModel = Sequential()
        self.VGGfaceModel.add(ZeroPadding2D((1, 1), input_shape=(224, 224, 3)))
        self.VGGfaceModel.add(Convolution2D(64, (3, 3), activation="relu"))
        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(64, (3, 3), activation="relu"))
        self.VGGfaceModel.add(MaxPooling2D((2, 2), strides=(2, 2)))

        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(128, (3, 3), activation="relu"))
        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(128, (3, 3), activation="relu"))
        self.VGGfaceModel.add(MaxPooling2D((2, 2), strides=(2, 2)))

        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(256, (3, 3), activation="relu"))
        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(256, (3, 3), activation="relu"))
        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(256, (3, 3), activation="relu"))
        self.VGGfaceModel.add(MaxPooling2D((2, 2), strides=(2, 2)))

        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(512, (3, 3), activation="relu"))
        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(512, (3, 3), activation="relu"))
        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(512, (3, 3), activation="relu"))
        self.VGGfaceModel.add(MaxPooling2D((2, 2), strides=(2, 2)))

        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(512, (3, 3), activation="relu"))
        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(512, (3, 3), activation="relu"))
        self.VGGfaceModel.add(ZeroPadding2D((1, 1)))
        self.VGGfaceModel.add(Convolution2D(512, (3, 3), activation="relu"))
        self.VGGfaceModel.add(MaxPooling2D((2, 2), strides=(2, 2)))

        self.VGGfaceModel.add(Convolution2D(4096, (7, 7), activation="relu"))
        self.VGGfaceModel.add(Dropout(0.5))
        self.VGGfaceModel.add(Convolution2D(4096, (1, 1), activation="relu"))
        self.VGGfaceModel.add(Dropout(0.5))
        self.VGGfaceModel.add(Convolution2D(2622, (1, 1)))
        self.VGGfaceModel.add(Flatten())
        self.VGGfaceModel.add(Activation("softmax"))

        # -----------------------------------

        output = os.path.join(self.home, ".deepface", "weights", "vgg_face_weights.h5")

        if not os.path.isfile(output):
            print("vgg_face_weights.h5 will be downloaded...")
            os.makedirs(os.path.dirname(output), exist_ok=True)  # gdown will not create missing folders
            gdown.download("https://github.com/serengil/deepface_models/releases/download/v1.0/vgg_face_weights.h5", output, quiet=False)

        # -----------------------------------

        self.VGGfaceModel.load_weights(output)

        # -----------------------------------

        # the descriptor takes the penultimate layer's output as the face
        # embedding; the final softmax classification layer is dropped
        self.vgg_face_descriptor = Model(inputs=self.VGGfaceModel.layers[0].input, outputs=self.VGGfaceModel.layers[-2].output)

    def AgeDetectionModel(self):

        model = self.VGGfaceModel

        # --------------------------

        classes = 101
        base_model_output = Convolution2D(classes, (1, 1), name="predictions")(model.layers[-4].output)
        base_model_output = Flatten()(base_model_output)
        base_model_output = Activation("softmax")(base_model_output)

        # --------------------------

        age_model = Model(inputs=model.input, outputs=base_model_output)

        # --------------------------

        # load weights

        weights_path = os.path.join(self.home, ".deepface", "weights", "age_model_weights.h5")

        if not os.path.isfile(weights_path):
            print("age_model_weights.h5 will be downloaded...")
            os.makedirs(os.path.dirname(weights_path), exist_ok=True)
            gdown.download("https://github.com/serengil/deepface_models/releases/download/v1.0/age_model_weights.h5", weights_path, quiet=False)

        age_model.load_weights(weights_path)

        return age_model

    def GenderDetectionModel(self):
        model = self.VGGfaceModel

        # --------------------------

        classes = 2
        base_model_output = Convolution2D(classes, (1, 1), name="predictions")(model.layers[-4].output)
        base_model_output = Flatten()(base_model_output)
        base_model_output = Activation("softmax")(base_model_output)

        # --------------------------

        gender_model = Model(inputs=model.input, outputs=base_model_output)

        # --------------------------

        # load weights

        weights_path = os.path.join(self.home, ".deepface", "weights", "gender_model_weights.h5")

        if not os.path.isfile(weights_path):
            print("gender_model_weights.h5 will be downloaded...")
            os.makedirs(os.path.dirname(weights_path), exist_ok=True)
            gdown.download("https://github.com/serengil/deepface_models/releases/download/v1.0/gender_model_weights.h5", weights_path, quiet=False)

        gender_model.load_weights(weights_path)

        return gender_model

    def RaceDetectionModel(self):
        model = self.VGGfaceModel

        # --------------------------

        classes = 6
        base_model_output = Convolution2D(classes, (1, 1), name="predictions")(model.layers[-4].output)
        base_model_output = Flatten()(base_model_output)
        base_model_output = Activation("softmax")(base_model_output)

        # --------------------------

        race_model = Model(inputs=model.input, outputs=base_model_output)

        # --------------------------

        # load weights

        weights_path = os.path.join(self.home, ".deepface", "weights", "race_model_single_batch.h5")

        if not os.path.isfile(weights_path):
            print("race_model_single_batch.h5 will be downloaded...")
            os.makedirs(os.path.dirname(weights_path), exist_ok=True)
            gdown.download("https://github.com/serengil/deepface_models/releases/download/v1.0/race_model_single_batch.h5", weights_path, quiet=False)

        race_model.load_weights(weights_path)

        return race_model

    def results2StringLabel_DeepFace(self, genderArray:np.ndarray, raceArray:np.ndarray, ageArray:np.ndarray):
        genderLabels = ("woman", "man")
        raceLabels = ("asian", "indian", "black", "white", "middle eastern", "latino hispanic")

        # The models output softmax probabilities, not booleans, so take the
        # most likely class. (Looping with "if boolean" matches every non-zero
        # probability and ends up keeping the last label checked.)
        gender = genderLabels[int(np.argmax(genderArray[0]))]
        race = raceLabels[int(np.argmax(raceArray[0]))]

        # The age model outputs 101 softmax bins for ages 0-100; the apparent
        # age is the expected value over those bins
        output_indexes = np.arange(0, 101)
        apparent_age = np.sum(ageArray * output_indexes)
        age = int(apparent_age)

        return age, gender, race

    def loadModels(self):

        print("Loading Age Detection Model")
        self.ageDetection = self.AgeDetectionModel()
        print("Loading Gender Detection Model")
        self.genderDetection = self.GenderDetectionModel()
        print("Loading Race Detection Model")
        self.raceDetection = self.RaceDetectionModel()

        print("Model Loading Complete!")

    def predict(self, image:np.ndarray):
        image = cv2.resize(image, (224,224))
        image = np.reshape(image, (-1,224,224,3))
        # deepface scales pixel values into [0, 1] before prediction; skipping
        # this normalization is the likely cause of the 0 / 90+ ages (see the
        # maintainer's reply below)
        image = image.astype("float32") / 255.0

        ageResult = self.ageDetection.predict(image)
        genderResult = self.genderDetection.predict(image)
        raceResult = self.raceDetection.predict(image)

        age, gender, race = self.results2StringLabel_DeepFace(genderArray=genderResult,\
                                                              raceArray=raceResult,\
                                                                ageArray=ageResult)

        return age, gender, race

# OpenCV backend for deepface Face Detection
class OpenCV_FaceDetector:
    def __init__(self) -> None:
        # Get the OpenCV install path so the bundled haarcascade files can be
        # located; join with os.path.sep so this is not Windows-only
        opencv_home = cv2.__file__
        folders = opencv_home.split(os.path.sep)[0:-1]

        self.opencv_path = os.path.sep.join(folders)

        # Initialize the detector dict and build the cascades up front to save on processing time later (hopefully)
        self.detector = {}
        self.detector["face_detector"] = self.build_cascade("haarcascade")
        self.detector["eye_detector"] = self.build_cascade("haarcascade_eye")

    def build_cascade(self, model_name="haarcascade"):

        if model_name == "haarcascade":
            face_detector_path = os.path.join(self.opencv_path, "data", "haarcascade_frontalface_default.xml")
            if not os.path.isfile(face_detector_path):
                raise ValueError(
                    f"Confirm that opencv is installed in your environment! Expected path {face_detector_path} does not exist."
                )
            detector = cv2.CascadeClassifier(face_detector_path)

        elif model_name == "haarcascade_eye":
            eye_detector_path = os.path.join(self.opencv_path, "data", "haarcascade_eye.xml")
            if not os.path.isfile(eye_detector_path):
                raise ValueError(
                    f"Confirm that opencv is installed in your environment! Expected path {eye_detector_path} does not exist."
                )
            detector = cv2.CascadeClassifier(eye_detector_path)

        else:
            raise ValueError(f"unimplemented model_name for build_cascade - {model_name}")

        return detector

    def detect_face(self, img, align=True):
        responses = []

        detected_face = None
        img_region = [0, 0, img.shape[1], img.shape[0]]

        faces = []
        try:
            # faces = detector["face_detector"].detectMultiScale(img, 1.3, 5)

            # note that, by design, opencv's haarcascade scores are > 0 but not capped at 1
            faces, _, scores = self.detector["face_detector"].detectMultiScale3(
                img, 1.1, 10, outputRejectLevels=True
            )
        except Exception:
            # detectMultiScale3 can raise on empty or malformed input; treat that as "no faces"
            pass

        if len(faces) > 0:
            for (x, y, w, h), confidence in zip(faces, scores):
                detected_face = img[int(y) : int(y + h), int(x) : int(x + w)]

                if align:
                    detected_face = self.align_face(self.detector["eye_detector"], detected_face)

                img_region = [x, y, w, h]

                responses.append((detected_face, img_region, confidence))

        return responses

    def align_face(self,eye_detector, img):
        detected_face_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # eye detector expects gray scale image

        # eyes = eye_detector.detectMultiScale(detected_face_gray, 1.3, 5)
        eyes = eye_detector.detectMultiScale(detected_face_gray, 1.1, 10)

        # ----------------------------------------------------------------

        # opencv's eye detection module is not strong; it might find more than 2 eyes!
        # besides, it returns the eyes in a different order on each call (issue 435).
        # this is important because opencv is the default detector and ssd also uses it.
        # keep the two largest eyes. Thanks to @thelostpeace

        eyes = sorted(eyes, key=lambda v: abs(v[2] * v[3]), reverse=True)

        # ----------------------------------------------------------------

        if len(eyes) >= 2:
            # decide left and right eye

            eye_1 = eyes[0]
            eye_2 = eyes[1]

            if eye_1[0] < eye_2[0]:
                left_eye = eye_1
                right_eye = eye_2
            else:
                left_eye = eye_2
                right_eye = eye_1

            # -----------------------
            # find center of eyes
            left_eye = (int(left_eye[0] + (left_eye[2] / 2)), int(left_eye[1] + (left_eye[3] / 2)))
            right_eye = (int(right_eye[0] + (right_eye[2] / 2)), int(right_eye[1] + (right_eye[3] / 2)))
            img = self.alignment_procedure(img, left_eye, right_eye)
        return img  # return img anyway

    def alignment_procedure(self, img, left_eye, right_eye):

        # this function aligns given face in img based on left and right eye coordinates

        left_eye_x, left_eye_y = left_eye
        right_eye_x, right_eye_y = right_eye

        # -----------------------
        # find rotation direction

        if left_eye_y > right_eye_y:
            point_3rd = (right_eye_x, left_eye_y)
            direction = -1  # rotate clockwise
        else:
            point_3rd = (left_eye_x, right_eye_y)
            direction = 1  # rotate counter-clockwise

        # -----------------------
        # find length of triangle edges

        a = self.findEuclideanDistance(np.array(left_eye), np.array(point_3rd))
        b = self.findEuclideanDistance(np.array(right_eye), np.array(point_3rd))
        c = self.findEuclideanDistance(np.array(right_eye), np.array(left_eye))

        # -----------------------

        # apply cosine rule

        if b != 0 and c != 0:  # guard against division by zero in the cos_a calculation

            cos_a = (b * b + c * c - a * a) / (2 * b * c)
            angle = np.arccos(cos_a)  # angle in radian
            angle = (angle * 180) / math.pi  # radian to degree

            # -----------------------
            # rotate base image

            if direction == -1:
                angle = 90 - angle

            img = Image.fromarray(img)
            img = np.array(img.rotate(direction * angle))

        # -----------------------

        return img  # return img anyway

    def findEuclideanDistance(self, source_representation, test_representation):
        if isinstance(source_representation, list):
            source_representation = np.array(source_representation)

        if isinstance(test_representation, list):
            test_representation = np.array(test_representation)

        euclidean_distance = source_representation - test_representation
        euclidean_distance = np.sum(np.multiply(euclidean_distance, euclidean_distance))
        euclidean_distance = np.sqrt(euclidean_distance)
        return euclidean_distance

The code below interfaces with my class; the code above lives in a file named operations.py.

from operations import VGGface, OpenCV_FaceDetector
from glob import glob
import numpy as np
import cv2

VGGface = VGGface()
VGGface.loadModels()

faceDetector = OpenCV_FaceDetector()
"""
testImage = cv2.imread("UTKFace\\InTheWild_part1\\10_0_0_20170103233459275.jpg")

faces = faceDetector.detect_face(img=testImage,align=True)

# [<Detection Index>][<0: Detection Image, 1: Detection Coordinates>]
testImage1 = faces[0][0]

print(testImage)
print(type(testImage))

age, gender, race = VGGface.predict(testImage1)

print(age,gender, race)

# Using cv2.imshow() method
# Displaying the image
cv2.imshow("test", testImage)

# waits for user to press any key
# (this is necessary to avoid Python kernel form crashing)
cv2.waitKey(0)

# closing all open windows
cv2.destroyAllWindows()
"""
print(glob("Videos\\DEMO_*_*_NG.mp4"))

for file in glob("Videos\\DEMO_*_*_NG.mp4"):
    # Create an object to read from the file
    VideoFile_path = file
    video = cv2.VideoCapture(file)

    # Check that the file opened successfully
    if not video.isOpened():
        print("Error reading video file")

    # The writer needs the frame size; the properties come back
    # as floats, so convert them to integers
    size = (int(video.get(cv2.CAP_PROP_FRAME_WIDTH)), int(video.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    print(size)

    # The VideoWriter object below writes the processed frames
    # to a '*_preds.mp4' file at the size defined above
    result = cv2.VideoWriter(f'{file[:-4]}_preds.mp4',
                             cv2.VideoWriter_fourcc(*'mp4v'),
                             20.0, size)

    while True:
        ret, frame = video.read()

        if ret:

            # Detect the faces in the current frame
            processingFrame = faceDetector.detect_face(img=frame,align=True)

            # [<Detection Index>][<0: Detection Image, 1: Detection Coordinates>]
            for detection in processingFrame:
                predictedAge, predictedGender, predictedRace = VGGface.predict(detection[0])

                text_size, _ = cv2.getTextSize(f"Age: {predictedAge} | Gender: {predictedGender} | Race: {predictedRace}", cv2.FONT_HERSHEY_SIMPLEX, 1, 2)
                rectangle_width = text_size[0] + 10
                rectangle_height = text_size[1] + 40

                print(detection[1])

                # Draw the face bounding box: (x, y) is the top-left corner and
                # (x + w, y + h) is the bottom-right corner; the colour tuple is
                # (B, G, R) and the final argument is the line thickness
                frame = cv2.rectangle(frame, (detection[1][0], detection[1][1]), (detection[1][0] + detection[1][2], detection[1][1] + detection[1][3]), (255, 0, 0), 2)

                # Draw a filled background box behind the prediction text
                frame = cv2.rectangle(frame, (detection[1][0], detection[1][1]),\
                                    (detection[1][0] + rectangle_width, detection[1][1] - rectangle_height), (255, 0, 0), -1)

                # Displaying Text
                frame = cv2.putText(frame, f"Age: {predictedAge} | Gender: {predictedGender} | Race: {predictedRace}",\
                                    (detection[1][0], detection[1][1] - 20),\
                                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)

            # Display the current frame
            cv2.imshow('Frame', frame)

            # Write the frame into the output file
            result.write(frame)

            # Press 's' on the keyboard to stop the process
            if cv2.waitKey(1) & 0xFF == ord('s'):
                break

        # Break the loop
        else:
            break

    # When everything is done, release the
    # video capture and video writer objects
    video.release()
    result.release()

    # Close all open windows
    cv2.destroyAllWindows()

    print(f"The video {VideoFile_path} was successfully saved")

Any and all help will be appreciated.

serengil commented 1 year ago

You just need to call DeepFace.analyze; it handles preprocessing steps such as normalization. Most probably you are missing those steps. I recommend using this function directly.
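
For example, a minimal call looks something like this ("face.jpg" is a placeholder path; the return is a dict or a list of dicts depending on the installed deepface version):

from deepface import DeepFace

# analyze runs detection, alignment, normalization and the attribute models internally
result = DeepFace.analyze(img_path="face.jpg", actions=["age", "gender", "race"])
print(result)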

DrPlanecraft commented 1 year ago

Thank you for the information; I will re-evaluate how I use DeepFace. For now I would like to use only a section of the library, rather than the whole thing, to keep my project as small as possible. It has been a good learning experience looking through the code and seeing how everything is wrapped and presented.

DrPlanecraft commented 1 year ago

On another note, I would like to ask about the two statements checking the image shape. I see two similar checks with a color-conversion statement between them:

for current_img, current_region, confidence in face_objs:
    if current_img.shape[0] > 0 and current_img.shape[1] > 0:  # here
        if grayscale is True:
            current_img = cv2.cvtColor(current_img, cv2.COLOR_BGR2GRAY)

        # resize and padding
        if current_img.shape[0] > 0 and current_img.shape[1] > 0:  # here

Is there any way the cvtColor call can change the array so that it ends up with shape (0, 0)?
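
For what it is worth, a quick standalone test (a minimal sketch, not deepface code) suggests cvtColor preserves the spatial dimensions:

import cv2
import numpy as np

# BGR2GRAY drops the channel axis but keeps height and width
img = np.zeros((10, 10, 3), dtype=np.uint8)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(gray.shape)  # (10, 10)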

serengil commented 1 year ago

This is mentioned in this issue: https://github.com/serengil/deepface/issues/679

It will be fixed in the next release.

DrPlanecraft commented 1 year ago

Thank you for your help in pointing out where I went wrong; I have now fixed the issue discussed in this thread.

It turns out the NaN was being caused by these three lines inside my code:

image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (224,224))
image = np.reshape(image, (-1,224,224,3))

and by the sanitization of the data.
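
For anyone who lands here later, a sketch of the corrected preprocessing (a minimal sketch assuming deepface's [0, 1] pixel scaling is the normalization serengil referred to; whether the cvtColor call belongs depends on the channel order the weights expect):

image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image = cv2.resize(image, (224, 224))
image = np.reshape(image, (-1, 224, 224, 3))
# scale pixel values to [0, 1] to match deepface's preprocessing
image = image.astype("float32") / 255.0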