Issue with mediapipe when loading frames from a video #1261

Closed kcmcveigh closed 1 week ago

kcmcveigh commented 1 week ago

DeepFace's version

0.0.92


Python version

3.10


Operating System

macOS

Reproducible example

import cv2
from deepface import DeepFace

# Path to the video file
video_path = '../example_data/'
# Process the video
cap = cv2.VideoCapture(video_path)

# Define the codec and create VideoWriter object
fourcc = cv2.VideoWriter_fourcc(*'mp4v')

frame_idx = 0
while cap.isOpened():
    # Read the next frame; ret is False once the video is exhausted
    ret, frame = cap.read()
    if not ret:
        break

    # Convert the BGR image to RGB
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # test iterations
    bgr_wo_align = DeepFace.extract_faces(frame, detector_backend='mediapipe', enforce_detection=False, align=False)
    bgr_aligned = DeepFace.extract_faces(frame, detector_backend='mediapipe', enforce_detection=False, align=True)

    rgb_wo_align = DeepFace.extract_faces(rgb_frame, detector_backend='mediapipe', enforce_detection=False, align=False)
    rgb_aligned = DeepFace.extract_faces(rgb_frame, detector_backend='mediapipe', enforce_detection=False, align=True)

    print('bgr frame wo alignment does not detect', bgr_wo_align[0]['confidence'])
    print('bgr frame w alignment does not detect face', bgr_aligned[0]['confidence'])
    print('rgb frame wo alignment detects face', rgb_wo_align[0]['confidence'])
    print('rgb frame w alignment does not detect face', rgb_aligned[0]['confidence'])
    frame_idx += 1

cap.release()

Relevant Log Output

bgr frame wo alignment does not detect 0
bgr frame w alignment does not detect face 0
rgb frame wo alignment detects face 0.74
rgb frame w alignment does not detect face 0

Expected Result

I would expect extract_faces to work with a BGR frame, as the docstring suggests (although mediapipe seems to expect an RGB frame). I also wouldn't expect alignment to hurt performance to this extent.

What happened instead?

extract_faces only seems to work when the frame is rgb and alignment is turned off

Additional Info

serengil commented 1 week ago

When you feed a numpy array, it does not matter whether it is RGB or BGR: we pass it to the backend detector as is.

So, if you have any trouble, you should raise this issue in mediapipe's repo instead of deepface's.

kcmcveigh commented 1 week ago

I'm still a bit confused, as mediapipe seems to expect RGB frames:

While the last line of the linked utility function returns BGR frames?

return img_obj_bgr, img

Shouldn't we pass rgb frames to mediapipe? Thanks for the help!
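For readers following the BGR versus RGB question, here is a minimal sketch of what the two layouts actually differ in: only the order of the last (channel) axis. The helper name `swap_bgr_rgb` is hypothetical and not part of deepface or mediapipe; it uses plain NumPy slicing rather than any library-specific call.

```python
import numpy as np


def swap_bgr_rgb(image: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of an H x W x 3 image (BGR <-> RGB).

    Hypothetical helper for illustration only.
    """
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an H x W x 3 image")
    # Reversing the last axis yields a view with negative strides;
    # copy it into a contiguous array, which many libraries expect.
    return np.ascontiguousarray(image[:, :, ::-1])


# A 1 x 2 image in BGR order: one pure blue pixel, one pure red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = swap_bgr_rgb(bgr)
```

Applying the swap twice returns the original array, which is why the same helper serves both directions.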

serengil commented 1 week ago

You never reach that line if the input is a numpy array.

kcmcveigh commented 1 week ago

Ahh, I see now, thank you! So is it fair to say that extract_faces assumes numpy arrays are passed in the channel order (RGB or BGR) the detector expects, and that any conversion should be done before passing a numpy array to extract_faces?

serengil commented 1 week ago

if numpy array is passed, yes.

I cannot tell whether the given image is BGR or RGB.
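Since the channel order cannot be inferred from the array itself, one caller-side option is to state it explicitly and convert before calling the detector. The sketch below is an assumption-laden illustration, not deepface's behavior: `prepare_frame` and `EXPECTED_ORDER` are hypothetical names, and the only mapping taken from this thread is that mediapipe expects RGB. Unlisted backends fall back to BGR, following the extract_faces docstring.

```python
import numpy as np

# Assumption: only the mediapipe entry comes from this thread.
# Backends not listed here fall back to BGR, per the docstring.
EXPECTED_ORDER = {"mediapipe": "RGB"}


def prepare_frame(frame: np.ndarray, source_order: str, backend: str) -> np.ndarray:
    """Return the frame in the channel order the backend is assumed to expect."""
    if source_order not in ("BGR", "RGB"):
        raise ValueError("source_order must be 'BGR' or 'RGB'")
    target_order = EXPECTED_ORDER.get(backend, "BGR")
    if source_order == target_order:
        # Already in the expected order: return the input unchanged.
        return frame
    # Otherwise reverse the channel axis and return a contiguous copy.
    return np.ascontiguousarray(frame[:, :, ::-1])
```

With frames read by cv2 (BGR by default), the result of `prepare_frame(frame, "BGR", "mediapipe")` would then be what gets passed to `DeepFace.extract_faces(..., detector_backend='mediapipe')`.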

kcmcveigh commented 1 week ago

So in my example I load frames with cv2 from a video. By default they're loaded as BGR. In the sample code, when these frames are passed to extract_faces, detection fails because mediapipe expects RGB. When I convert the frames to RGB in the sample code above, extract_faces works, as this is what mediapipe expects. Interestingly, if align is True the function fails again.

If this is the intended functionality, then I think this docstring on the extract_faces function should be updated?

Args:
    img_path (str or np.ndarray): Path to the first image. Accepts exact image path as a string, numpy array (BGR), or base64 encoded images.