google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://mediapipe.dev
Apache License 2.0

Pose not working very well for black people #5091

Closed: ccornelis closed this issue 5 months ago

ccornelis commented 7 months ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

iOS

MediaPipe version

GoogleMLKit/PoseDetectionAccurate 3.2.0 (MediaPipe solution not available yet on iOS)

Bazel version

No response

Solution

Pose

Programming Language and version

Swift

Describe the actual behavior

Recognising poses for white/Caucasian sprinters works well; for black sprinters it's pretty bad.

Describe the expected behaviour

Should work well for black sprinters too.

Standalone code/steps you may have used to try to get what you need

I made an iOS app that uses Pose landmarks for sprinters. Every frame of a short video is passed to Pose (as a stream of images), and for white sprinters the app works pretty well.
On the other hand, a video of black sprinters recorded at the same time, with the same background and lighting, barely works.
I tested the app with 6 different backgrounds and lighting conditions. The result was always the same: good results for white athletes, bad results for black athletes.

Will the upcoming Mediapipe Pose for iOS have better support for people of color?
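
For reference, here is a rough Python sketch of the same per-frame flow using the MediaPipe Tasks PoseLandmarker in video mode (the real app is written in Swift with ML Kit, so the video file and model paths below are only placeholders):

# Rough sketch: run pose detection on every frame of a short clip.
# "sprinter_clip.mp4" and "pose_landmarker.task" are placeholder paths.
import cv2
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

options = vision.PoseLandmarkerOptions(
    base_options=python.BaseOptions(model_asset_path="pose_landmarker.task"),
    running_mode=vision.RunningMode.VIDEO)

cap = cv2.VideoCapture("sprinter_clip.mp4")
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0

with vision.PoseLandmarker.create_from_options(options) as landmarker:
    frame_index = 0
    while True:
        ok, frame_bgr = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB; OpenCV decodes frames as BGR.
        frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
        mp_image = mp.Image(image_format=mp.ImageFormat.SRGB, data=frame_rgb)
        timestamp_ms = int(frame_index * 1000 / fps)
        result = landmarker.detect_for_video(mp_image, timestamp_ms)
        print(frame_index, len(result.pose_landmarks))  # 0 poses = nothing detected
        frame_index += 1
cap.release()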

Other info / Complete Logs

No response

kuaashish commented 7 months ago

Hi @ccornelis,

Thank you for bringing this matter to our attention. We kindly request you to provide any relevant reference images or videos that would aid in a more effective comparison for both scenarios. This information will greatly assist us in comprehending the issue thoroughly and facilitating internal discussions with our team to enhance overall performance.

Thank you

github-actions[bot] commented 7 months ago

This issue has been marked stale because it has had no recent activity for the past 7 days. It will be closed if no further activity occurs. Thank you.

ccornelis commented 7 months ago

The mail also bounced with just 1 video file. Is there another way to send these videos to you?

On 4 Feb 2024, at 17:54, Cornelis Chris @.***> wrote:

The mail bounced because the attachments are too big. Trying with just one file …

<2024-01-15 1210.mp4>

On 4 Feb 2024, at 15:18, Cornelis Chris @.***> wrote:

Hi,

Sorry for the late response. Attached you will find 2 videos for which Pose finds no poses, or just a couple of poses at the very end of the clip.

Best,

Chris.

<2024-01-15 1209.mov> <2024-01-15 1210.mp4>


kuaashish commented 7 months ago

Hi @ccornelis,

Please upload the videos to Google Drive and provide us with an accessible link. Thank you!

ccornelis commented 6 months ago

https://drive.google.com/open?id=1WnI45GyA1WifIHt749rRyV9dIrfNqbXU&usp=drive_fs

https://drive.google.com/open?id=1WmIG3Fr8ek6C1gOXkElIoYnPCer9t_43&usp=drive_fs

kuaashish commented 6 months ago

Hi @ccornelis,

I apologize for the delay in my response. The links you provided are currently inaccessible. Could you please share an accessible link? Thank you.

ccornelis commented 6 months ago

Sorry for that. Should work now.

https://drive.google.com/file/d/1WnI45GyA1WifIHt749rRyV9dIrfNqbXU/view?usp=sharing

https://drive.google.com/file/d/1WmIG3Fr8ek6C1gOXkElIoYnPCer9t_43/view?usp=sharing

schmidt-sebastian commented 6 months ago

We are following up with our model team. Thanks for bringing this to our attention.

ayushgdev commented 6 months ago

Hello @ccornelis

It seems the camera was farther than 4 meters from the sprinter. If you check the model card for Pose detection (page 2, "Out-of-scope applications"), the model does not work for subjects more than 14 feet / 4 meters away.

A simple workaround is to crop the image/video frame. As a test, please check the following code excerpt (the video frame at 00:03 was used for the test):

from mediapipe import solutions
from mediapipe.framework.formats import landmark_pb2
import numpy as np

def draw_landmarks_on_image(rgb_image, detection_result):
  pose_landmarks_list = detection_result.pose_landmarks
  annotated_image = np.copy(rgb_image)

  # Loop through the detected poses to visualize.
  for idx in range(len(pose_landmarks_list)):
    pose_landmarks = pose_landmarks_list[idx]

    # Draw the pose landmarks.
    pose_landmarks_proto = landmark_pb2.NormalizedLandmarkList()
    pose_landmarks_proto.landmark.extend([
      landmark_pb2.NormalizedLandmark(x=landmark.x, y=landmark.y, z=landmark.z) for landmark in pose_landmarks
    ])
    solutions.drawing_utils.draw_landmarks(
      annotated_image,
      pose_landmarks_proto,
      solutions.pose.POSE_CONNECTIONS,
      solutions.drawing_styles.get_default_pose_landmarks_style())
  return annotated_image

import cv2
from google.colab.patches import cv2_imshow

# Crop a fixed 200 px margin from each side so the sprinter fills more of the frame.
img = cv2.imread("sprinter.png")
img = img[200:img.shape[0]-200, 200:img.shape[1]-200, :]
cv2_imshow(img)

# STEP 1: Import the necessary modules.
import mediapipe as mp
from mediapipe.tasks import python
from mediapipe.tasks.python import vision

# STEP 2: Create a PoseLandmarker object.
base_options = python.BaseOptions(model_asset_path='pose_landmarker.task')
options = vision.PoseLandmarkerOptions(
    base_options=base_options,
    output_segmentation_masks=True)
detector = vision.PoseLandmarker.create_from_options(options)

# STEP 3: Load the input image (the [:, :, ::-1] below flips OpenCV's BGR channels to RGB).
image = mp.Image(image_format=mp.ImageFormat.SRGB, data=np.array(img[:,:, ::-1]))

# STEP 4: Detect pose landmarks from the input image.
detection_result = detector.detect(image)

# STEP 5: Process the detection result. In this case, visualize it.
annotated_image = draw_landmarks_on_image(image.numpy_view(), detection_result)
cv2_imshow(cv2.cvtColor(annotated_image, cv2.COLOR_RGB2BGR))

The result is given below: [annotated image]
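
If cropping helps, the same idea can be applied per frame of the video; a minimal sketch (the 200 px margin and file name are placeholders and would need tuning per video):

# Minimal sketch: apply a fixed crop to every frame before detection.
# The margin and video path are illustrative only.
import cv2

def crop_margins(frame, margin=200):
    # Drop a fixed border from each side so the subject fills more of the frame.
    h, w = frame.shape[:2]
    return frame[margin:h - margin, margin:w - margin]

cap = cv2.VideoCapture("2024-01-15 1210.mp4")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cropped = crop_margins(frame)
    # ... convert `cropped` to RGB and pass it to the detector as in the excerpt above
cap.release()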

github-actions[bot] commented 6 months ago

This issue has been marked stale because it has had no recent activity for the past 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 5 months ago

This issue was closed due to lack of activity after being marked stale for the past 7 days.

google-ml-butler[bot] commented 5 months ago

Are you satisfied with the resolution of your issue?