google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://mediapipe.dev
Apache License 2.0
26.64k stars 5.07k forks

How do I smooth landmarks in face landmark detection (Python)? #4927

Open sschoellhammer opened 10 months ago

sschoellhammer commented 10 months ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

No

OS Platform and Distribution

Windows 10

MediaPipe Tasks SDK version

0.10.7

Task name (e.g. Image classification, Gesture recognition etc.)

face mesh

Programming Language and version (e.g. C++, Python, Java)

Python

Describe the actual behavior

The landmarker movement is slightly jittery

Describe the expected behaviour

The landmarker movement should be smooth

Standalone code/steps you may have used to try to get what you need

When I run the landmarker web demo:
https://mediapipe-studio.webapps.google.com/studio/demo/face_landmarker

the result is perfectly smooth whereas when I do the same in python I get a slightly jittery result (on the same machine).
I read somewhere else that it should be possible to stabilize the result with mediapipe itself but I don't see how. Right now I'm doing my own smoothing but it would be nice to get the same out of the box!
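(A simple way to do this kind of smoothing yourself is a per-landmark exponential moving average over the normalized coordinates. The sketch below is only illustrative: the class name is hypothetical and `alpha` is a hand-tuned assumption, not a MediaPipe setting.)

```python
class LandmarkSmoother:
    """Per-landmark exponential moving average.

    alpha in (0, 1]: smaller values give smoother output but more lag.
    The default 0.4 is a hand-tuned guess, not a recommended value.
    """

    def __init__(self, alpha=0.4):
        self.alpha = alpha
        self.state = None  # (x, y, z) tuples from the previous frame

    def smooth(self, landmarks):
        # Accepts any objects with .x/.y/.z attributes, such as the
        # NormalizedLandmark entries MediaPipe returns per face.
        pts = [(lm.x, lm.y, lm.z) for lm in landmarks]
        if self.state is None or len(self.state) != len(pts):
            # First frame (or face count changed): reset the filter.
            self.state = pts
        else:
            a = self.alpha
            self.state = [
                (a * x + (1 - a) * px,
                 a * y + (1 - a) * py,
                 a * z + (1 - a) * pz)
                for (x, y, z), (px, py, pz) in zip(pts, self.state)
            ]
        return self.state
```

Calling `smoother.smooth(results.multi_face_landmarks[0].landmark)` once per frame would return the filtered coordinates as plain tuples.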

My project is based on this example code:

import cv2
import mediapipe as mp
mp_drawing = mp.solutions.drawing_utils
mp_drawing_styles = mp.solutions.drawing_styles
mp_face_mesh = mp.solutions.face_mesh

# For webcam input:
drawing_spec = mp_drawing.DrawingSpec(thickness=1, circle_radius=1)
cap = cv2.VideoCapture(0)
with mp_face_mesh.FaceMesh(
    max_num_faces=1,
    static_image_mode=False,
    refine_landmarks=True,
    min_detection_confidence=0.5,
    min_tracking_confidence=0.5) as face_mesh:

  while cap.isOpened():
    success, image = cap.read()
    if not success:
      print("Ignoring empty camera frame.")
      # If loading a video, use 'break' instead of 'continue'.
      continue

    # To improve performance, optionally mark the image as not writeable to
    # pass by reference.
    image.flags.writeable = False
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    results = face_mesh.process(image)

    # Draw the face mesh annotations on the image.
    image.flags.writeable = True
    image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
    if results.multi_face_landmarks:
      for face_landmarks in results.multi_face_landmarks:
        mp_drawing.draw_landmarks(
            image=image,
            landmark_list=face_landmarks,
            connections=mp_face_mesh.FACEMESH_TESSELATION,
            landmark_drawing_spec=None,
            connection_drawing_spec=mp_drawing_styles
            .get_default_face_mesh_tesselation_style())
        mp_drawing.draw_landmarks(
            image=image,
            landmark_list=face_landmarks,
            connections=mp_face_mesh.FACEMESH_CONTOURS,
            landmark_drawing_spec=None,
            connection_drawing_spec=mp_drawing_styles
            .get_default_face_mesh_contours_style())
        mp_drawing.draw_landmarks(
            image=image,
            landmark_list=face_landmarks,
            connections=mp_face_mesh.FACEMESH_IRISES,
            landmark_drawing_spec=None,
            connection_drawing_spec=mp_drawing_styles
            .get_default_face_mesh_iris_connections_style())
    # Flip the image horizontally for a selfie-view display.
    cv2.imshow('MediaPipe Face Mesh', cv2.flip(image, 1))
    if cv2.waitKey(5) & 0xFF == 27:
      break
cap.release()

Other info / Complete Logs

No response

kuaashish commented 10 months ago

Hi @sschoellhammer,

You are currently using the outdated Face Mesh solution, for which support has been discontinued. This functionality is now integrated into the new Face Landmarker Task API, as detailed in the documentation. The new Face Landmarker Task API is an enhanced version of the legacy Face Mesh, designed to be more straightforward to implement for your specific use case.

To implement the Face Landmarker in Python, refer to the example code provided here; the guide can be found here. If you encounter similar behaviour, please notify us promptly.

Thank you!

sschoellhammer commented 9 months ago

Hi @kuaashish, thanks so much for getting back to me :) Got it. I dug around and got something working with the new system. It is now nice and smooth, but the markers in my test now lag behind the video stream.

I have not seen parameters that specify the time window for the smoothing, are there any? Or is it simply a problem with my code? (possibly me handling the async-ness of it wrong)

Here it is, in case something obvious jumps to your attention:


import cv2
import mediapipe as mp

from mediapipe.tasks.python import vision
from mediapipe.tasks import python
from mediapipe.tasks.python.vision import FaceLandmarkerResult

class MyFaceDetectorAsync:
    def __init__(self):
        base_options = python.BaseOptions(model_asset_path='face_landmarker_v2_with_blendshapes.task')
        options = vision.FaceLandmarkerOptions(base_options=base_options,
                                               output_face_blendshapes=True,
                                               output_facial_transformation_matrixes=True,
                                               running_mode=mp.tasks.vision.RunningMode.LIVE_STREAM,
                                               num_faces=1,
                                               result_callback=self.result_callback)

        self.face_mesh = vision.FaceLandmarker.create_from_options(options)
        self.timestamp = 0
        self.image = None
        self.result = None
        self.eye_center = [0.5, 0.5]

    def get_center_of_eyes(self):
        left_eye_indices = [27, 23, 130, 133]
        right_eye_indices = [257, 362, 374, 263]

        self.eye_center = self.get_landmark_center(left_eye_indices + right_eye_indices)
        self.draw_circle(self.eye_center)

    def get_landmark_center(self, landmark_indices):
        x = 0
        y = 0
        for index in landmark_indices:
            x += self.landmarks[index].x
            y += self.landmarks[index].y
        x /= len(landmark_indices)
        y /= len(landmark_indices)
        return x, y

    def draw_circle(self, xy, color=(255, 0, 255), radius=3, thickness=1):
        center = (int(xy[0] * self.image_width), int(xy[1] * self.image_height))

        self.image = cv2.circle(self.image, center, radius, color, thickness)

    def draw_landmarks(self):
        for landmark in self.landmarks:
            self.draw_circle((landmark.x, landmark.y), radius=1, color=(255, 0, 0))

    def set_default_tracking_result(self):
        self.eye_center = [0.5, 0.5]

    def result_callback(self, result: FaceLandmarkerResult, output_image: mp.Image, timestamp_ms: int):
        self.result = result
        self.image = output_image

    def add_image(self, camera_image):
        self.image = camera_image
        self.timestamp += 1
        camera_image.flags.writeable = False
        img_rgb = mp.Image(image_format=mp.ImageFormat.SRGB, data=camera_image)
        self.face_mesh.detect_async(img_rgb, self.timestamp)

    def process_result(self):
        if self.result is None:
            return

        self.image_height, self.image_width, _ = self.image.shape

        if self.result.face_landmarks is None or len(self.result.face_landmarks) == 0:
            self.set_default_tracking_result()
            return

        self.landmarks = self.result.face_landmarks[0]

        self.draw_landmarks()
        self.get_center_of_eyes()

def main():
    cap = cv2.VideoCapture(0)
    detector = MyFaceDetectorAsync()

    while True:

        success, img = cap.read()

        detector.add_image(img)
        detector.process_result()

        if detector.image is not None:
            cv2.imshow("Image", detector.image)

        # Wait for 1 millisecond, and keep the window open

        cv2.waitKey(1)

if __name__ == "__main__":
    main()
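(One general way to reduce jitter with less lag than a fixed smoothing window is an adaptive filter such as the One Euro filter: it smooths heavily when the signal is nearly still and lightly when it moves fast. The sketch below is a self-contained, per-coordinate version; `freq`, `min_cutoff`, and `beta` are hand-tuned assumptions, and this is not a MediaPipe API.)

```python
import math

class OneEuroFilter:
    """Adaptive low-pass filter for one scalar signal.

    Heavy smoothing when the signal is nearly still (kills jitter),
    light smoothing when it moves fast (reduces lag). Use one instance
    per landmark coordinate.
    """

    def __init__(self, freq=30.0, min_cutoff=1.0, beta=0.05, d_cutoff=1.0):
        self.freq = freq            # expected sample rate in Hz (assumption)
        self.min_cutoff = min_cutoff
        self.beta = beta
        self.d_cutoff = d_cutoff
        self.x_prev = None
        self.dx_prev = 0.0

    @staticmethod
    def _alpha(cutoff, dt):
        # Smoothing factor for a first-order low-pass at this cutoff.
        tau = 1.0 / (2.0 * math.pi * cutoff)
        return 1.0 / (1.0 + tau / dt)

    def __call__(self, x):
        dt = 1.0 / self.freq
        if self.x_prev is None:
            # First sample: nothing to smooth against yet.
            self.x_prev = x
            return x
        # Estimate and smooth the derivative, then let the speed
        # raise the cutoff so fast motion passes through with less lag.
        dx = (x - self.x_prev) / dt
        a_d = self._alpha(self.d_cutoff, dt)
        dx_hat = self.dx_prev + a_d * (dx - self.dx_prev)
        cutoff = self.min_cutoff + self.beta * abs(dx_hat)
        a = self._alpha(cutoff, dt)
        x_hat = self.x_prev + a * (x - self.x_prev)
        self.x_prev = x_hat
        self.dx_prev = dx_hat
        return x_hat
```

In a setup like the one above, you could keep two filters per tracked landmark (one for x, one for y) and run each coordinate through its filter inside `process_result`.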

kuaashish commented 9 months ago

Hi @schmidt-sebastian,

Are there configurations in the face landmarker to minimise jitter, or are there any methods available to reduce jitter?

Thank you

HyperScypion commented 6 months ago

Regarding pose jitter, I found this Stack Overflow post; maybe it will help you: https://stackoverflow.com/questions/52450681/how-can-i-use-smoothing-techniques-to-remove-jitter-in-pose-estimation
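(One of the techniques discussed in posts like that is a plain sliding-window moving average. A minimal sketch, with the window size as an assumption; note the output effectively lags by about half the window length, which is the trade-off the thread above is running into.)

```python
from collections import deque

def make_window_smoother(window=5):
    """Return a closure that averages each point over the last
    `window` frames. Works for any fixed-length coordinate tuple."""
    history = deque(maxlen=window)

    def smooth(point):
        history.append(point)
        n = len(history)
        return tuple(
            sum(p[i] for p in history) / n for i in range(len(point))
        )

    return smooth
```

Usage would be one smoother per landmark, e.g. `smooth((lm.x, lm.y))` each frame.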