google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

Memory Leak in the HandLandmarker object using the GPU Delegate (Python, M1 macOS) #5652

Open kyleellefsen opened 1 month ago

kyleellefsen commented 1 month ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

macOS

Mobile device if the issue happens on mobile device

No response

Browser and version if the issue happens on browser

No response

Programming Language and version

Python

MediaPipe version

0.10.15

Bazel version

No response

Solution

HandLandmarker

Android Studio, NDK, SDK versions (if issue is related to building in Android environment)

No response

Xcode & Tulsi version (if issue is related to building for iOS)

No response

Describe the actual behavior

Running out of RAM and crashing

Describe the expected behaviour

not crashing

Standalone code/steps you may have used to try to get what you need

See https://github.com/google-ai-edge/mediapipe/issues/5652#issuecomment-2379932394 below

kyleellefsen commented 1 month ago

When I run the hand landmarker in python (see code below) on my GPU, my memory fills up until the operating system kills the python process. The entire script below takes ~5 seconds to run on my computer. (This seems similar to https://github.com/google-ai-edge/mediapipe/issues/5626 (reported 2 weeks ago))

My computer

My python setup

Minimum reproducible example (memleak.py):

import time
import numpy as np
import mediapipe

def main_loop(landmarker):
    # Random 1080p frame; SRGB matches the 3-channel uint8 data
    # (mediapipe.Image expects the channel count to match the image format).
    frame = np.random.randint(0, 255, (1080, 1920, 3), dtype=np.uint8)
    mp_image = mediapipe.Image(image_format=mediapipe.ImageFormat.SRGB, data=frame)
    # detect_async expects a monotonically increasing timestamp in milliseconds
    landmarker.detect_async(mp_image, int(1000*time.time()))

if __name__ == "__main__":
    options = mediapipe.tasks.vision.HandLandmarkerOptions(
        base_options=mediapipe.tasks.BaseOptions(
            model_asset_path='hand_landmarker.task',
            delegate=mediapipe.tasks.BaseOptions.Delegate.GPU),
        running_mode=mediapipe.tasks.vision.RunningMode.LIVE_STREAM,
        result_callback=lambda x, y, z: None)
    landmarker = mediapipe.tasks.vision.HandLandmarker.create_from_options(options)
    for idx in range(800):
        main_loop(landmarker)
> python memleak.py
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1727466093.454681 28633460 gl_context.cc:357] GL version: 2.1 (2.1 Metal - 88.1), renderer: Apple M1 Pro
INFO: Created TensorFlow Lite delegate for Metal.
W0000 00:00:1727466093.540934 28633535 landmark_projection_calculator.cc:186] Using NORM_RECT without IMAGE_DIMENSIONS is only supported for the square ROI. Provide IMAGE_DIMENSIONS or use PROJECTION_MATRIX.
fish: Job 1, 'python memleak.py' terminated by signal SIGKILL (Forced quit)

My memory usage spikes when I run this and stays high until the program exits. If I extend the loop in the code above from 800 to 10000 iterations, the process is killed by the OS. Running it in an interactive environment (IPython), I can see that the memory stays high until I del landmarker, at which point all of it is released. From this I can state with certainty that the problem is with the HandLandmarker object. Also, if I switch the delegate to Delegate.CPU, there is no problem.

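A minimal sketch of how this can be confirmed from the script itself, by printing the process RSS around the loop. The rss_mb helper and the use of psutil are additions for illustration (assuming psutil is installed); the module paths and options mirror the reproducer above:

import gc
import time
import numpy as np
import psutil  # assumption: psutil is available, used only to read the process RSS
import mediapipe

def rss_mb():
    # Resident set size of this process, in MiB
    return psutil.Process().memory_info().rss / 2**20

options = mediapipe.tasks.vision.HandLandmarkerOptions(
    base_options=mediapipe.tasks.BaseOptions(
        model_asset_path='hand_landmarker.task',
        delegate=mediapipe.tasks.BaseOptions.Delegate.GPU),  # no growth with Delegate.CPU
    running_mode=mediapipe.tasks.vision.RunningMode.LIVE_STREAM,
    result_callback=lambda result, image, timestamp_ms: None)
landmarker = mediapipe.tasks.vision.HandLandmarker.create_from_options(options)

print(f"before loop: {rss_mb():.0f} MiB")
for _ in range(800):
    frame = np.random.randint(0, 255, (1080, 1920, 3), dtype=np.uint8)
    mp_image = mediapipe.Image(image_format=mediapipe.ImageFormat.SRGB, data=frame)
    landmarker.detect_async(mp_image, int(1000 * time.time()))
print(f"after loop:  {rss_mb():.0f} MiB")  # stays high with the GPU delegate

del landmarker  # per the observation above, this is when the memory is released
gc.collect()
print(f"after del:   {rss_mb():.0f} MiB")

With the CPU delegate, the "after loop" reading should stay close to the baseline, matching the observation above.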
kyleellefsen commented 1 month ago

In https://github.com/google-ai-edge/mediapipe/issues/5626, I noticed the running_mode was set to VIDEO, whereas this bug occurs with LIVE_STREAM. I tested both (see code below) and noticed that neither of them releases memory automatically. I also tested the IMAGE running mode (not shown) and it has the same problem.

import time
import numpy as np
import mediapipe

def run_video():
    options = mediapipe.tasks.vision.HandLandmarkerOptions(
        base_options=mediapipe.tasks.BaseOptions(
            model_asset_path='hand_landmarker.task',
            delegate=mediapipe.tasks.BaseOptions.Delegate.GPU),
        running_mode=mediapipe.tasks.vision.RunningMode.VIDEO)
    landmarker = mediapipe.tasks.vision.HandLandmarker.create_from_options(options)
    for idx in range(800):
        frame = np.random.randint(0, 255, (1080, 1920, 3), dtype=np.uint8)
        mp_image = mediapipe.Image(image_format=mediapipe.ImageFormat.SRGB, data=frame)  # SRGB for a 3-channel frame
        landmarker.detect_for_video(mp_image, int(1000*time.time()))
    return landmarker

def run_live():
    options = mediapipe.tasks.vision.HandLandmarkerOptions(
        base_options=mediapipe.tasks.BaseOptions(
            model_asset_path='hand_landmarker.task',
            delegate=mediapipe.tasks.BaseOptions.Delegate.GPU),
        running_mode=mediapipe.tasks.vision.RunningMode.LIVE_STREAM,
        result_callback=lambda x, y, z: None)
    landmarker = mediapipe.tasks.vision.HandLandmarker.create_from_options(options)
    for idx in range(800):
        frame = np.random.randint(0, 255, (1080, 1920, 3), dtype=np.uint8)
        mp_image = mediapipe.Image(image_format=mediapipe.ImageFormat.SRGB, data=frame)
        landmarker.detect_async(mp_image, int(1000*time.time()))
    return landmarker

if __name__ == '__main__':
    landmarker_video = run_video()
    del landmarker_video  # memory is released only after landmarker is deleted
    landmarker_live = run_live()
    del landmarker_live  # same here
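
For completeness, a minimal sketch of what the IMAGE-mode check could look like (the IMAGE-mode code itself is not shown in the report); it mirrors the two functions above, except that in IMAGE mode detect is synchronous and takes no timestamp:

def run_image():
    options = mediapipe.tasks.vision.HandLandmarkerOptions(
        base_options=mediapipe.tasks.BaseOptions(
            model_asset_path='hand_landmarker.task',
            delegate=mediapipe.tasks.BaseOptions.Delegate.GPU),
        running_mode=mediapipe.tasks.vision.RunningMode.IMAGE)
    landmarker = mediapipe.tasks.vision.HandLandmarker.create_from_options(options)
    for idx in range(800):
        frame = np.random.randint(0, 255, (1080, 1920, 3), dtype=np.uint8)
        mp_image = mediapipe.Image(image_format=mediapipe.ImageFormat.SRGB, data=frame)
        landmarker.detect(mp_image)  # IMAGE mode: no timestamp argument
    return landmarker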