google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://mediapipe.dev
Apache License 2.0

CalculatorGraph::Run() failed #5539

Closed matiaslopezd closed 3 weeks ago

matiaslopezd commented 1 month ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

No

OS Platform and Distribution

Windows 10

Mobile device if the issue happens on mobile device

Thinkpad T14 Gen 1

Browser and version if the issue happens on browser

Google Chrome 126.0.x.x

Programming Language and version

Javascript

MediaPipe version

0.10.12

Bazel version

No response

Solution

ImageSegmenter

Android Studio, NDK, SDK versions (if issue is related to building in Android environment)

No response

Xcode & Tulsi version (if issue is related to building for iOS)

No response

Describe the actual behavior

It works, but sometimes during long sessions (>4 hrs) in a single tab it throws this kind of error.

Describe the expected behaviour

Ideally there would be a way to listen for errors so that we can instantiate a new ImageSegmenter.

Standalone code/steps you may have used to try to get what you need

I can't reproduce it because it depends on a long-running session, but we sometimes receive the error in our error ingestor from real customers/devices.

Could it be related to memory management?

Other info / Complete Logs

Uncaught Error: INTERNAL: CalculatorGraph::Run() failed: Calculator::Open() for node "mediapipe_tasks_vision_image_segmenter_imagesegmentergraph__mediapipe.tasks.TensorsToSegmentationCalculator" failed: RET_CHECK failure (third_party/mediapipe/tasks/cc/vision/image_segmenter/calculators/segmentation_postprocessor_gl.cc:350) shader_struct_ptr->program Problem initializing the activation program.; WaitUntilIdle failed
=== Source Location Trace: ===
third_party/mediapipe/tasks/cc/vision/image_segmenter/calculators/segmentation_postprocessor_gl.cc:350
third_party/mediapipe/tasks/cc/vision/image_segmenter/calculators/segmentation_postprocessor_gl.cc:405
third_party/mediapipe/tasks/cc/vision/image_segmenter/calculators/segmentation_postprocessor_gl.cc:331
third_party/mediapipe/tasks/cc/vision/image_segmenter/calculators/tensors_to_segmentation_calculator.cc:353
third_party/mediapipe/framework/calculator_node.cc:560
research/drishti/app/pursuit/wasm/graph_utils.cc:187

kuaashish commented 1 month ago

Hi @matiaslopezd,

Could you please provide the memory trace to facilitate a thorough review of the issue? To access it, navigate to Chrome Options -> More Tools -> Developer Tools -> Memory Tab, and select the third option Allocation Sampling and record the trace, as shown in the screenshot below.

[Screenshot 2024-07-25 at 12 12 45 PM]

Thank you!!

Siddharth-Latthe-07 commented 1 month ago

@matiaslopezd The error you're encountering with the MediaPipe ImageSegmenter in long-duration sessions could be related to memory management, resource exhaustion, or a potential memory leak in the underlying WebGL context.

Steps to Troubleshoot and Resolve:

  1. Monitor Resource Usage: Use browser developer tools to monitor memory usage over time. Look for signs of memory leaks or excessive memory consumption that could lead to failures after prolonged usage.

  2. Automatic Cleanup: Ensure that you are properly cleaning up resources when they are no longer needed. For example, if you are creating instances of ImageSegmenter, make sure to call the close method when they are no longer needed. Implement a mechanism to periodically clean up and reinitialize the ImageSegmenter.

  3. Error Handling: Add error handling to catch and recover from errors. This can include reinitializing the ImageSegmenter instance when an error occurs. Use try-catch blocks around critical sections of code that involve the ImageSegmenter.

  4. Reinitialize on Error: Implement a mechanism to listen for errors and reinitialize the ImageSegmenter when an error is detected.

Sample snippet for error handling and reinitialization:

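import { FilesetResolver, ImageSegmenter } from "@mediapipe/tasks-vision";

let imageSegmenter;

// The Tasks API creates instances with createFromOptions(); it has no
// onResults/onError hooks, so failures surface as thrown errors or rejections.
async function initializeImageSegmenter() {
  if (imageSegmenter) {
    imageSegmenter.close();
    imageSegmenter = undefined;
  }

  // Placeholder path: point this at the wasm files you serve
  const vision = await FilesetResolver.forVisionTasks("/path/to/tasks-vision/wasm");

  imageSegmenter = await ImageSegmenter.createFromOptions(vision, {
    // Your configuration here (baseOptions.modelAssetPath, runningMode, ...)
  });
}

function handleError(error) {
  console.error("ImageSegmenter error:", error);

  // Attempt to reinitialize the ImageSegmenter
  initializeImageSegmenter().catch((e) => console.error("Reinitialization failed:", e));
}

function onResults(results) {
  // Process results
}

// Initialize the ImageSegmenter
initializeImageSegmenter().catch((e) => console.error("Initialization failed:", e));

// Example: Process an image
function processImage(image) {
  if (!imageSegmenter) {
    return; // Still initializing or reinitializing
  }
  try {
    // In IMAGE running mode, segment() passes the result to the callback
    imageSegmenter.segment(image, onResults);
  } catch (error) {
    handleError(error);
  }
}

// Periodic cleanup and reinitialization (e.g., every hour)
setInterval(() => {
  console.log("Reinitializing ImageSegmenter for cleanup");
  initializeImageSegmenter().catch((e) => console.error("Reinitialization failed:", e));
}, 3600000); // 1 hour in milliseconds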
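
One caveat with the recovery idea above: an error handler and a periodic timer can both trigger a reinitialization at the same moment. The following is only a minimal sketch of one way to serialize that, not MediaPipe API. `createResilientSegmenter` and the `createSegmenter` factory are hypothetical names; the factory is assumed to be an async function you supply that resolves to an object with a `close()` method (for example, something wrapping `ImageSegmenter.createFromOptions`).

```javascript
// Sketch only: `createSegmenter` is a hypothetical async factory supplied by
// the caller; it must resolve to an object that exposes close().
function createResilientSegmenter(createSegmenter) {
  let current = null; // live instance, or null when none exists
  let pending = null; // in-flight (re)initialization promise, if any

  function reinitialize() {
    if (!pending) {
      if (current) {
        current.close();
        current = null;
      }
      pending = createSegmenter()
        .then((instance) => {
          current = instance;
          return instance;
        })
        .finally(() => {
          pending = null;
        });
    }
    // Concurrent callers share the same in-flight promise.
    return pending;
  }

  async function run(work) {
    const segmenter = current || (await reinitialize());
    try {
      return await work(segmenter);
    } catch (error) {
      // Drop the failed instance so the next call builds a fresh one.
      try {
        segmenter.close();
      } catch (closeError) {
        // Ignore errors from closing an already broken instance.
      }
      if (current === segmenter) {
        current = null;
      }
      throw error;
    }
  }

  return { run, reinitialize };
}
```

Usage would look like `resilient.run((segmenter) => segmenter.segment(image, onResults))`. The caller still sees the original error, but the next `run` call starts from a fresh instance.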

Hope this helps. Thanks!

kuaashish commented 1 month ago

Hi @matiaslopezd,

Could you please review the comment above (https://github.com/google-ai-edge/mediapipe/issues/5539#issuecomment-2249581432) and provide further information?

Thank you!!

tyrmullen commented 1 month ago

Open is only called once per graph, when the calculators are initialized. So that means that the error trace initially reported was not encountered by continuous running, per se, but rather was triggered by an initialization or reinitialization.

Since the activation shader is the first one created in the postprocessing (and one of the simplest), one possibility is that you no longer have an active WebGL context. Check that your WebGL context is still active before initializing or reinitializing. In particular, for long-running tabs, the browser can recycle your WebGL context out from under you, which sounds like what may be happening here. So make sure you handle or check for "webglcontextlost" events.
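
As a rough sketch of handling those events (not an official recipe): the `webglcontextlost` and `webglcontextrestored` events fire on the canvas that owns the context. The helper name `watchContextLoss` and the `recreate` callback are hypothetical, and the sketch assumes you hold a reference to the canvas backing the segmenter's WebGL context.

```javascript
// Sketch only: `canvas` is assumed to be the canvas that owns the WebGL
// context; `recreate` is a hypothetical callback that builds a fresh
// ImageSegmenter once the context is available again.
function watchContextLoss(canvas, recreate) {
  const state = { lost: false };

  canvas.addEventListener("webglcontextlost", (event) => {
    // preventDefault() signals that the page wants the context restored.
    event.preventDefault();
    state.lost = true;
  });

  canvas.addEventListener("webglcontextrestored", () => {
    state.lost = false;
    recreate();
  });

  // Callers can check state.lost before segmenting or reinitializing.
  return state;
}
```

Checking `state.lost` before each `segment` call lets the page skip frames while the context is gone instead of throwing.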

To that end, the utility call isContextLost() may also be helpful here, and there is also an extension you can use for testing: WEBGL_lose_context.
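
A minimal sketch of both ideas, assuming `gl` is the `WebGLRenderingContext` (or `WebGL2RenderingContext`) in use. The helper names `isContextUsable` and `simulateContextLoss` are hypothetical; `isContextLost()`, `getExtension("WEBGL_lose_context")`, `loseContext()`, and `restoreContext()` are standard WebGL calls.

```javascript
// Returns true when the context exists and has not been lost, i.e. when it
// should be safe to initialize or reinitialize GPU work on it.
function isContextUsable(gl) {
  return Boolean(gl) && !gl.isContextLost();
}

// Test helper: force a context loss through the WEBGL_lose_context extension.
// Returns a function that restores the context, or null if the extension is
// not available.
function simulateContextLoss(gl) {
  const extension = gl.getExtension("WEBGL_lose_context");
  if (!extension) {
    return null;
  }
  extension.loseContext();
  return () => extension.restoreContext();
}
```

This makes the long-session failure easier to reproduce on demand: call `simulateContextLoss(gl)`, then exercise the initialization path and confirm the page recovers.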

github-actions[bot] commented 1 month ago

This issue has been marked stale because it has no recent activity since 7 days. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 3 weeks ago

This issue was closed due to lack of activity after being marked stale for past 7 days.
