google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://mediapipe.dev
Apache License 2.0
26.8k stars 5.09k forks

PoseLandmarker's segmentation is blank when using delegate: GPU #4757

Closed cristobalbahe closed 9 months ago

cristobalbahe commented 1 year ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

Yes

OS Platform and Distribution

Browser (chrome)

MediaPipe Tasks SDK version

0.10.4

Task name (e.g. Image classification, Gesture recognition etc.)

PoseLandmarker

Programming Language and version (e.g. C++, Python, Java)

JavaScript

Describe the actual behavior

outputSegmentation is blank

Describe the expected behaviour

outputSegmentation has a mask as a webGL texture that is not blank

Standalone code/steps you may have used to try to get what you need

Hello everyone, I am trying to debug the segmentation mask result of PoseLandmarker, but the result texture is blank. I initialize my vision task as follows:

const vision = await FilesetResolver.forVisionTasks(
  "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm"
);
const poseLandmarker = await PoseLandmarker.createFromOptions(vision, {
  baseOptions: {
    modelAssetPath: "/pose_landmarker_full.task",
    delegate: "GPU",
  },
  runningMode: "VIDEO",
  outputSegmentationMasks: true,
});

Then I run a render loop like this:

function renderLoop() {
  let startTimeMs = performance.now();
  if (video.currentTime !== lastVideoTime) {
    poseLandmarker.detectForVideo(video, startTimeMs, (result) => {
      if (result.segmentationMasks.length) {

        gl.viewport(0, 0, width, height);
        gl.useProgram(program);
        gl.activeTexture(gl.TEXTURE0);
        gl.bindTexture(
          gl.TEXTURE_2D,
          result.segmentationMasks[0].getAsWebGLTexture()
        );

        if (debug) {
          console.log(result.segmentationMasks[0].getAsFloat32Array());
        }
      }
    });
    lastVideoTime = video.currentTime;
  }

  requestAnimationFrame(() => {
    renderLoop();
  });
}

My WebGL canvas, which just renders the WebGL texture, is not showing anything, so for debugging purposes I logged result.segmentationMasks[0].getAsFloat32Array() to see whether I was getting any values, but it turns out to be a zero-filled Float32Array. (I also noticed that the landmarks are correct when I draw them with drawingUtils, so the inference itself is working properly.)

I changed the delegate to CPU to see if I got a different result and, indeed, I got non-zero values inside the Float32Array, so I am wondering how I could get the segmentation mask with GPU as the delegate.
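The blank-vs.-non-blank check described above can be factored into a small helper; this is a sketch, and `maskIsBlank` is a hypothetical name, not part of the MediaPipe API:

```javascript
// Hypothetical debugging helper: returns true when every sample in the
// mask's Float32Array readback is zero (i.e. the mask is blank).
function maskIsBlank(maskData) {
  return maskData.every((v) => v === 0);
}

// Sketch of how it would be used inside the detectForVideo callback:
// const data = result.segmentationMasks[0].getAsFloat32Array();
// if (maskIsBlank(data)) {
//   console.warn("Segmentation mask is empty with this delegate/context");
// }
```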

Thank you so much in advance



Other info / Complete Logs

_No response_
gtnbssn commented 9 months ago

Possibly related to this.

I am using version 0.10.7 and am creating the landmarker in this way:

  const poseLandmarker = await PoseLandmarker.createFromOptions(vision, {
    baseOptions: {
      modelAssetPath: "/pose_landmarker_full.task",
      delegate: "GPU",
    },
    runningMode: "VIDEO",
    outputSegmentationMasks: true,
  });

But the PoseLandmarkerResult I am getting has no segmentationMasks.

This is also the case when I set the delegate to CPU.

schmidt-sebastian commented 9 months ago

You won't be able to use the WebGLTexture unless you pass in your own canvas object. This must be the same canvas object that you use to draw the texture (i.e. the "gl" object must be a WebGL2RenderingContext obtained from that canvas).

See https://github.com/google/mediapipe/blob/master/mediapipe/tasks/web/vision/core/vision_task_options.d.ts#L34
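Based on the advice above, a shared canvas would be passed to the task options so that the returned WebGLTexture lives in the same WebGL2 context used for rendering. This is a minimal sketch assuming the `canvas` option from the linked `vision_task_options.d.ts`; the `buildPoseOptions` helper is hypothetical:

```javascript
// Hypothetical helper that builds task options carrying a shared canvas.
// The `canvas` field is the option schmidt-sebastian points to: the task
// creates its GPU resources in this canvas's WebGL2 context.
function buildPoseOptions(canvas) {
  return {
    baseOptions: {
      modelAssetPath: "/pose_landmarker_full.task",
      delegate: "GPU",
    },
    canvas, // same canvas whose WebGL2 context will draw the texture
    runningMode: "VIDEO",
    outputSegmentationMasks: true,
  };
}

// Usage sketch (browser only):
// const canvas = document.createElement("canvas");
// const gl = canvas.getContext("webgl2"); // render with THIS context
// const poseLandmarker = await PoseLandmarker.createFromOptions(
//   vision,
//   buildPoseOptions(canvas)
// );
```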

gtnbssn commented 9 months ago

It looks like the segmentation mask should be obtained from a separate segmentation task, and not from the PoseLandmarker:

https://github.com/google/mediapipe/blob/bb4906bcd36513bd3ba6d948bf98f561a869626b/mediapipe/tasks/web/vision/core/drawing_utils.ts#L485

Could it be that the documentation here isn't up to date?

https://developers.google.com/mediapipe/solutions/vision/pose_landmarker/web_js#configuration_options

Or am I missing something else?

I definitely do not see a segmentationMasks field in the results of the PoseLandmarker.

NSiggel commented 2 months ago

Any follow-up on this? The docs seem to imply you can get the segmentation mask from poseLandmarker.createFromOptions(...) by adding outputSegmentationMasks: true:

https://ai.google.dev/edge/mediapipe/solutions/vision/pose_landmarker/web_js#configuration_options

However, despite getting a result.segmentationMasks array back, I don't seem to be able to get any mask values (all 0), and I'm not sure how, or whether, it is possible to use the WebGL context information.

The documentation and the CodePen examples don't seem to be in sync (pose landmarks are missing the presence field as well).

Is there a way to get pose and segmentation data in a single pass? MediaPipe Studio seems to show it working, but I'm not seeing it work in the CodePen examples.

If so, which MediaPipe release does it work on?