google-ai-edge / mediapipe

Cross-platform, customizable ML solutions for live and streaming media.
https://ai.google.dev/edge/mediapipe
Apache License 2.0

[FEATURE REQUEST] ImageSource should allow WebGLTexture as input #5064

Open marcusx2 opened 8 months ago

marcusx2 commented 8 months ago

Have I written custom code (as opposed to using a stock example script provided in MediaPipe)

No

OS Platform and Distribution

MacOS 13.6.3

MediaPipe Tasks SDK version

0.10.9

Task name (e.g. Image classification, Gesture recognition etc.)

Image classification

Programming Language and version (e.g. C++, Python, Java)

JavaScript

Describe the actual behavior

ImageSource does not support WebGLTexture as input

Describe the expected behaviour

ImageSource should support WebGLTexture as input

Standalone code/steps you may have used to try to get what you need

ImageSource does not support WebGLTexture as input. It looks like I have to convert it first using something like this https://stackoverflow.com/a/18804083/8509272 to be able to use MediaPipe with WebGL textures. Or am I missing something?

In addition, the ObjectDetectorOptions interface (https://developers.google.com/mediapipe/api/solutions/js/tasks-vision.objectdetectoroptions) is said to extend VisionTaskOptions and ClassifierOptions, but both are undocumented.

Other info / Complete Logs

No response

marcusx2 commented 8 months ago

Is there an alternative way to deal with a WebGLTexture as input that doesn't require conversion?

schmidt-sebastian commented 8 months ago

Your best bet is to convert the WebGLTexture to an ImageBitmap for now.

marcusx2 commented 8 months ago

Any particular reason for creating an ImageBitmap if I can use an ImageData? Is it more efficient? It seems like to create an ImageBitmap I have to go through an ImageData first, so it's an extra step.

Also, would it be possible to use DrawingUtils with the ImageBitmap or ImageData?

schmidt-sebastian commented 8 months ago

You should be able to transfer GPU-to-GPU with an ImageBitmap: https://developer.mozilla.org/en-US/docs/Web/API/OffscreenCanvas/transferToImageBitmap. This is much cheaper than creating a CPU-based ImageData.
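This suggestion can be sketched as follows; `offscreen` and `imageClassifier` are placeholder names for illustration, not part of the thread:

```javascript
// Sketch of the suggested GPU-to-GPU handoff. Assumes `offscreen` is an
// OffscreenCanvas that the WebGL context rendered the texture into.
function canvasToBitmap(offscreen) {
  // Detaches the canvas's current backbuffer as an ImageBitmap with no
  // CPU readback, unlike building an ImageData from raw pixels.
  return offscreen.transferToImageBitmap();
}

// Hypothetical usage with a vision task instance:
// const bitmap = canvasToBitmap(offscreen);
// imageClassifier.classify(bitmap);
// bitmap.close(); // release the bitmap when done
```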

Yes, DrawingUtils supports GPU and CPU based masks.

marcusx2 commented 8 months ago

Last question, I have something like this

GLctx.bindTexture(GLctx.TEXTURE_2D, GL.textures[texturePtr]);
GLctx.texImage2D(GLctx.TEXTURE_2D, 0, GLctx.RGBA, GLctx.RGBA, GLctx.UNSIGNED_BYTE, source);

How can I use transferToImageBitmap at this point? I already have the GL context, and I can't create a new one with offscreen.getContext("webgl") (also, it looks like getContext("webgl") isn't supported on Safari?). Thanks a lot!

schmidt-sebastian commented 8 months ago

Once you draw the texture to a canvas you should be able to use the transferToImageBitmap() function.

marcusx2 commented 8 months ago

So I have to create a dummy canvas first? With

var canvas = document.createElement('canvas');
canvas.width = width;
canvas.height = height;
var context = canvas.getContext('2d');

And then I do

// Copy the pixels to a 2D canvas
var imageData = context.createImageData(width, height);
imageData.data.set(data);
context.putImageData(imageData, 0, 0);

After the above, I transfer to an ImageBitmap? Essentially, it's these steps.

Create dummy canvas -> put on image data -> transfer to image bitmap.

Thanks a lot for your help.

EDIT

Or actually, create an OffscreenCanvas instead of a normal canvas.

tyrmullen commented 8 months ago

A WebGLTexture effectively only works with a specific canvas. That's why it's not accepted as an input yet: we would need to instead be given a pair like (WebGLTexture, canvas-or-webgl2-context), and some extra work would be needed to make that function.

If you are starting with a WebGLTexture, then your best option is to first draw the WebGL texture into your canvas.

You can do this drawing to canvas by using WebGL calls to run a simple copy/passthrough shader. Don't try to draw WebGL textures to your canvas using ImageData. I'm not sure that will work at all, but if it does it will bring things back to the CPU and be slow.

Also, the Stack Overflow post linked above does a lot more work (in order to create an HTMLImageElement), so you do not need/want to do all that.
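The copy/passthrough approach described above might look roughly like this. This is a sketch, not code from the thread; the helper names (`compileProgram`, `drawTextureToCanvas`) are invented, and it assumes a WebGL2 context:

```javascript
// Minimal passthrough: draw an existing WebGLTexture onto the canvas
// backbuffer, after which the canvas can be snapshotted with
// transferToImageBitmap() / createImageBitmap().

const passthroughVert = `#version 300 es
in vec2 pos;
out vec2 uv;
void main() {
  uv = pos * 0.5 + 0.5;             // map clip space [-1,1] to texcoords [0,1]
  gl_Position = vec4(pos, 0.0, 1.0);
}`;

const passthroughFrag = `#version 300 es
precision mediump float;
in vec2 uv;
uniform sampler2D tex;
out vec4 outColor;
void main() { outColor = texture(tex, uv); }`;

function compileProgram(gl, vsSrc, fsSrc) {
  const make = (type, src) => {
    const s = gl.createShader(type);
    gl.shaderSource(s, src);
    gl.compileShader(s);
    return s;
  };
  const prog = gl.createProgram();
  gl.attachShader(prog, make(gl.VERTEX_SHADER, vsSrc));
  gl.attachShader(prog, make(gl.FRAGMENT_SHADER, fsSrc));
  gl.linkProgram(prog);
  return prog;
}

function drawTextureToCanvas(gl, texture) {
  const prog = compileProgram(gl, passthroughVert, passthroughFrag);
  gl.useProgram(prog);
  // Full-screen quad as a triangle strip.
  const buf = gl.createBuffer();
  gl.bindBuffer(gl.ARRAY_BUFFER, buf);
  gl.bufferData(gl.ARRAY_BUFFER,
      new Float32Array([-1, -1, 1, -1, -1, 1, 1, 1]), gl.STATIC_DRAW);
  const loc = gl.getAttribLocation(prog, 'pos');
  gl.enableVertexAttribArray(loc);
  gl.vertexAttribPointer(loc, 2, gl.FLOAT, false, 0, 0);
  gl.activeTexture(gl.TEXTURE0);
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.uniform1i(gl.getUniformLocation(prog, 'tex'), 0);
  gl.drawArrays(gl.TRIANGLE_STRIP, 0, 4);
}
```

In a real app you would compile the program once and reuse it per frame rather than recompiling inside the draw call.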

marcusx2 commented 8 months ago

Thanks for the clarification, @tyrmullen.

Yes, the idea is to use MediaPipe with something like PlayCanvas. It would be very helpful to have this example implemented in PlayCanvas. The idea would be to use MediaPipe to work on the PlayCanvas textures, which reside in PlayCanvas' canvas like you said.

Check this example project that I created; it has the same 2 images as the MediaPipe example. Unfortunately, my knowledge of shaders and whatnot is non-existent, so I don't know how to make this work...

I'll leave this feature request open then... to make the MediaPipe functions accept (WebGLTexture, canvas-or-webgl2-context) pairs as input...

@danrossi

marcusx2 commented 8 months ago

Hey @schmidt-sebastian, is this feature hard to add, or is there any ETA? I'd love to see this sooner rather than later.

danrossi commented 5 months ago

I have WebGL rendering working by transferring an ImageBitmap from the MediaPipe offscreen canvas. I do the WebGL mix on the offscreen canvas, then print back to the display canvas. For Safari, however, a different, slower method is needed until it supports this; it has to be a CPU draw.

Fast path (GPU transfer, not supported on Safari):

const maskImage = this.tasksCanvas.transferToImageBitmap();
this.canvasCtx.transferFromImageBitmap(maskImage);

results.confidenceMasks.forEach((mask) => {
  mask.close();
});

Safari fallback (CPU draw):

const maskImage = await createImageBitmap(this.tasksCanvas),
      canvasCtx = this.canvasCtx;

canvasCtx.save();
canvasCtx.fillStyle = 'black';
canvasCtx.clearRect(0, 0, video.videoWidth, video.videoHeight);

canvasCtx.drawImage(
    maskImage,
    0,
    0,
    video.videoWidth,
    video.videoHeight
);

canvasCtx.restore();
results.confidenceMasks.forEach((mask) => {
  mask.close();
});
In the mix section, I get the texture from the results to use as a mix mask:

gl.clearColor(1.0, 1.0, 1.0, 1.0);
gl.useProgram(this.prog);
gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);
gl.viewport(0, 0, video.videoWidth, video.videoHeight);

const texture = results.confidenceMasks[0].getAsWebGLTexture();

this.bindBuffers(gl, this.positionLocation, this.texCoordLocation);

gl.activeTexture(gl.TEXTURE0);
gl.bindTexture(gl.TEXTURE_2D, this.bgTexture);

gl.activeTexture(gl.TEXTURE2);
gl.bindTexture(gl.TEXTURE_2D, texture);
gl.uniform1i(this.uniformLocations.mask, 2);

gl.activeTexture(gl.TEXTURE1);
gl.bindTexture(gl.TEXTURE_2D, this.videoTexture);
gl.uniform1i(this.uniformLocations.frame, 1);

gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGB, gl.RGB, gl.UNSIGNED_BYTE, video);
gl.drawArrays(gl.TRIANGLE_FAN, 0, 4);

In the shader

outColor = mix(bgTex, vec4(frameColor, 1.0), maskVal);
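For readers unfamiliar with shaders, here is a hedged reconstruction of the full fragment shader that the one-line mix above implies. Only the final mix() line comes from the thread; the uniform names and the rest of the body are assumptions:

```javascript
// Reconstructed fragment shader (GLSL ES 3.00 in a JS template string).
// Sampler bindings follow the JS snippet above: background on TEXTURE0,
// video frame on TEXTURE1, confidence mask on TEXTURE2.
const mixFrag = `#version 300 es
precision mediump float;
in vec2 uv;
uniform sampler2D frame;      // live video, bound to unit 1
uniform sampler2D background; // replacement background, bound to unit 0
uniform sampler2D mask;       // MediaPipe confidence mask, bound to unit 2
out vec4 outColor;
void main() {
  vec3 frameColor = texture(frame, uv).rgb;
  vec4 bgTex = texture(background, uv);
  float maskVal = texture(mask, uv).r; // person confidence in [0, 1]
  // Blend: mask=1 keeps the camera pixel, mask=0 keeps the background.
  outColor = mix(bgTex, vec4(frameColor, 1.0), maskVal);
}`;
```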

I'm curious whether it can support WebGPU and be updated to WebGPU, so I can do the same in WebGPU; it's super efficient.

marcusx2 commented 5 months ago

@danrossi Sorry, I didn't understand. You already start with a canvas and use transferToImageBitmap. Do you know how to draw a WebGLTexture to another canvas, like @tyrmullen mentioned? Sorry, I'm having a really hard time with this; I'm scouring the internet but can't find a solution, and I don't understand shaders. Thanks!

danrossi commented 5 months ago

You send MediaPipe an offscreen canvas. Then you do the video and background mixing from that WebGL context, which is the same context MediaPipe is using (example shown above), then transferToImageBitmap() the offscreen canvas back to the display canvas. Safari can't do the bitmap transfer, so it needs a CPU draw. It's the lowest-resource rendering option I could come up with, with no dropped frames.

const tasksCanvas = this.tasksCanvas = new OffscreenCanvas(1, 1);
const wasm = await FilesetResolver.forVisionTasks(
    'https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm'
);

this.segmenter = await ImageSegmenter.createFromOptions(wasm, {
    baseOptions: {
        modelAssetPath:
            //"https://storage.googleapis.com/mediapipe-models/image_segmenter/deeplab_v3/float32/1/deeplab_v3.tflite",
            'https://storage.googleapis.com/mediapipe-models/image_segmenter/selfie_segmenter_landscape/float16/latest/selfie_segmenter_landscape.tflite',
        delegate: "GPU"
    },
    canvas: this.tasksCanvas,
    runningMode: "VIDEO",
    outputConfidenceMasks: true
});

On the same GL context as MediaPipe, you change the program to the mixing program, to do the mixing on the offscreen canvas:

```
gl.useProgram(this.prog);
```
danrossi commented 5 months ago

You can see I've integrated this into PlayCanvas, which also supports chromakey. I migrated it from my WebRTC features. It was complicated to get it up live, sadly. The module option isn't bundling correctly, so I need to look into it.

For PlayCanvas you can set it up with modules, but it can also load as a script asset, which needs an external MediaPipe bundle included on the page, as I thought an external reference would be better. As far as IIFE bundles go with MediaPipe, the documented CDN one is still broken, complaining about a missing export. You have to make a bundle yourself from the npm import/export.

You can't share textures across WebGL contexts. So all the rendering work is done in the offscreen canvas sent to MediaPipe. Once a mask is returned, it's mixed in shaders on the offscreen context, and the display canvas gets the bitmap from the offscreen canvas. The display canvas is added to PlayCanvas as a texture, like a video.
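The per-frame flow described above can be sketched end to end. This is illustrative only: `segmenter`, `tasksCanvas`, `displayCtx`, and `mixOnOffscreen` are placeholders for the surrounding app code, and it assumes the tasks-vision `segmentForVideo` callback API:

```javascript
// Per-frame pipeline sketch: segment -> shader mix on the offscreen GL
// context -> GPU handoff to the visible canvas.
function renderFrame(video, timestampMs) {
  segmenter.segmentForVideo(video, timestampMs, (results) => {
    const maskTexture = results.confidenceMasks[0].getAsWebGLTexture();
    // Mixing happens on the same GL context MediaPipe renders into,
    // since textures can't be shared across contexts.
    mixOnOffscreen(maskTexture, video);
    // Hand the composited backbuffer to the display canvas
    // (Safari would need the CPU drawImage fallback instead).
    displayCtx.transferFromImageBitmap(tasksCanvas.transferToImageBitmap());
    results.confidenceMasks.forEach((mask) => mask.close());
  });
}
```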

I have made a feature request to be able to reuse the video texture for mixing with the mask texture afterwards. The video texture would be created inside MediaPipe on the offscreen canvas WebGL context.

https://electroteque.org/plugins/playcanvas/virtual-background/ https://electroteque.org/plugins/playcanvas/virtual-background/demos/chromakey/ https://electroteque.org/plugins/playcanvas/virtual-background/demos/webcam/