Having access to a decoded plane would be useful for a use case like super-sampling via WebGL.
I'm thinking of a complete flow where it's possible to use a shader-based preprocessor to super-res the image, route the result back to the WebCodecs-exposed encoder, and generate a segment of media ready for MSE to take over for regular playback.
There is a proposal in progress for a WebGL extension that would enable this, hopefully it will be available publicly soon.
The proposal from Jie @ Intel is now public: https://docs.google.com/document/d/1ULhbTWlS_vz3AUKAZXNa0TSlwtFB7ItuiFhlqseLUTU/edit
The Chromium bug where the prototyping work is tracked is here: https://bugs.chromium.org/p/chromium/issues/detail?id=1142754
FYI, there is also a proposal (or proposals) for accessing planes of GPU-backed VideoFrames using WebGPU: https://docs.google.com/document/d/1MmLTO7gmOBp9fIccrlbqYcGOEydxYKbQwOYBY6gxZ5M/edit#
Any news on this? There is no information anywhere on what happens when one does:
```js
const internalFormat = gl.RGBA;
const srcFormat = /* something ?! use RGBA because a conversion is implied? How is the color space known? */;
const srcType = gl.UNSIGNED_BYTE;
gl.bindTexture(gl.TEXTURE_2D, texture);
gl.texImage2D(gl.TEXTURE_2D, 0, internalFormat, srcFormat, srcType, videoFrame);
```
where `texture` is an already-initialized texture and `videoFrame` a `VideoFrame` object.
I think @dalecurtis's HTML PR intends to address this: https://github.com/whatwg/html/pull/6589
Adding VideoFrame as a CanvasImageSource allows drawImage(), createImageBitmap(), and texImage() to interoperate with VideoFrames.
But perhaps additional updates are needed? Where is the behavior defined for earlier types of CanvasImageSource?
For the WebGL extensions described earlier in this issue, that work is paused for a bit to focus on the WebGPU strategy instead. Updates in the crbug describe a bit more:
https://bugs.chromium.org/p/chromium/issues/detail?id=1142754#c21 https://bugs.chromium.org/p/chromium/issues/detail?id=1142754#c45
Hmm, these are good questions, and they fall right into an area of GL that I've never fully understood. From a logical standpoint, `srcFormat` and `srcType` aren't meaningful for a WebGL `CanvasImageSource`; the source buffer doesn't need to be described because the `CanvasImageSource` knows, and in practice it's not usually a block of memory like GL `texImage2D()` expects.
My mental model is that this acts like a texture-to-texture copy; that is, sample at each coordinate of `videoFrame` and write that sample to the texture in whatever format the destination texture is in.
In Chrome's implementation, the `CanvasImageSource` for `VideoFrame` handles color spaces the same way as the `<video>` rendering path, which I believe looks like sRGB in WebGL. I'm not actually sure if there is a way to get linear RGB by the `texImage2D()` path, but that sure would be useful. In the future, when the WebGL implementation supports HDR/WCG, there will be other intermediate color spaces available to the implementation.
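To make that model concrete, here is a minimal sketch of the upload path under discussion, assuming the browser accepts a `VideoFrame` as a `TexImageSource` (per the HTML PR linked above); `gl` and `videoFrame` are assumed to already exist, and the format/type arguments describe only the destination:

```js
// Sketch only: upload a decoded VideoFrame into a WebGL texture.
// Assumes `gl` is a WebGLRenderingContext and `videoFrame` a VideoFrame.
const texture = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, texture);

// Sampling parameters so the texture is complete without mipmaps.
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);

// Format/type describe the destination texture; the source is described by
// the VideoFrame itself, so any YUV->RGB conversion is done by the browser.
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, videoFrame);

// Close the frame once uploaded to release decoder resources.
videoFrame.close();
```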
(aside: I believe Chris means to say "WebGPU strategy" above.)
Correct! I've just made that edit.
@kenrussell @domenic to opine.
Background: for now WebCodecs implicitly assumes sRGB. We plan to support more (#47). But, looking just at current support, should the spec (or the HTML PR) say more about how color is handled by the canvas.drawImage() and texImage*() APIs?
I think @ccameron-chromium is the expert you need. In https://github.com/whatwg/html/pull/6562 he added color space support to HTML canvas, so we need to at least be compatible with what's there. It looks like the spec currently says
When drawing content to a 2D context, all inputs must be converted to the context's color space before drawing.
and
There do not exist any inputs to a 2D context for which the color space is undefined. The color space for CSS colors is defined in CSS Color. The color space for images that specify no color profile information is assumed to be 'srgb', as specified in the Color Spaces of Untagged Colors section of CSS Color.
which might cover you?
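As a rough illustration of that rule (a sketch, assuming the browser supports both the `colorSpace` 2D context option and `VideoFrame` as a `CanvasImageSource`), drawing a frame into a context created with an explicit color space should convert the pixels into that space before drawing:

```js
// Sketch: draw a VideoFrame into a Display P3 2D canvas.
// Assumes `videoFrame` is an existing decoded VideoFrame.
const canvas = new OffscreenCanvas(videoFrame.displayWidth, videoFrame.displayHeight);
const ctx = canvas.getContext('2d', { colorSpace: 'display-p3' });

// Per the canvas color space rules quoted above, the input is converted
// to the context's color space (here Display P3) before drawing.
ctx.drawImage(videoFrame, 0, 0);
videoFrame.close();
```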
My inclination about adding Y/U/V plane access is that if it is to be added, one should request something to the effect of "I want the Rec709 Y plane", specifying both the color space and the plane. This is very well-defined, and doesn't depend on any implementation details.
If the video is Rec709, then one can end up in the optimal situation of sampling the Y plane that came out of the decoder. Or one may be sampling a copy that was made (but one that was made by just copying the one plane, which is more efficient). Or perhaps this is on a platform with some sort of workaround where the video came out as something totally different (say, 444, or RGBA), because of some bizarre bug, and one ends up getting a color-converted copy.
This requires that a complementary API be present which can tell the application "this frame is Rec709". That enables the application to select the Y plane format that is going to be the most efficient. (And if the application is sloppy and always says "I want Rec709 Y", and gets a video that is something totally different, then they suffer in terms of performance but not correctness).
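To sketch what such a request could look like (a purely hypothetical shape for illustration; `getPlaneTexture` and its options are made up and exist in no spec or implementation):

```js
// Hypothetical sketch only -- not a real WebCodecs or WebGL API.
// The application states exactly what it wants: the Y plane, as Rec.709.
const yPlaneTexture = ext.getPlaneTexture(videoFrame, {
  colorSpace: 'rec709',  // the caller names the color space explicitly
  plane: 'Y',            // and the specific plane it wants to sample
});

// If the decoded frame really is Rec.709 YUV, this can be the decoder's own
// Y plane. Otherwise the implementation returns a converted copy: slower,
// but still correct.
```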
Triage note: marking this as 'extension', since the proposed WebGL API adds new interfaces without breaking WebCodecs.
@djuffin @padenot Can we close this issue?
I've proposed that we talk about this at TPAC, in a joint meeting with the WebGL/WebGPU folks. There has been demand from industry for some time.
@padenot would you email me at kbr at chromium dot org? If we're to advance this feature then I think it would be most productive to do it in the WebGL and WebGPU working group meetings rather than waiting for TPAC. WebGL is hosted at Khronos rather than the W3C and WG members are not currently planning to attend TPAC.
Note that WebGPU's functionality of importing external textures provides a zero-copy way to bring YUV VideoFrames into WebGPU shaders. I wonder whether that path addresses this use case.
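For reference, a minimal sketch of that WebGPU path, assuming `device` is an existing `GPUDevice`, `videoFrame` a decoded `VideoFrame`, and `bindGroupLayout` a layout with an external-texture binding:

```js
// Sketch: zero-copy import of a VideoFrame into WebGPU.
// Assumes `device` (GPUDevice) and `videoFrame` (VideoFrame) already exist.
const externalTexture = device.importExternalTexture({ source: videoFrame });

// In WGSL the frame is sampled as a texture_external, e.g.:
//   @group(0) @binding(0) var frameTex: texture_external;
//   @group(0) @binding(1) var frameSampler: sampler;
//   ... textureSampleBaseClampToEdge(frameTex, frameSampler, uv) ...
//
// The external texture is only valid until the current task completes,
// so it must be bound and used within the same frame it was imported.
const bindGroup = device.createBindGroup({
  layout: bindGroupLayout,  // assumed: a layout with an externalTexture entry
  entries: [{ binding: 0, resource: externalTexture }],
});
```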
I'm also interested in this from the encoding perspective - writing a VideoFrame currently requires converting to a Uint8Array from `gl.readPixels` rather than passing a Texture or Framebuffer.
Up until this point I've used a `MediaStreamTrackProcessor` on the canvas, which is generally performant, but I'd like to be able to write the individual passes from a Framebuffer to my VideoFrame, and currently reading from the framebuffer is very slow.
Happy to email separately about this @kenrussell.
@akre54 you can construct a `VideoFrame` from either an `HTMLCanvasElement` or `OffscreenCanvas`, using WebGL to render to either of those surfaces. The data will remain on the GPU, though a copy is necessary.
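A minimal sketch of that path, assuming `canvas` is an `OffscreenCanvas` that a WebGL context has just rendered into:

```js
// Sketch: wrap the current contents of a WebGL canvas in a VideoFrame.
// Assumes `canvas` is an OffscreenCanvas a WebGL context has drawn into.
const frame = new VideoFrame(canvas, {
  timestamp: performance.now() * 1000,  // microseconds; any monotonic clock works
});

// The frame can now go straight to a VideoEncoder, e.g.:
// encoder.encode(frame, { keyFrame: false });
frame.close();
```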
It's not possible to render directly to a `VideoFrame` from WebGL, nor to copy individual `WebGLTexture`s into a `VideoFrame`. The browser's media stack and WebGL implementation may be hosted on different graphics APIs (and, in Chrome, currently are, on multiple platforms), so the surfaces that interoperate between the two APIs have to be allocated specially. This is the case for all of the input types to `VideoFrame` per the MDN documentation, such as `HTMLCanvasElement` and `OffscreenCanvas`.
Ah that's a bummer but thanks for the info.
Up until this point we've been using the canvas for capturing the stream but the goal is to take the separate shader passes and write them out individually with an alpha channel, similar to Blender's render passes, instead of one unified composited image. I'll do some experimentation to see if drawing the individual layers and capturing the ReadableStream value is faster than writing to Framebuffers - I suspect it might be.
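For completeness, here's a sketch of that canvas-capture path, assuming `canvas` is the element the shader passes render into (the 30 fps capture rate is illustrative only):

```js
// Sketch: pull VideoFrames off a canvas capture stream.
// Assumes `canvas` is an HTMLCanvasElement the WebGL passes render into.
const track = canvas.captureStream(30).getVideoTracks()[0];
const processor = new MediaStreamTrackProcessor({ track });
const reader = processor.readable.getReader();

async function pump() {
  for (;;) {
    const { value: frame, done } = await reader.read();
    if (done) return;
    // Hand the frame to a VideoEncoder, a worker, etc., then release it.
    frame.close();
  }
}
pump();
```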
Current proposals provide for accessing decoded frames as WebGL textures, but these would be RGB, implying a conversion for most content. We should investigate whether we can provide access to individual planes.