Having access to a decoded plane would be useful for a use case like super-sampling via WebGL.
I'm thinking of a complete flow where it's possible to use a shader-based preprocessor to super-res the image, route the result back to the WebCodecs-exposed encoder, and generate a segment of media ready for MSE to take over for regular playback.
There is a proposal in progress for a WebGL extension that would enable this, hopefully it will be available publicly soon.
The proposal from Jie @ Intel is now public: https://docs.google.com/document/d/1ULhbTWlS_vz3AUKAZXNa0TSlwtFB7ItuiFhlqseLUTU/edit
The Chromium bug where the prototyping work is tracked is here: https://bugs.chromium.org/p/chromium/issues/detail?id=1142754
FYI, there is also a proposal (or proposals) for accessing planes of GPU-backed VideoFrames using WebGPU: https://docs.google.com/document/d/1MmLTO7gmOBp9fIccrlbqYcGOEydxYKbQwOYBY6gxZ5M/edit#
Any news on this? There is no information anywhere on what happens when one does:
```js
const internalFormat = gl.RGBA;
const srcFormat = /* something ?! use RGBA because a conversion is implied? How is the color space known? */;
const srcType = gl.UNSIGNED_BYTE;
gl.bindTexture(gl.TEXTURE_2D, texture);
gl.texImage2D(gl.TEXTURE_2D, 0, internalFormat, srcFormat, srcType, videoFrame);
```
where `texture` is an already-initialized texture and `videoFrame` a `VideoFrame` object.
I think @dalecurtis's HTML PR intends to address this: https://github.com/whatwg/html/pull/6589
Adding VideoFrame as a CanvasImageSource allows drawImage(), createImageBitmap(), and texImage() to interoperate with VideoFrames.
But perhaps additional updates are needed? Where is the behavior defined for earlier types of CanvasImageSource?
For the WebGL extensions described earlier in this issue, that work is paused for a bit to focus on the WebGPU strategy instead. Updates in the crbug describe a bit more:
https://bugs.chromium.org/p/chromium/issues/detail?id=1142754#c21 https://bugs.chromium.org/p/chromium/issues/detail?id=1142754#c45
Hmm, these are good questions, and they fall right into an area of GL that I've never fully understood. From a logical standpoint, `srcFormat` and `srcType` aren't meaningful for a WebGL `CanvasImageSource`; the source buffer doesn't need to be described because the `CanvasImageSource` knows, and in practice it's not usually a block of memory like GL `texImage2D()` expects.
My mental model is that this acts like a texture-to-texture copy; that is, sample at each coordinate of `videoFrame` and write that sample to the texture in whatever format the destination texture is in.
In Chrome's implementation, the `CanvasImageSource` for `VideoFrame` handles color spaces the same way as the `<video>` rendering path, which I believe looks like sRGB in WebGL. I'm not actually sure if there is a way to get linear RGB by the `texImage2D()` path, but that sure would be useful. In the future, when the WebGL implementation supports HDR/WCG, there will be other intermediate color spaces available to the implementation.
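To make that model concrete, here is a minimal sketch of the upload path under discussion, assuming the browser accepts a `VideoFrame` as a `TexImageSource` (per the HTML PR linked above); `gl` and `videoFrame` are assumed to already exist, and the format/type arguments describe only the destination:

```js
// Sketch only: upload a decoded VideoFrame into a WebGL texture.
// Assumes `gl` is a WebGLRenderingContext and `videoFrame` a VideoFrame.
const texture = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, texture);

// Sampling parameters so the texture is complete without mipmaps.
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);

// Format/type describe the destination texture; the source is described by
// the VideoFrame itself, so any YUV->RGB conversion is done by the browser.
gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA, gl.UNSIGNED_BYTE, videoFrame);

// Close the frame once uploaded to release decoder resources.
videoFrame.close();
```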
(aside: I believe Chris means to say "WebGPU strategy" above.)
Correct! I've just made that edit.
@kenrussell @domenic to opine.
Background: for now WebCodecs implicitly assumes sRGB. We plan to support more (#47). But, looking just at current support, should the spec (or the HTML PR) say more about how color is handled by the canvas.drawImage() and texImage*() APIs?
I think @ccameron-chromium is the expert you need. In https://github.com/whatwg/html/pull/6562 he added color space support to HTML canvas, so we need to at least be compatible with what's there. It looks like the spec currently says
When drawing content to a 2D context, all inputs must be converted to the context's color space before drawing.
and
There do not exist any inputs to a 2D context for which the color space is undefined. The color space for CSS colors is defined in CSS Color. The color space for images that specify no color profile information is assumed to be 'srgb', as specified in the Color Spaces of Untagged Colors section of CSS Color.
which might cover you?
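As a rough illustration of that rule (a sketch, assuming the browser supports both the `colorSpace` 2D context option and `VideoFrame` as a `CanvasImageSource`), drawing a frame into a context created with an explicit color space should convert the pixels into that space before drawing:

```js
// Sketch: draw a VideoFrame into a Display P3 2D canvas.
// Assumes `videoFrame` is an existing decoded VideoFrame.
const canvas = new OffscreenCanvas(videoFrame.displayWidth, videoFrame.displayHeight);
const ctx = canvas.getContext('2d', { colorSpace: 'display-p3' });

// Per the canvas color space rules quoted above, the input is converted
// to the context's color space (here Display P3) before drawing.
ctx.drawImage(videoFrame, 0, 0);
videoFrame.close();
```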
My inclination about adding Y/U/V plane access is that if it is to be added, one should request something to the effect of "I want the Rec709 Y plane", specifying both the color space and the plane. This is very well-defined, and doesn't depend on any implementation details.
If the video is Rec709, then one can end up in the optimal situation of sampling the Y plane that came out of the decoder. Or one may be sampling a copy that was made (but one that was made by just copying the one plane, which is more efficient). Or perhaps this is on a platform with some sort of workaround where the video came out as something totally different (say, 444, or RGBA), because of some bizarre bug, and one ends up getting a color-converted copy.
This requires that a complementary API be present which can tell the application "this frame is Rec709". That enables the application to select the Y plane format that is going to be the most efficient. (And if the application is sloppy and always says "I want Rec709 Y", and gets a video that is something totally different, then they suffer in terms of performance but not correctness).
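To sketch what such a request could look like (a purely hypothetical shape for illustration; `getPlaneTexture` and its options are made up and exist in no spec or implementation):

```js
// Hypothetical sketch only -- not a real WebCodecs or WebGL API.
// The application states exactly what it wants: the Y plane, as Rec.709.
const yPlaneTexture = ext.getPlaneTexture(videoFrame, {
  colorSpace: 'rec709',  // the caller names the color space explicitly
  plane: 'Y',            // and the specific plane it wants to sample
});

// If the decoded frame really is Rec.709 YUV, this can be the decoder's own
// Y plane. Otherwise the implementation returns a converted copy: slower,
// but still correct.
```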
Triage note: marking this as 'extension', since the proposed WebGL API adds new interfaces without breaking WebCodecs.
@djuffin @padenot Can we close this issue?
I've proposed that we talk about this at TPAC, in a joint meeting with the WebGL/WebGPU folks. There has been demand from industry for some time.
@padenot would you email me at kbr at chromium dot org? If we're to advance this feature then I think it would be most productive to do it in the WebGL and WebGPU working group meetings rather than waiting for TPAC. WebGL is hosted at Khronos rather than the W3C and WG members are not currently planning to attend TPAC.
Note that WebGPU's functionality of importing external textures provides a zero-copy way to bring YUV VideoFrames into WebGPU shaders. I wonder whether that path addresses this use case.
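For reference, a minimal sketch of that WebGPU path, assuming `device` is an existing `GPUDevice`, `videoFrame` a decoded `VideoFrame`, and `bindGroupLayout` a layout with an external-texture binding:

```js
// Sketch: zero-copy import of a VideoFrame into WebGPU.
// Assumes `device` (GPUDevice) and `videoFrame` (VideoFrame) already exist.
const externalTexture = device.importExternalTexture({ source: videoFrame });

// In WGSL the frame is sampled as a texture_external, e.g.:
//   @group(0) @binding(0) var frameTex: texture_external;
//   @group(0) @binding(1) var frameSampler: sampler;
//   ... textureSampleBaseClampToEdge(frameTex, frameSampler, uv) ...
//
// The external texture is only valid until the current task completes,
// so it must be bound and used within the same frame it was imported.
const bindGroup = device.createBindGroup({
  layout: bindGroupLayout,  // assumed: a layout with an externalTexture entry
  entries: [{ binding: 0, resource: externalTexture }],
});
```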
I'm also interested in this from the encoding perspective - writing a VideoFrame currently requires converting to a Uint8Array from `gl.readPixels` rather than passing a Texture or Framebuffer.
Up until this point I've used a `MediaStreamTrackProcessor` on the canvas, which is generally performant, but I'd like to be able to write the individual passes from a Framebuffer to my VideoFrame, and currently reading from the framebuffer is very slow.
Happy to email separately about this @kenrussell.
@akre54 you can construct a `VideoFrame` from either an `HTMLCanvasElement` or `OffscreenCanvas`, using WebGL to render to either of those surfaces. The data will remain on the GPU, though a copy is necessary.
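A minimal sketch of that path, assuming `canvas` is an `OffscreenCanvas` that a WebGL context has just rendered into:

```js
// Sketch: wrap the current contents of a WebGL canvas in a VideoFrame.
// Assumes `canvas` is an OffscreenCanvas a WebGL context has drawn into.
const frame = new VideoFrame(canvas, {
  timestamp: performance.now() * 1000,  // microseconds; any monotonic clock works
});

// The frame can now go straight to a VideoEncoder, e.g.:
// encoder.encode(frame, { keyFrame: false });
frame.close();
```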
It's not possible to render directly to a `VideoFrame` from WebGL, nor to copy individual `WebGLTexture`s into a `VideoFrame`. The browser's media stack and WebGL implementation may be hosted on different graphics APIs (and, in Chrome, currently are, on multiple platforms), so the surfaces that interoperate between the two APIs have to be allocated specially. This is the case for all of the input types to `VideoFrame` per the MDN documentation, such as `HTMLCanvasElement` and `OffscreenCanvas`.
Ah that's a bummer but thanks for the info.
Up until this point we've been using the canvas for capturing the stream but the goal is to take the separate shader passes and write them out individually with an alpha channel, similar to Blender's render passes, instead of one unified composited image. I'll do some experimentation to see if drawing the individual layers and capturing the ReadableStream value is faster than writing to Framebuffers - I suspect it might be.
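For completeness, here's a sketch of that canvas-capture path, assuming `canvas` is the element the shader passes render into (the 30 fps capture rate is illustrative only):

```js
// Sketch: pull VideoFrames off a canvas capture stream.
// Assumes `canvas` is an HTMLCanvasElement the WebGL passes render into.
const track = canvas.captureStream(30).getVideoTracks()[0];
const processor = new MediaStreamTrackProcessor({ track });
const reader = processor.readable.getReader();

async function pump() {
  for (;;) {
    const { value: frame, done } = await reader.read();
    if (done) return;
    // Hand the frame to a VideoEncoder, a worker, etc., then release it.
    frame.close();
  }
}
pump();
```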
Current proposals provide for accessing decoded frames as WebGL textures, but these would be RGB, implying a conversion for most content. We should investigate whether we can provide access to individual planes.