w3c / webcodecs

WebCodecs is a flexible web API for encoding and decoding audio and video.
https://w3c.github.io/webcodecs/
Other
953 stars 135 forks source link

Odd sized video issues with NV12 format Video Frames #638

Closed blaz-rupnik closed 11 months ago

blaz-rupnik commented 1 year ago

When Video Decoder creates a VideoFrame with odd sized width or height, the allocationSize on VideoFrame throws an error (We use NV12 format).

Uncaught (in promise) TypeError: Failed to execute 'allocationSize' on 'VideoFrame': Invalid visibleRect. Expected height to be a multiple of 2 in plane 1 for format NV12; was 761. Consider rounding visibleRect inward or outward to a sample boundary.

Mentioned option is to round the visibleRect. I tried rounding up rect as an option added to allocationSize method but I also get an error since y + height cannot be more than the coded Height. Coded height is also readonly inside the VideoFrame so not sure what are my options for resolving this.

Djuffin commented 1 year ago

I opened Chromium issue to track this: https://crbug.com/1413632

Djuffin commented 1 year ago

This spec issue can be closed now.

dalecurtis commented 1 year ago

Hmm, yeah I don't recall why we're applying that to visible rect. Seems like an oversight. The advice here would be to round down and not up (i.e., drop odd row) to 760. We'll look into it more on the Chrome issue.

dalecurtis commented 1 year ago

Ah right. You can only use odd sizes with I444 or ARGB formats not subsampled ones. While the Y plane supports an odd size, the subsampled UV plane requires making a decision on rounding -- so this message is telling you that you need to make that choice. In this case you must round down.

jkosir commented 1 year ago

@dalecurtis YUV formats support odd-sized frames, Y plane is full size and then U and V are rounded up, which effectively results in some 2x2 chroma blocks to not be fully used, the bottom or right (or bottom and right) parts of block are ignored for pixels at the edge. So for a 720x761 frame there's 720x761 bytes in Y plane, 360*381 in U and V planes.

Not sure if this is standardised somewhere but it's a convention in libvpx, imagemagick etc.:

This is also supported in chromium, for a 720x761 VP9 yuv420p video I get a VideoFrame (NV12 or I420) with displayHeight, displayWidth, codedWidth and codedHeight all with the odd-sized dimensions, e.g. 720x761. Such a frame is rendered correctly (using canvasContext2d.drawImage(frame)).

The problem is with frame.copyTo() method, which does not accept odd-sized dimensions in rect parameter, nor does allow dimensions bigger than Y plane. So in this case the maximum rect that I can copy out of frame is 720x760. The copy is therefore missing the data for about one row of pixels of the video frame. Also this seems like the only way to access raw data of the video frame and thus there's simply no way to get this data?

This frame basically has 720x761 (Y) + 360*381 (U) + 360*381 (V) bytes of data while using copyTo I can only access 720x760 (Y) + 360x380 (U) + 360*380 (V) bytes.

It's totally sensible if copyTo() doesn't handle odd sizes of YUV frames, however if such a frame is given by the decoder there's no way to access all of it's data. I don't see anything regarding this in the spec, is this behaviour correct per spec then?

dalecurtis commented 1 year ago

Hmm, I thought that would show up as having a coded size of 720x762, but you're right it doesn't. Using: https://source.chromium.org/chromium/chromium/src/+/main:media/test/data/bear-vp9-odd-dimensions.webm

let v = VideoFrame(document.querySelector('video'));
console.log(v);
VideoFrame {format: 'NV12', timestamp: 0, duration: 33000, codedWidth: 161, codedHeight: 121, …}
codedHeight: 121
codedRect: DOMRectReadOnly {x: 0, y: 0, width: 161, height: 121, top: 0, width: 161, bottom: 121, right: 161}
codedWidth: 161
colorSpace: VideoColorSpace {primaries: 'bt709', transfer: 'bt709', matrix: 'bt709', fullRange: false}
displayHeight: 121
displayWidth: 161
duration: 33000
format: "NV12"
timestamp: 0
visibleRect: DOMRectReadOnly {x: 0, y: 0, width: 161, height: 121, top: 0, width: 161, bottom: 121, right: 161}

I think the spec language will need some work, so reopening this. https://w3c.github.io/webcodecs/#pixel-format

The codedWidth and codedHeight MUST be even.

jkosir commented 1 year ago

In case someone runs into the same issue, the pixel data for entire (odd sized) NV12 Video Frame can be accessed by first converting to ImageBitmap using createImageBitmap then creating a VideoFrame from said image bitmap. This results in a BGRA VideoFrame on which copyTo() then works with odd dimensions.

dalecurtis commented 1 year ago

You might also try drawImage() to canvas with willReadFrequently set and use getImageData().

keithseifert1207 commented 1 year ago

I have run into this issue as well with the reading odd sized i420/nv12 frames that I do not create. Another edge case to consider is when the video frame size is even, in my case 370x210, but that frame size leaves the UV planes as odd, 185x105. In this case copyTo currently appears to function, but the buffer data is not correct.

sandersdan commented 1 year ago

This is related to issue #348, which concerns odd visibleRects. Interestingly libvpx in particular does allow odd coded size, but it also requires extra stride in a way that is equivalent to rounding the coded size up (and not just by 1px). If we actually made tightly-packed odd NV12 frames, we couldn't feed them to libvpx for encoding without copying them first.

I'm okay with standardizing on the definition in https://github.com/w3c/webcodecs/issues/638#issuecomment-1430117008 for copyTo() and new VideoFrame(), especially since (if I recall correctly), the new VideoFrame() implementation is allowed to simply over-allocate and return a frame with a larger codedSize.

The only non-workaroundable issue here seems to be that it's possible for the browser to create frames that can't be fully copied; an easy fix in Chrome would be to simply report a larger coded size (assuming that these frames are all already over-allocated).

keithseifert1207 commented 1 year ago

I fixed this issue using the createImageBitmap that Jakob mentioned. It works as expected with both i420 and nv12 frames. That means the frame has all the correct data, not seeing any visual differences.

For both odd frame cases and odd half frame sizes: allocationSize needs to return the correct required buffer size. copyTo needs to correctly transfer the data in the same fashion as createImageBitmap currently does.

I agree with changing the specification to state that frame size should be even but don't fail when they are not.

sandersdan commented 1 year ago

This turns out to be relatively straightforward, I am preparing a PR for spec text changes.

Proposal

Support odd coded sizes and copy rect for subsampled formats, by rounding the sizes outward to sample boundaries separately in each plane.

Odd offsets for visibleRect and copy rect will remain unsupported.

Risks

Odd sizes are less standard than sample-aligned sizes, increasing the risk of implementation errors and interoperability issues. This change does not actually require WebCodecs implementations to support odd VideoFrames; it is possible to support only copying to and from buffers in these defined formats while creating VideoFrames adhering to stricter rules.

This proposal changes previously exception-generating method calls to be supported operations.

sandersdan commented 1 year ago

Ping @youennf, @padenot for any comments before moving forward with the PR (#666).

padenot commented 1 year ago

I'm sorry I'm behind of lots of things and completely missed this one. I'll get someone internally that implements this to have a look.

sandersdan commented 1 year ago

Ping @padenot