PyAV-Org / PyAV

Pythonic bindings for FFmpeg's libraries.
https://pyav.basswood-io.com/
BSD 3-Clause "New" or "Revised" License

Make VideoFrame.from_numpy_buffer support buffers with padding #1635

Closed by davidplowman 2 weeks ago

davidplowman commented 2 weeks ago

This is the last PR we need. Once it is merged, Raspberry Pi should be able to go back to the standard mainline PyAV distribution.

To explain the background for this PR:

The Raspberry Pi camera stack uses lots of hardware accelerators, and these have alignment requirements for the start of every image row. This means that the images it spits out normally have padding at the end of each row. libav is perfectly capable of using these buffers directly so long as we set its linesize values correctly. That's what most of this PR is about.
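To make the padding concrete, here is a minimal sketch of how a row stride grows when every row must start on an aligned address. The 64-byte alignment and the helper name `padded_row_stride` are made up for illustration; they are not the actual hardware requirement or anything in PyAV.

```python
def padded_row_stride(width: int, bytes_per_pixel: int, alignment: int) -> int:
    """Smallest multiple of `alignment` that can hold one row of pixels."""
    row_bytes = width * bytes_per_pixel
    # Ceiling division, then scale back up to a multiple of the alignment.
    return -(-row_bytes // alignment) * alignment


# Hypothetical numbers: 1000 RGB pixels per row, 64-byte row alignment.
row_bytes = 1000 * 3
stride = padded_row_stride(1000, 3, 64)
padding = stride - row_bytes
print(row_bytes, stride, padding)  # prints: 3000 3008 8
```

The stride (3008 here) is the value libav needs as its linesize. The 8 trailing bytes in each row are never read as pixels.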

VideoFrame.from_numpy_buffer is extended to accept buffers where each pixel row is contiguous but the image as a whole is not, because of the padding at the end of each row. I detect contiguous pixel rows by looking at the array's strides; the last one or two stride values are enough to deduce this. We strongly prefer from_numpy_buffer over from_ndarray because from_ndarray copies the image buffer, and those copies are really expensive for us.
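The stride check can be sketched roughly as follows. This is an illustration of the idea, not the code in the PR; `rows_are_contiguous` is a hypothetical helper that works on plain shape and stride tuples so it runs without NumPy.

```python
def rows_are_contiguous(shape, strides, itemsize):
    """True if the pixels within each row are tightly packed.

    Only the trailing strides are inspected. strides[0] (the row pitch) may
    be larger than a tightly packed row, which is what padding looks like.
    """
    if len(shape) == 2:
        # (height, width): one sample per pixel, e.g. gray or planar YUV.
        return strides[1] == itemsize
    if len(shape) == 3:
        # (height, width, channels): packed pixels, e.g. rgb24 or rgba.
        channels = shape[2]
        return tuple(strides[1:]) == (channels * itemsize, itemsize)
    return False


# Padded rgb24 image: 3000 usable bytes per row, row pitch of 3008.
print(rows_are_contiguous((4, 1000, 3), (3008, 3, 1), 1))  # True
# Every other pixel skipped within a row: not usable without a copy.
print(rows_are_contiguous((4, 500, 3), (3008, 6, 1), 1))   # False
```

With a NumPy array `a`, the arguments would be `a.shape`, `a.strides` and `a.itemsize`.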

I've also taken over the calculation of the linesizes directly, rather than calling another function. In all cases the values can be deduced almost directly from the array strides. This also fixes the problem that, when there is padding, the array width (times bytes_per_pixel) is no longer the correct linesize, whereas the underlying stride always is.
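A tiny worked example shows why the width-derived value breaks. The buffer contents and the helper `read_row` are invented for illustration: a 2-pixel-wide, 3-row gray image stored with a 4-byte row pitch.

```python
def read_row(buf: bytes, row: int, linesize: int, row_bytes: int) -> bytes:
    """Return the usable bytes of one row, given the row pitch (linesize)."""
    start = row * linesize
    return buf[start:start + row_bytes]


width, height, stride = 2, 3, 4  # 1 byte per pixel, 2 bytes of padding per row
buf = bytes([1, 2, 0, 0,
             3, 4, 0, 0,
             5, 6, 0, 0])

# linesize = width * bytes_per_pixel lands inside the previous row's padding.
print(list(read_row(buf, 1, width, width)))   # [0, 0]
# linesize = stride finds the real start of row 1.
print(list(read_row(buf, 1, stride, width)))  # [3, 4]
```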

There's one final gotcha: yuv420p images, which are common for us. Here too we usually have padding. For the Y rows, the padding is at the end of each row, as normal. But for the UV rows (the bottom third of the buffer), half the padding appears in the middle of each buffer row and the other half at the end. This means applications can't create a "view" on this buffer that omits the padding, as they can with the RGB buffer types.
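The layout of one buffer row in the UV region can be sketched like this. It assumes the chroma pitch is exactly half the luma stride, so each buffer row holds two half-width chroma rows, and that width and stride are both even. `chroma_row_segments` is a hypothetical helper, not part of PyAV.

```python
def chroma_row_segments(width: int, stride: int):
    """Byte ranges within one buffer row of the U/V region.

    Each buffer row of `stride` bytes holds two chroma rows. Each chroma row
    has a pitch of stride // 2, of which only width // 2 bytes are usable.
    Returns a list of (kind, start, end) tuples, end exclusive.
    """
    half_width, half_stride = width // 2, stride // 2
    return [
        ("data", 0, half_width),
        ("padding", half_width, half_stride),
        ("data", half_stride, half_stride + half_width),
        ("padding", half_stride + half_width, stride),
    ]


# Hypothetical sizes: 8 usable pixels per luma row, stored with a pitch of 16.
for kind, start, end in chroma_row_segments(8, 16):
    print(f"{kind:8s} bytes {start}..{end - 1}")
```

Because the usable bytes are split into two separate runs per buffer row, no single rectangular slice of the 2-D array selects exactly the pixel data.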

Instead, we have to let applications pass in the buffer with all the padding, together with a separate width parameter that tells us how much of each row is usable. The width parameter is optional; when it is omitted, the behaviour is the same as before. Again, so long as libav has the correct width and linesize values, it handles these buffers correctly.
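As a rough sketch of how the frame geometry might be derived from a padded yuv420p buffer plus an optional width: the function name, the `None` default and the exact checks below are assumptions made for illustration, not the PR's implementation.

```python
from typing import Optional, Sequence, Tuple


def yuv420p_geometry(
    shape: Sequence[int],
    strides: Sequence[int],
    width: Optional[int] = None,
) -> Tuple[int, int, Tuple[int, int, int]]:
    """Return (frame_width, frame_height, linesizes) for a 2-D uint8 buffer.

    The buffer has frame_height * 3 // 2 rows: the Y rows first, then the
    U and V data in the bottom third.
    """
    rows, cols = shape
    if rows % 3 != 0:
        raise ValueError("yuv420p buffer height must be a multiple of 3")
    if strides[1] != 1:
        raise ValueError("pixel rows must be contiguous")
    frame_height = rows * 2 // 3
    # Without an explicit width, fall back to the full array width.
    frame_width = cols if width is None else width
    pitch = strides[0]
    # Assumption: the U and V planes use half the luma pitch.
    return frame_width, frame_height, (pitch, pitch // 2, pitch // 2)


# Hypothetical 1000x720 image stored with a 1024-byte row pitch.
print(yuv420p_geometry((1080, 1024), (1024, 1), width=1000))
# → (1000, 720, (1024, 512, 512))
```

Omitting `width` treats the whole 1024-byte row as image data, which matches the case where the buffer has no padding.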

I've checked that to_ndarray works correctly on all these buffers (it removes the padding), and added quite a few extra tests to check that these new types of buffer behave as expected. I also added support for 32-bit RGBA types, which again are quite common for Raspberry Pi users.

I hope that all makes sense! If there's anything to discuss, or which you'd like me to look at again, I'm of course very happy to do so.

Thanks very much for your help in getting our various changes merged into mainline PyAV!

davidplowman commented 2 weeks ago

@WyattBlue Thanks very much! Just one last question - do you have a timetable yet for when the next PyAV release will be? No hurry, it's just so that I know when to keep an eye open so that I can release the necessary changes at the Raspberry Pi end.

WyattBlue commented 2 weeks ago

PyAV 14 will be released in late December or January.