Closed hmaarrfk closed 3 weeks ago
The problem with this patch is that when the fast path is used, it disrupts the structure of the underlying pixel format. Any code that relies on that behavior is going to break. This was discussed in https://github.com/PyAV-Org/PyAV/pull/788
Yes. I remember now. Ultimately this function doesn't give that guarantee. It al4eady changes between a view and a copy.
Would you consider the addition of a keyword argument to specify the users intention?
Any libraries that uses on to_ndarray()
for VideoFrames probably without realizing it, reply on the underlying data format to be preserved. auto-editor, for example, depends on frame.to_ndarray().to_bytes()
to return the underlying bytes so the ffmpeg cli can treat them as raw frames. This allows for efficient rendering videos.
What is the point of to_ndarray()
being fast if it returns data in an undefined layout? Do you have a particular use case?
Any libraries that uses on to_ndarray() for VideoFrames probably without realizing it, reply on the underlying data format to be preserved.
The data "format" hasn't changed.
What has changed is the fact that VideoFrame
and the returned numpy
array share the same memory. If you think this is an issue, i'm happy to add different function, but the in its current form on main to_ndarray
does the following
yuv420p
yuvj420p
-- data copyyuyv422
-- data copygbrp10be", "gbrp12be", "gbrp14be", "gbrp16be", "gbrp10le", "gbrp12le", "gbrp14le", "gbrp16le"
-- data copy"gbrpf32be", "gbrpf32le"
-- data copy"rgb24", "bgr24"
-- data view"gray", "gray8", "rgb8", "bgr8"
-- data viewso rgb24 and gray formats are returning views. I presume it was because it was easier to guarantee correctness then.
I'm asking here that yuv420p
and yuvj420p
return "maybe views" if the underlying image data is well organized (according to the checks in the if statement).
auto-editor, for example, depends on frame.to_ndarray().to_bytes() to return the underlying bytes so the ffmpeg cli can treat them as raw frames.
This should still work. I can add a test for this usecase.
What is the point of to_ndarray() being fast if it returns data in an undefined layout?
I utilize the well defined layout of the YUV420 format in FFMPEG to avoid the memory copy.
For a 8x4 array of pixels, YUV420 and YUV420J would store:
yyyyyyyy
yyyyyyyy
yyyyyyyy
yyyyyyyy
uuuuuuuu
vvvvvvvv
From wikipedia: https://en.wikipedia.org/wiki/Chroma_subsampling
So the "fastpath" checks that the memory is all contiguous, and simply returns a pointer to the underlying data from the FFMPEG VideoFrame
object.
Avoiding this copy is pretty important since for 4k (or 8k! which I am interested in) video decoding:
3840 x 2160 x 1.5 = 12MB
per frame.
Copying this around unnecessarily creates REAL problems with your CPU cache and just slows things down alot.
As for my usecase, it revolves around 4k and 8k video encoding / decoding. And as such benefits from what appears to be micro optimizations.
I can add a test if you want, but generally speaking, I remember that without my check, something already failed. I recall it being due to some odd image shape in the testing suite which was interesting to learn about.
It applying the fastpath does break auto-editor though:
It wouldn't be an issue as long as the fastpath is not applied by default. I'm sympathetic to your use-case. I do think having a new default argument is a little janky, with the **kwargs
already there.
I think the way forward is make a new method to_ndarray2()
. It would handle the yuv420p
yuvj420p
like your code does now. Making it a new method has the benefit of side-stepping any backwards compatibility problems.
It applying the fastpath does break auto-editor though:
It applying the fastpath does break auto-editor though:
This is really strange....
I find on place where you use the "naked" to_ndarray
https://github.com/WyattBlue/auto-editor/blob/8f11ab54fade4fa5ab4cd7cb43ae54a4212dbff3/auto_editor/analyze.py#L392
but nothing jumps to me as getting mutated.
(In fact you could save yourself some time by doing current_frame = frame.to_ndarray().astype('int16')
but that is beside the point)
I would love to recreate that screenshot to see what might be going on?
I have a feeling that maybe I'm not adding enough safeguards on the fastpath.
Can you teach me how you got to your screenshot?
The other place is https://github.com/WyattBlue/auto-editor/blob/8f11ab54fade4fa5ab4cd7cb43ae54a4212dbff3/auto_editor/render/video.py#L370 but also seems "safe"
I suspect the reason that I don't see this bug it is because you are making use of elaborate filters and using filter graphs which is not what workflow i go through.
I added two checks for data continuity:
y_plane._buffer_ptr + y_plane.buffer_size * bytes_per_pixel == u_plane._buffer_ptr and
u_plane._buffer_ptr + u_plane.buffer_size * bytes_per_pixel == v_plane._buffer_ptr
I think they should resolve things, but I need to check that it is still "correct" in the sense that the fast path is still triggered.
I need to recheck my buffer continuity check, it doesn't seem to be triggering "true" on what I thought it should.
Hmmm, it seems that the array isn't guaranteed to be contiguous, even in the "nice" cases
ipdb> u_plane.buffer_ptr - (y_plane.buffer_ptr + y_plane.buffer_size)
4864
ipdb> v_plane.buffer_ptr - (u_plane.buffer_ptr + u_plane.buffer_size)
1728
how is any of my code working.....
it seems that maybe the function i'm looking for is:
get_buffer2
that would allow me to create the buffer that I need to ensure that memory isn't copied.
https://ffmpeg.org/doxygen/6.0/structAVCodecContext.html#aef79333a4c6abf1628c55d75ec82bede
Thank you for your thoroughness in testing!
Thank you for considering.