PyAV-Org / PyAV

Pythonic bindings for FFmpeg's libraries.
https://pyav.basswood-io.com/
BSD 3-Clause "New" or "Revised" License

Expose `av_frame_make_writable` #1414

Closed abextm closed 3 months ago

abextm commented 4 months ago

Overview

I'm currently decoding some VP9 video, modifying its frames, and writing the result back out. Because I modify the plane data in place, subsequent frames get corrupted: the decoder keeps previous frames around so it can apply inter-frame (de)compression. FFmpeg provides av_frame_make_writable to support this use case (basically, it copies the plane data if you are not the sole owner of it).
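To illustrate the copy-on-write behaviour described above, here is a minimal pure-Python sketch. `RefBuffer` and `ToyFrame` are hypothetical stand-ins invented for this example; they are not PyAV or FFmpeg types, and the real implementation works on reference-counted AVBuffers in C.

```python
class RefBuffer:
    """Toy reference-counted pixel buffer (hypothetical, for illustration)."""

    def __init__(self, data):
        self.data = bytearray(data)  # always stores its own copy
        self.refcount = 1


class ToyFrame:
    """Toy frame that points at a possibly shared buffer."""

    def __init__(self, buf):
        self.buf = buf

    def ref(self):
        # Hand out another frame that shares the same buffer,
        # the way a decoder shares its reference frames with the caller.
        self.buf.refcount += 1
        return ToyFrame(self.buf)

    def is_writable(self):
        return self.buf.refcount == 1

    def make_writable(self):
        # Copy the data only when someone else still references it.
        if self.is_writable():
            return
        self.buf.refcount -= 1
        self.buf = RefBuffer(self.buf.data)


# The "decoder" keeps `reference`; the caller receives `output`.
reference = ToyFrame(RefBuffer(b"\x10" * 4))
output = reference.ref()

output.make_writable()          # detaches output from the shared buffer
output.buf.data[:] = bytes(4)   # in-place edit no longer touches `reference`
```

Without the `make_writable()` call, the final assignment would also change the data that `reference` sees, which is the corruption described in this issue.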

Existing FFmpeg API

av_frame_make_writable

Expected PyAV API

I would expect a `frame.make_writable()` method.

Example:

for frame in cont.decode(video=0):
    frame.make_writable()
    modify_frame(frame)

Investigation

I called av_frame_make_writable through ctypes, which resolves the problem.

Reproduction

import av
import ctypes
import numpy

# this is horrible -- do not do this
class ctAVFrame(ctypes.Structure):
    _fields_=[
        ("_ob_head", ctypes.c_byte * object.__basicsize__),
        ("vtable", ctypes.c_void_p),
        ("ptr", ctypes.c_void_p),
    ]
_av_frame_make_writable = ctypes.CDLL(av.video.frame.__file__).av_frame_make_writable
_av_frame_make_writable.argtypes = (ctypes.c_void_p, )
_av_frame_make_writable.restype = ctypes.c_int
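def av_frame_make_writable(frame: av.VideoFrame):
    # Reach into the extension object to get the underlying AVFrame pointer.
    ptr = ctypes.cast(id(frame), ctypes.POINTER(ctAVFrame)).contents.ptr
    ret = _av_frame_make_writable(ptr)
    # FFmpeg returns 0 on success and a negative AVERROR code on failure.
    if ret < 0:
        raise RuntimeError(f"av_frame_make_writable failed with code {ret}")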

in_cont = av.open("./input.webm")

out_cont = av.open("output.mkv", "w")
tmpl = in_cont.streams.video[0].codec_context
out_stream = out_cont.add_stream("libx264", tmpl.rate)
out_stream.options["preset"]="ultrafast"
out_stream.options["crf"]="28"
out_stream.width = tmpl.width
out_stream.height = tmpl.height
out_stream.pix_fmt = tmpl.pix_fmt
out_stream.thread_type = "AUTO"

for i, frame in enumerate(in_cont.decode(video=0)):
    if i % 20 == 0:
        # without this there is significant error
        # av_frame_make_writable(frame)
        # view the plane as raw bytes and zero it in place
        numpy.frombuffer(frame.planes[0], dtype=numpy.uint8).fill(0)

    out_cont.mux(out_stream.encode(frame))

    if i > 2000:
        break

# flush the encoder, then close the container so the output is finalized
out_cont.mux(out_stream.encode(None))
out_cont.close()

I was running this on a VP9 video from yt-dlp (though I would expect this to happen with any codec that uses inter-frame compression). With the fix commented out, there is a significant amount of error in the output.

Versions

Additional context

It would be nice if av_frame_clone were exposed too, though I'm not sure exposing it as .clone() makes much sense, since it does not clone the actual plane data.

moonsikpark commented 3 months ago

@abextm What use case do you expect if av_frame_clone() is exposed?

abextm commented 3 months ago

In my case I was running detection on some video, and if a frame hit a case I would write it to an output stream. If it was only a marginal hit, I would also write it to a second output with some debugging data overlaid on top of it. Without av_frame_clone, order becomes important: I have to write the unmodified frame before drawing onto the debug frame, which is sort of annoying. With av_frame_clone I could just clone, make_writable, then do whatever I want without affecting the non-debug feed.
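The clone, make_writable, draw flow above could look roughly like the following pure-Python sketch. Everything here is an assumption made for illustration: `CowFrame`, `route`, the `hit` and `strong` thresholds, and the list-based outputs are hypothetical and are not part of PyAV.

```python
from dataclasses import dataclass, field


@dataclass
class CowFrame:
    """Hypothetical stand-in for a decoded frame; not a PyAV class."""

    pixels: bytearray
    # One-element list used as a refcount cell shared between clones.
    owners: list = field(default_factory=lambda: [1])

    def clone(self):
        # New frame object, same pixel buffer (no plane data is copied).
        self.owners[0] += 1
        return CowFrame(self.pixels, self.owners)

    def make_writable(self):
        # Copy the pixels only if another frame still references them.
        if self.owners[0] > 1:
            self.owners[0] -= 1
            self.pixels = bytearray(self.pixels)
            self.owners = [1]


def route(frame, score, main_out, debug_out, hit=0.5, strong=0.8):
    """Send hits to main_out; marginal hits also go, annotated, to debug_out."""
    if score < hit:
        return
    if score < strong:
        debug = frame.clone()
        debug.make_writable()
        debug.pixels[0] = 0xFF   # stand-in for drawing debug data
        debug_out.append(debug)  # written before the main feed on purpose
    main_out.append(frame)


main_out, debug_out = [], []
route(CowFrame(bytearray(b"\x10" * 4)), 0.6, main_out, debug_out)
```

The debug frame is drawn on and written before the unmodified frame, yet the frame in `main_out` keeps its original pixels, so the write order stops mattering.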