gift-surg / GIFT-Grab

An open-source C++ and Python API for acquiring, processing and encoding video streams in real time. Supports several frame-grabber cards, standard-compliant network streams and video files. Python API is compatible with NumPy and SciPy.
BSD 3-Clause "New" or "Revised" License

Support adding frame-level metadata #66

Closed tvercaut closed 3 years ago

tvercaut commented 5 years ago

A few video containers allow the storage of frame-level metadata. FFmpeg allows such data to be stored in the metadata field of AVFrame, which is a pointer to an AVDictionary: https://www.ffmpeg.org/doxygen/trunk/structAVFrame.html#a5bde87fd101f66d6263bb451056dba13

An arbitrary string can then be stored using av_dict_set: https://www.ffmpeg.org/doxygen/trunk/group__lavu__dict.html#ga8d9c2de72b310cef8e6a28c9cd3acbbe

Ideally, we would use this to store a serialised protobuf message.
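One caveat worth sketching: av_dict_set takes both key and value as NUL-terminated C strings, whereas a serialised protobuf message is binary and may contain zero bytes. A text-safe encoding would therefore be needed first. The sketch below uses Base64 purely as an assumption (the encoding is not something decided in this thread), and base64_encode is a hypothetical helper name, not part of FFmpeg or protobuf.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Encode arbitrary bytes as standard Base64 (RFC 4648 alphabet, '=' padding).
// The output contains no NUL bytes, so it can be handed over as a C string.
std::string base64_encode(const std::vector<std::uint8_t>& bytes) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((bytes.size() + 2) / 3) * 4);

    std::size_t i = 0;
    // Full 3-byte groups map to 4 output characters each.
    while (i + 2 < bytes.size()) {
        const std::uint32_t n = (static_cast<std::uint32_t>(bytes[i]) << 16) |
                                (static_cast<std::uint32_t>(bytes[i + 1]) << 8) |
                                static_cast<std::uint32_t>(bytes[i + 2]);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += table[n & 63];
        i += 3;
    }

    // Remaining 1 or 2 bytes are padded with '='.
    const std::size_t rest = bytes.size() - i;
    if (rest == 1) {
        const std::uint32_t n = static_cast<std::uint32_t>(bytes[i]) << 16;
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += "==";
    } else if (rest == 2) {
        const std::uint32_t n = (static_cast<std::uint32_t>(bytes[i]) << 16) |
                                (static_cast<std::uint32_t>(bytes[i + 1]) << 8);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += table[(n >> 6) & 63];
        out += '=';
    }
    return out;
}
```

The encoded string could then be passed as the value argument, along the lines of av_dict_set(&frame->metadata, "some_key", encoded.c_str(), 0), where "some_key" is a placeholder. Whether the chosen container actually preserves the entry per frame would still need checking.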

Note that av_frame_set_metadata has been deprecated only because the metadata field is directly accessible: https://github.com/FFmpeg/FFmpeg/commit/7df37dd319f2d9d3e1becd5d433884e3ccfa1ee2

dzhoshkun commented 5 years ago

Also, the Fraunhofer page on HEVC contains potentially useful links.

dzhoshkun commented 5 years ago

for some private data of the user

tvercaut commented 5 years ago

For the record, a potentially relevant post I found: https://stackoverflow.com/questions/39853810/opaque-pointer-in-ffmpeg-avframe?rq=1

Note that there is also an AVFrameSideData field... https://www.ffmpeg.org/doxygen/trunk/structAVFrame.html#a44d40e03fe22a0511c9157dab22143ee

Might be worth asking on the libav-user mailing list.

dzhoshkun commented 5 years ago

Thanks Tom. AVFrameSideData seems to be quite specific, as its type field (of type AVFrameSideDataType) does not seem to provide an option for free-form data.

dzhoshkun commented 5 years ago

The opaque field does not seem to be respected by FFmpeg any more; AVBuffer should be used instead.

dzhoshkun commented 5 years ago

opaque_ref does not seem to work either. On top of this, it doesn't seem to be available in all FFmpeg versions (see above failing CI build). But maybe there's a missing step in the code, such as https://github.com/gift-surg/GIFT-Grab/issues/66#issuecomment-477509746

tvercaut commented 5 years ago

For the record, it seems like metadata in this context is also referred to as SEI (Supplemental Enhancement Information).

x265 mentions dynamic user SEI (https://x265.readthedocs.io/en/default/releasenotes.html):

Dynamic metadata may be either supplied as a bitstream via the userSEI field of x265_picture, or as a JSON file that can be parsed by x265 and inserted into the bitstream; use --dhdr10-info to specify the JSON file name, and --dhdr10-opt to enable optimization of inserting tone-map information only at IDR frames, or when the tone map information changes.

Potentially useful discussion threads:

tvercaut commented 5 years ago

Also of interest: Overview of HEVC High-Level Syntax and Reference Picture Management https://doi.org/10.1109/TCSVT.2012.2223052

dzhoshkun commented 5 years ago

In https://doi.org/10.1109/TCSVT.2012.2223052 a "User data registered / User data unregistered" SEI message type is listed, which was inherited from the H.264/AVC standard. This might be what we are looking for.
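For illustration, here is a rough sketch of how an unregistered user data SEI message could be framed by hand, assuming the H.264/AVC layout as I understand it: payload type 5, a 16-byte UUID followed by free-form bytes, type and size coded as runs of 0xFF plus a final byte, and emulation prevention applied to the payload. All function names are hypothetical, and this is not how GIFT-Grab, FFmpeg or x265 do it internally. For HEVC the NAL header is two bytes rather than one, so the header part would differ.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// SEI payload type and size are coded as a run of 0xFF bytes (each worth 255)
// followed by one final byte holding the remainder.
void append_sei_value(Bytes& out, std::size_t value) {
    while (value >= 255) {
        out.push_back(0xFF);
        value -= 255;
    }
    out.push_back(static_cast<std::uint8_t>(value));
}

// Insert an emulation prevention byte (0x03) after two consecutive zero bytes
// whenever the next byte is 0x00..0x03, so no start code appears in the data.
Bytes add_emulation_prevention(const Bytes& rbsp) {
    Bytes out;
    out.reserve(rbsp.size());
    int zeros = 0;
    for (std::uint8_t b : rbsp) {
        if (zeros >= 2 && b <= 0x03) {
            out.push_back(0x03);
            zeros = 0;
        }
        out.push_back(b);
        zeros = (b == 0x00) ? zeros + 1 : 0;
    }
    return out;
}

// Build an H.264 SEI NAL unit (without start code) carrying one
// user_data_unregistered message. Hypothetical helper for illustration only.
Bytes make_user_data_unregistered_sei(const std::array<std::uint8_t, 16>& uuid,
                                      const Bytes& user_data) {
    Bytes rbsp;
    append_sei_value(rbsp, 5);                               // payload type 5
    append_sei_value(rbsp, uuid.size() + user_data.size());  // payload size
    rbsp.insert(rbsp.end(), uuid.begin(), uuid.end());
    rbsp.insert(rbsp.end(), user_data.begin(), user_data.end());
    rbsp.push_back(0x80);  // trailing bits: a stop bit plus zero padding

    Bytes nal;
    nal.push_back(0x06);  // H.264 NAL header: nal_ref_idc 0, nal_unit_type 6 (SEI)
    const Bytes escaped = add_emulation_prevention(rbsp);
    nal.insert(nal.end(), escaped.begin(), escaped.end());
    return nal;
}
```

The user_data argument is where a serialised message would go. The UUID is chosen by the application so that a reader can recognise its own payloads and ignore others.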

dzhoshkun commented 5 years ago

Further potentially useful links:

joubs commented 3 years ago

Closing as out of date.