Open dericed opened 6 years ago
Can you give us examples of what kind of data you might use this for? We already have a system for associating arbitrary/predefined data, it's called Tags, and I'm hesitant to add yet another one. Of course the granularity of tags is worse at the moment as you can only associate them down to the track level but not the frame level.
Tags are not suitable for per-frame info; they are better suited to info about a track as a whole. Am I wrong?
Examples of per-frame side metadata: EBU Tech 3349, or the Cooke protocol (summary in the MediaInfo library). Lens unit metadata / camera metadata with values that change often: iris position, zoom value. Note that you sometimes don't have this info when you start writing the file, e.g. with real-time capture. In MXF it is mixed with text / timecode / etc. in a single ancillary data track; the issue with that method is that several items share one track and there are sometimes synchronization issues (especially when remuxing).
A transcoder would be expected to pass such metadata through as is when compressing the video part.
Thanks for the example, Jérôme. Yeah, it makes sense having it interleaved together with the frame data. We would need something like this, I guess:
BlockGroup
+- Metadata (master, 0..n times)
   +- MetadataLaceNumber (unsigned integer, 1 time, default 0, states that the following key/value pair applies to the n-th laced frame in this BlockGroup)
   +- MetadataName (ASCII string (I'd prefer ASCII in order to correspond to tags); 1 time)
   +- MetadataString (Unicode string, 0..1 time)
   +- MetadataBinary (binary, 0..1 time)
It's similar to the tagging system but simplified. Of course we could make it more complicated: the structure deeper (e.g. one master inside the BlockGroup that contains all Metadata sub-masters), the lace number multiple (similar to Targets inside a Tag), but I'd prefer to keep it simple and shallow, especially as I don't see the need for the flexibility the tags' targeting system provides.
A SimpleBlock would not be able to contain such metadata. That's why it's called simple.
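To make the proposed layout concrete, here is a minimal Python sketch of it as an in-memory model. It is only an illustration under assumptions: the class and method names (`Metadata`, `BlockGroup`, `metadata_for_frame`) and the keys `IRIS` and `ZOOM` are hypothetical, no EBML element IDs are assigned, and nothing is serialized.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Metadata:
    """One key/value pair attached to one laced frame (hypothetical model)."""
    name: str                        # MetadataName: ASCII key, exactly once
    lace_number: int = 0             # MetadataLaceNumber: defaults to 0
    string: Optional[str] = None     # MetadataString: 0..1 times
    binary: Optional[bytes] = None   # MetadataBinary: 0..1 times

    def __post_init__(self) -> None:
        # Raises UnicodeEncodeError (a ValueError) if the key is not ASCII.
        self.name.encode("ascii")
        if self.lace_number < 0:
            raise ValueError("lace number must be an unsigned integer")


@dataclass
class BlockGroup:
    """A group of laced frames plus 0..n Metadata entries."""
    frames: List[bytes]
    metadata: List[Metadata] = field(default_factory=list)

    def metadata_for_frame(self, n: int) -> List[Metadata]:
        """Return the entries whose lace number selects the n-th laced frame."""
        if not 0 <= n < len(self.frames):
            raise IndexError("no such laced frame")
        return [m for m in self.metadata if m.lace_number == n]


# Usage: two laced frames, with lens values that change between them.
group = BlockGroup(
    frames=[b"frame0", b"frame1"],
    metadata=[
        Metadata("IRIS", 0, string="2.8"),
        Metadata("IRIS", 1, string="4.0"),
        Metadata("ZOOM", 1, binary=b"\x00\x23"),
    ],
)
second_frame_keys = [m.name for m in group.metadata_for_frame(1)]
```

Keeping the lace number inside each entry, rather than nesting entries under a per-frame master, is what keeps the structure shallow in this sketch.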
@mbunkus I used your notes to start a draft in #276
I'm wondering whether, rather than adding a new Master Element, BlockAdditions should be extended for this type of data.
I add my praise to this feature request. Is work on this currently stalled?
Thank you @ceztko. Yes, it's apparently stalled, but I'm still very interested in it and it's been on my mind to complete this part. Do you mind sharing what interests you about the request or how you would envision using such a feature?
> Do you mind sharing what interests you about the request
My use case would be re-encoding a WebRTC video stream and adding an absolute NTP capture timestamp to each frame. NTP timestamps differ from presentation timestamps in that they may (and will) have a different time base and a different starting value, and they can be used to synchronize with external, stream-unrelated events.
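As an illustration of what such a per-frame value could look like, the sketch below packs a capture time into the 64-bit NTP timestamp format (32-bit seconds since 1900-01-01 followed by a 32-bit binary fraction, big-endian). The function names are hypothetical, and nothing in this thread specifies that the timestamp would be stored in this exact form; the decoder also assumes NTP era 0, i.e. dates before the 2036 rollover.

```python
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_EPOCH_DELTA = 2_208_988_800
NANOS_PER_SECOND = 1_000_000_000


def unix_ns_to_ntp64(unix_ns: int) -> bytes:
    """Pack Unix time in nanoseconds into an 8-byte NTP timestamp."""
    seconds, nanos = divmod(unix_ns, NANOS_PER_SECOND)
    ntp_seconds = (seconds + NTP_UNIX_EPOCH_DELTA) & 0xFFFFFFFF  # wraps per era
    fraction = (nanos << 32) // NANOS_PER_SECOND  # units of 2**-32 seconds
    return struct.pack(">II", ntp_seconds, fraction)


def ntp64_to_unix_ns(data: bytes) -> int:
    """Unpack an 8-byte NTP timestamp, assuming NTP era 0."""
    ntp_seconds, fraction = struct.unpack(">II", data)
    # Round up so that values produced by unix_ns_to_ntp64 round-trip exactly.
    nanos = (fraction * NANOS_PER_SECOND + (1 << 32) - 1) >> 32
    return (ntp_seconds - NTP_UNIX_EPOCH_DELTA) * NANOS_PER_SECOND + nanos


# Usage: the 8 bytes could be carried as an opaque binary value per frame.
capture_ns = 1_700_000_000_123_456_789
payload = unix_ns_to_ntp64(capture_ns)
restored_ns = ntp64_to_unix_ns(payload)
```

Because the payload is absolute wall-clock time, it stays meaningful even if a remuxer rewrites the track's presentation timestamps.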
> how you would envision using such a feature?
The proposal from @mbunkus looks good to me, but I would simplify it even more by having just a MetadataValue with cardinality 1 (and not one of MetadataString or MetadataBinary). This would simplify life for encoder library developers and remove possible aliasing in the actual implementation: being fully opaque, the data can be unambiguously read and set. For example, ffmpeg currently exposes a (mostly unused) per-frame metadata interface in the AVFrame[1] in the form of a dictionary of AVDictionaryEntry[2] entries that handle only strings. This interface can easily be fixed to support both strings and arbitrary content by adding a size field to the entry, and that would perfectly fit having just the single 'MetadataValue' entry.
[1] https://github.com/FFmpeg/FFmpeg/blob/ab4795a085cd3deacb5e4bbaf527d66171361024/libavutil/frame.h#L581
[2] https://github.com/FFmpeg/FFmpeg/blob/ab4795a085cd3deacb5e4bbaf527d66171361024/libavutil/dict.h#L81
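A rough sketch of the single opaque value idea, assuming a hypothetical `MetadataEntry` type (this is not FFmpeg's API, and the key names are made up). The value is always bytes with an explicit length, so text is just one way of interpreting it:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataEntry:
    """Hypothetical entry with one opaque value instead of string-or-binary."""
    name: str
    value: bytes  # opaque payload; len(value) plays the role of the size field

    @classmethod
    def from_text(cls, name: str, text: str) -> "MetadataEntry":
        """Store text as UTF-8 bytes; the container does not care what it is."""
        return cls(name, text.encode("utf-8"))

    def as_text(self) -> str:
        """Interpret the payload as UTF-8 (raises UnicodeDecodeError if it is not)."""
        return self.value.decode("utf-8")


# Usage: text and binary payloads go through the same single field.
lens = MetadataEntry.from_text("LENS", "Cooke S4")
raw = MetadataEntry("RAW", b"\x00\x01\x00")  # embedded NUL bytes are preserved
```

With an explicit length, a payload containing zero bytes survives intact, which a NUL-terminated string value cannot guarantee.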
Moving to the codec spec as #287 added the parts needed in the main spec. And #353 is for the codec document.
I suggest a method to transit/store frame side metadata in Matroska (arbitrary or pre-defined). There is a similar mechanism defined in nut (see the 'sm_data / side_data / meta_data' section of http://ffmpeg.org/~michael/nut.txt). I looked at BlockAdditional but it doesn't seem to provide enough structure for this purpose.