Closed fcartegnie closed 1 year ago
The group is finally considering this issue. Sorry for the delay. We will sync with the video group to determine if the sentence you quote is a hard requirement of the OBU order.
Note that the current AV1-ISOBMFF spec says:
The configOBUs field contains zero or more OBUs. Any OBU may be present provided that the following procedures produce compliant AV1 bitstreams: From any sync sample, an AV1 bitstream is formed by first outputting the OBUs contained in the AV1CodecConfigurationBox and then by outputting all OBUs in the samples themselves, in order, starting from the sync sample.
So if we consider an example of configOBUs only containing metadata (no SH), the concatenation procedure would have to be changed.
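To make the concatenation procedure concrete, here is a minimal sketch. It models OBUs as plain string labels ("METADATA", "SH", "FRAME") rather than parsing real OBU bytes, and `form_bitstream` is a hypothetical helper name, not something defined by either specification.

```python
from typing import List


def form_bitstream(config_obus: List[str],
                   samples: List[List[str]],
                   sync_index: int) -> List[str]:
    """Output the configOBUs first, then every OBU of every sample,
    in order, starting from the sync sample at sync_index."""
    out = list(config_obus)
    for sample in samples[sync_index:]:
        out.extend(sample)
    return out


# Hypothetical track: the sync sample starts with a sequence header (SH).
samples = [["SH", "FRAME"], ["FRAME"]]

# configOBUs containing only metadata, no SH.
stream = form_bitstream(["METADATA"], samples, 0)
print(stream)  # ['METADATA', 'SH', 'FRAME', 'FRAME']
```

With configOBUs holding only metadata, the metadata OBU ends up ahead of the sequence header in the resulting stream, which is the case in question.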
In a following paragraph the current specification goes on to say:
One or more metadata and padding OBUs may appear in any order within an OBU sequence (unless constrained by semantics provided elsewhere in this specification). Specific metadata types may be required or recommended to be placed in specific locations, as identified in their corresponding definitions.
The normative decoder doesn't impose any ordering restrictions.
At least the current av1-hdr10plus draft violates that ordering: https://aomediacodec.github.io/av1-hdr10plus/
In particular, it places the metadata OBU before the first shown frame, but not before all frames in the temporal unit.
"At least the current av1-hdr10plus draft violates that ordering"
Can you elaborate? av1-hdr10plus seems compliant to both the text in 7.5 and the text that @jzern quoted.
It violates the order given in the first sentence in 7.5 (if it's interpreted strictly), but not jzern's following text. In particular, metadata OBUs are placed after frame header OBUs in some cases, e.g. TU1 in https://aomediacodec.github.io/av1-hdr10plus/obu_tu.png
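As a rough illustration of what the strict reading would flag, the sketch below reports metadata OBUs that follow the first frame header within a temporal unit. The labels and the example TU layout are made up for illustration and are not copied from the figure. The function name is hypothetical as well.

```python
from typing import List


def metadata_after_frame_header(tu: List[str]) -> List[int]:
    """Return the indices of metadata OBUs that appear after the first
    frame header in a temporal unit (a strict-ordering violation)."""
    seen_frame_header = False
    hits = []
    for i, kind in enumerate(tu):
        if kind == "FRAME_HEADER":
            seen_frame_header = True
        elif kind == "METADATA" and seen_frame_header:
            hits.append(i)
    return hits


# Illustrative TU: metadata sits before the second (shown) frame,
# but after the first frame header of the temporal unit.
tu = ["TD", "FRAME_HEADER", "TILE_GROUP",
      "METADATA", "FRAME_HEADER", "TILE_GROUP"]
print(metadata_after_frame_header(tu))  # [3]
```

Under the looser reading, where metadata may appear anywhere in the OBU sequence, the same TU would not be flagged at all.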
As discussed in the group during the call, I just wanted to clarify a use case. Consider a muxer that detects that a Metadata OBU is common to all frames in a track and decides to store it out-of-band, i.e. in the configOBUs field. Per the current spec, this is possible given:
The configOBUs field contains zero or more OBUs. Any OBU may be present provided that the following procedures produce compliant AV1 bitstreams:
- From any sync sample, an AV1 bitstream is formed by first outputting the OBUs contained in the AV1CodecConfigurationBox and then by outputting all OBUs in the samples themselves, in order, starting from the sync sample.
Because a sync sample starts with a SH OBU, this would mean that the decoder would be fed: Metadata OBU, SH OBU, ... Hence the need to state whether this is possible.
Note that if the configOBUs field also contained a SH OBU, the decoder would receive: SH OBU, Metadata OBU, SH OBU, ... In this latter case, presumably the second SH OBU is a redundant copy of the first one.
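The two cases can be compared side by side with a small sketch. OBUs are modelled as string labels only, and `obus_before_first_sh` is a hypothetical helper. Nothing here parses real OBU data or checks that the two SH OBUs are actually identical.

```python
from typing import List


def obus_before_first_sh(stream: List[str]) -> List[str]:
    """Return the OBU labels that precede the first sequence header (SH).
    If there is no SH at all, every OBU counts as preceding it."""
    if "SH" not in stream:
        return list(stream)
    return stream[:stream.index("SH")]


sync_sample = ["SH", "FRAME"]

# Case 1: configOBUs holds only the shared Metadata OBU.
case_1 = ["METADATA"] + sync_sample
# Case 2: configOBUs holds a SH OBU followed by the Metadata OBU.
case_2 = ["SH", "METADATA"] + sync_sample

print(obus_before_first_sh(case_1))  # ['METADATA']
print(obus_before_first_sh(case_2))  # []
print(case_2.count("SH"))            # 2
```

Only the first case feeds the decoder a Metadata OBU before any sequence header. The second case avoids that at the cost of a repeated SH.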
Based on our understanding, we agree that the video specification could be less ambiguous, but the intent is to allow metadata anywhere (e.g. at the sequence level or at the frame level). Therefore this AV1-ISOBMFF specification is not in conflict with the AV1 video specification. Reopen the issue if you disagree.
2.4
Some metadata is non-global and related to each frame (many if sync point with non-visible frames).
AV1 7.5
There's no schema like with MPEG codecs, but to me that enumeration is an OBU order. No metadata OBU can then come before the sequence header.