Mandating the presence of obu_size (as required by the "low overhead bitstream" format)
Mandating the presence of OBU_TEMPORAL_DELIMITER and OBU_REDUNDANT_FRAME_HEADER
The goal is to provide:
The least difference with the bitstream expected/provided by other AV1 handlers (software or hardware)
While still providing easy access through/across OBUs (i.e. framing)
Specificities of AV1 bitstream
Unlike most video codec bitstreams, the AV1 specification has provisioned a flag (obu_has_size_field) in OBU headers and a variable-length field (obu_size) to be able to specify the size of the OBU payload.
This allows elements and hardware that process an AV1 bitstream to easily skip/scan through OBUs without requiring any other form of packing, provided the container format specifies where one OBU begins.
This "size" feature is not present in any other major video codec bitstream, which explains why they resort to a "startcode-based" system and provision their bitstream to support it (by having "emulation-prevention" bytes within the bitstream).
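As a minimal sketch of the flag described above, the first byte of an OBU header can be decoded as follows. The bit layout follows the AV1 specification's obu_header() syntax; the names `ObuHeader` and `parse_obu_header_byte` are hypothetical and only used for illustration.

```python
from typing import NamedTuple


class ObuHeader(NamedTuple):
    obu_type: int
    has_extension: bool
    has_size_field: bool


def parse_obu_header_byte(first_byte: int) -> ObuHeader:
    """Decode the first OBU header byte (hypothetical helper).

    Bit layout, most significant bit first:
    obu_forbidden_bit(1) | obu_type(4) | obu_extension_flag(1)
    | obu_has_size_field(1) | obu_reserved_1bit(1)
    """
    if first_byte & 0x80:
        raise ValueError("obu_forbidden_bit must be 0")
    return ObuHeader(
        obu_type=(first_byte >> 3) & 0x0F,
        has_extension=bool(first_byte & 0x04),
        has_size_field=bool(first_byte & 0x02),
    )
```

For example, the byte 0x12 decodes to obu_type 2 (OBU_TEMPORAL_DELIMITER) with obu_has_size_field set, so an obu_size field follows the header.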
Lower overhead
The obu_size feature of AV1 bitstream provides a more compact bitstream than the "startcode-based" proposal:
Specifying the presence of an obu_size has no cost (flag included in header)
The obu_size field, being a leb128, uses less space on average than the mandatory 4 bytes of a startcode
1 byte for payloads of up to 2^7 - 1 (127) bytes, enough for Temporal Delimiter, Frame Header and small metadata
2 bytes for payloads of just under 2^14 bytes (16 KiB)
3 bytes for payloads of just under 2^21 bytes (2 MiB)
4 bytes (equivalent to the 4 bytes of a startcode) for payloads of just under 2^28 bytes (256 MiB)
More seems unlikely for now
The "startcode-based" format also requires modifying the bitstream to insert emulation-prevention bytes where needed, further increasing the payload size.
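The size thresholds listed above follow directly from the leb128 encoding, which stores 7 bits of value per byte. A minimal sketch (the function name `encode_leb128` is an assumption, not taken from any particular library):

```python
def encode_leb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128 (7 value bits per byte)."""
    if value < 0:
        raise ValueError("leb128 only encodes non-negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            # More bytes follow: set the continuation bit.
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


# Number of bytes needed at each boundary of the list above.
for size in (127, 128, 2**14 - 1, 2**14, 2**21 - 1, 2**21, 2**28 - 1):
    print(size, len(encode_leb128(size)))
```

A payload of 127 bytes still fits in a single size byte, while 128 bytes already needs two.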
Not scanning whole bitstream
Because the OBU size is specified in the bitstream, a reader can seek or skip directly over OBUs instead of scanning for a startcode.
The only requirement is that the container specifies where a single OBU starts, which is easily done by mandating that the AV1 PES payload starts with an OBU header.
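To illustrate the skipping described above, here is a hedged sketch that walks a buffer assumed to start at an OBU header and to contain only OBUs with obu_has_size_field set (the "low overhead bitstream" case). The helper names `decode_leb128` and `iter_obus` are hypothetical.

```python
from typing import Iterator, Tuple


def decode_leb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, bytes_consumed) for the leb128 starting at data[pos]."""
    value = 0
    for i in range(8):  # the AV1 specification reads at most 8 bytes
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("leb128 longer than 8 bytes")


def iter_obus(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (obu_type, payload) pairs, jumping from one OBU to the next."""
    pos = 0
    while pos < len(data):
        header = data[pos]
        obu_type = (header >> 3) & 0x0F
        has_extension = bool(header & 0x04)
        if not header & 0x02:
            raise ValueError("obu_has_size_field is 0; cannot skip this OBU")
        pos += 2 if has_extension else 1  # header plus optional extension byte
        size, used = decode_leb128(data, pos)
        pos += used
        if pos + size > len(data):
            raise ValueError("OBU payload runs past the end of the buffer")
        yield obu_type, data[pos:pos + size]
        pos += size  # direct jump, no byte-by-byte scan
```

No payload byte is inspected while moving between OBUs; only the header and the size field are read.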
Compatibility with existing hardware and software
Existing hardware and software that are fully compliant with the AV1 specification would require extra processing in order to be compatible with the proposed "start-code" format:
Full parsing of the bitstream in order to insert/remove the emulation-prevention bytes
Re-computation of the presence of OBU_TEMPORAL_DELIMITER and OBU_REDUNDANT_FRAME_HEADER
Note: The current av1-mpegts specification doesn't specify when/how they should be removed or re-inserted.
While less complex than the emulation-byte handling, the proposed "startcode-based" framing does not mandate the presence of obu_size (which the standard "Low Overhead bitstream format" mandates). This would also require re-computing the OBU header to re-insert (or remove) that mandatory obu_size.
Using the "Low overhead bitstream format" from the base specification avoids this complexity overhead and avoids potential issues/pitfalls when transforming the bitstream.
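As a rough illustration of the header re-computation mentioned above, the sketch below rewrites a single OBU that lacks obu_size into its "low overhead" form. It assumes the caller already knows the exact extent of the OBU from some external framing; the names `encode_leb128` and `add_obu_size` are hypothetical.

```python
def encode_leb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def add_obu_size(obu: bytes) -> bytes:
    """Return the OBU with obu_has_size_field set and obu_size inserted.

    `obu` must hold exactly one complete OBU (header plus payload).
    """
    header = obu[0]
    if header & 0x02:
        return obu  # already carries obu_size, nothing to do
    header_len = 2 if header & 0x04 else 1  # extension byte present?
    payload = obu[header_len:]
    new_header = bytes([header | 0x02]) + obu[1:header_len]
    return new_header + encode_leb128(len(payload)) + payload
```

Even this simple case changes the header byte and shifts the payload, which is the kind of transformation the "Low overhead bitstream format" makes unnecessary.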
Informational: Why the Annex B "Length-delimited Bitstream Format" is not suitable
While tempting and slightly less complex, the Annex B formatting requires handling at the "Temporal Unit" level, which is not compatible with the proposed Access Unit PES framing, which is at the "Frame Unit" level.
Creating such a bitstream would require accumulating the various "Frame Units" of a "Temporal Unit" in order to compute the temporal_unit_size, introducing excessive latency.
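The accumulation requirement can be seen in a small sketch of Annex B packing, where each level (temporal unit, frame unit, OBU) is prefixed with a leb128 length. The function names `encode_leb128` and `pack_temporal_unit` are hypothetical, and the OBU bytes in the example are placeholders rather than valid AV1 data.

```python
from typing import List


def encode_leb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def pack_temporal_unit(frame_units: List[List[bytes]]) -> bytes:
    """Pack frame units (each a list of OBUs) into one Annex B temporal unit."""
    body = bytearray()
    for obus in frame_units:
        frame_body = bytearray()
        for obu in obus:
            frame_body += encode_leb128(len(obu)) + obu  # obu_length + OBU
        body += encode_leb128(len(frame_body)) + frame_body  # frame_unit_size
    # temporal_unit_size comes first in the output, but it can only be
    # computed once every frame unit of the temporal unit is available.
    return encode_leb128(len(body)) + bytes(body)
```

The first byte to be emitted depends on the last frame unit to arrive, so nothing can be sent until the whole temporal unit has been buffered.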
A "start-code based" framing, and other requirements, was introduced by MR #5.
This RFC is to discuss:
Mandating the presence of obu_size (as required by the "low overhead bitstream" format)
Mandating the presence of OBU_TEMPORAL_DELIMITER and OBU_REDUNDANT_FRAME_HEADER