AOMediaCodec / av1-mpeg2-ts

Official specification of the AOM group for the carriage of AV1 in MPEG-2 Transport Stream
https://aomediacodec.github.io/av1-mpeg2-ts/

RFC: Switch from start-code framing to Low overhead bitstream format #43

Open bilboed opened 1 year ago

bilboed commented 1 year ago

A "start-code based" framing, and other requirements, was introduced by MR #5

This RFC is to discuss:

The goal is to provide:

Specificities of AV1 bitstream

Unlike most video codec specifications, the AV1 specification provides a flag (obu_has_size_field) in the OBU header and a variable-length field (obu_size) that together specify the size of the OBU payload.

This allows elements and hardware that process AV1 bitstreams to easily skip or scan through OBUs without requiring any other form of packing, provided the container format specifies where one OBU begins.
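As an illustration of the two fields described above, here is a minimal Python sketch that reads an OBU header and its optional obu_size. The bit positions and the leb128 encoding follow the AV1 bitstream specification; the function names and the returned dictionary layout are hypothetical and exist only for this example.

```python
def read_leb128(buf, pos=0):
    """Decode an unsigned leb128 value; return (value, bytes_consumed)."""
    value = 0
    for i in range(8):  # the AV1 specification caps leb128 at 8 bytes
        byte = buf[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:  # top bit clear marks the last byte
            return value, i + 1
    raise ValueError("leb128 value longer than 8 bytes")


def parse_obu_header(buf):
    """Parse the OBU header at buf[0]; assumes buf starts on an OBU boundary."""
    first = buf[0]
    if first & 0x80:
        raise ValueError("obu_forbidden_bit is set")
    has_extension = bool(first & 0x04)   # obu_extension_flag
    has_size_field = bool(first & 0x02)  # obu_has_size_field
    header_len = 2 if has_extension else 1
    payload_size = None
    if has_size_field:
        payload_size, size_len = read_leb128(buf, header_len)
        header_len += size_len
    return {
        "obu_type": (first >> 3) & 0x0F,
        "obu_has_size_field": has_size_field,
        "header_len": header_len,      # header + extension + obu_size bytes
        "payload_size": payload_size,  # None when no size field is present
    }
```

For example, `parse_obu_header(bytes([0x12, 0x00]))` describes a temporal delimiter (obu_type 2) with a size field and an empty payload.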

This "size" feature is not present in any other major video codec bitstream, which explains why those codecs resort to a "startcode-based" system and design their bitstreams to support it (by inserting "emulation-prevention" bytes into the bitstream).

Lower overhead

The obu_size feature of AV1 bitstream provides a more compact bitstream than the "startcode-based" proposal:

The "startcode-based" format also requires modifying the bitstream to insert emulation-prevention bytes where needed, further increasing the payload size.
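To make the cost of emulation prevention concrete, here is a small Python sketch. The exact escaping rule of the proposed format is not quoted in this issue, so the sketch assumes the H.264/HEVC-style rule: after two consecutive 0x00 bytes, a 0x03 byte is inserted before any byte whose value is 0x03 or lower. The function name is hypothetical.

```python
def insert_emulation_prevention(payload):
    """Escape a payload so it never contains 00 00 00..03 (assumed rule)."""
    out = bytearray()
    zeros = 0  # number of consecutive 0x00 bytes just written
    for byte in payload:
        if zeros >= 2 and byte <= 0x03:
            out.append(0x03)  # emulation-prevention byte
            zeros = 0
        out.append(byte)
        zeros = zeros + 1 if byte == 0 else 0
    return bytes(out)


escaped = insert_emulation_prevention(b"\x00\x00\x01\x42")
print(len(escaped) - 4)  # number of extra bytes added by escaping
```

Every byte of the payload has to be inspected, and the output can be longer than the input, which is the overhead the paragraph above refers to.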

Not scanning whole bitstream

Because the OBU size is specified in the bitstream, a parser can seek or skip directly over OBUs instead of scanning for a start code.

The only requirement for this is that the container specifies where a single OBU starts, which is easily done by mandating that the AV1 PES payload starts with an OBU header.
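A minimal sketch of this skipping behaviour, assuming a payload (for instance a PES payload, as suggested above) that starts on an OBU header and in which every OBU has obu_has_size_field set. The generator name and the tuple it yields are hypothetical.

```python
def iter_obus(payload):
    """Yield (obu_type, start, end) for each OBU, without scanning payload bytes."""
    pos = 0
    while pos < len(payload):
        header = payload[pos]
        if not header & 0x02:
            raise ValueError("obu_has_size_field must be 1 to skip by size")
        # Skip the one-byte header, plus the extension byte when present.
        size_pos = pos + (2 if header & 0x04 else 1)
        size, shift = 0, 0
        while True:  # inline leb128 decode of obu_size
            byte = payload[size_pos]
            size_pos += 1
            size |= (byte & 0x7F) << shift
            shift += 7
            if not byte & 0x80:
                break
        end = size_pos + size
        if end > len(payload):
            raise ValueError("OBU runs past the end of the payload")
        yield (header >> 3) & 0x0F, pos, end
        pos = end  # jump straight to the next OBU header


# A temporal delimiter followed by a 3-byte frame OBU (dummy payload bytes).
data = bytes([0x12, 0x00, 0x32, 0x03, 0xAA, 0xBB, 0xCC])
print(list(iter_obus(data)))
```

Only the header and size bytes of each OBU are read; the payload bytes themselves are never inspected.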

Compatibility with existing hardware and software

Existing hardware and software that are fully compliant with the AV1 specification would require extra processing in order to be compatible with the proposed "start-code" format:

While less complex than the emulation-byte handling, the proposed "startcode-based" framing does not mandate the presence of obu_size (which the standard "Low Overhead bitstream format" mandates). This would also require re-computing the OBU header to re-insert (or remove) that mandatory obu_size.

Using the "Low overhead bitstream format" from the base specification avoids this complexity overhead and avoids potential issues/pitfalls when transforming the bitstream.
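The header rewrite mentioned above can be sketched as follows: given one complete OBU that carries no obu_size (its length being known only from external framing), set obu_has_size_field and insert the leb128-encoded payload size. The helper names are hypothetical; the bit positions follow the AV1 bitstream specification.

```python
def encode_leb128(value):
    """Encode a non-negative integer as unsigned leb128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def add_size_field(obu):
    """Return the OBU with obu_has_size_field=1 and an obu_size inserted."""
    first = obu[0]
    if first & 0x02:
        return obu  # already carries a size field
    header_len = 2 if first & 0x04 else 1  # extension byte, if any
    payload = obu[header_len:]
    header = bytes([first | 0x02]) + obu[1:header_len]
    return header + encode_leb128(len(payload)) + payload


print(add_size_field(bytes([0x30, 0xAA, 0xBB])).hex())
```

The inverse operation (removing obu_size) needs the same header parsing, so a muxer or demuxer converting between the two framings has to touch every OBU header.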

Informational: Why the Annex B "Length-delimited Bitstream Format" is not suitable

While tempting and slightly less complex, the Annex B formatting requires handling at the "Temporal Unit" level, which is not compatible with the proposed Access Unit PES framing, which operates at the "Frame Unit" level.

Creating such a bitstream would require accumulating the various "Frame Units" of a "Temporal Unit" in order to compute temporal_unit_size, which introduces excessive latency.
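The latency problem can be seen in a minimal sketch of Annex B packing. It follows the nesting described in Annex B of the AV1 specification (temporal_unit_size, then frame_unit_size, then obu_length, each leb128-encoded); the function names and the list-of-lists input shape are hypothetical.

```python
def encode_leb128(value):
    """Encode a non-negative integer as unsigned leb128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def pack_temporal_unit(frame_units):
    """frame_units: list of frame units, each a list of complete OBUs (bytes)."""
    body = bytearray()
    for obus in frame_units:
        frame_body = bytearray()
        for obu in obus:
            frame_body += encode_leb128(len(obu)) + obu       # obu_length
        body += encode_leb128(len(frame_body)) + frame_body   # frame_unit_size
    # temporal_unit_size comes first in the output, but it is only known
    # once every frame unit of the temporal unit has been collected.
    return encode_leb128(len(body)) + bytes(body)


# One temporal delimiter OBU, then one frame OBU with a dummy payload byte.
packed = pack_temporal_unit([[b"\x10"], [b"\x30\xAA"]])
print(packed.hex())
```

Because the first byte written depends on the total size of all frame units, nothing can be emitted until the last frame unit of the temporal unit is available.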