AOMediaCodec / av1-isobmff

Official specification of the AOM group for the carriage of AV1 in ISOBMFF
https://AOMediaCodec.github.io/av1-isobmff

AV1 Encryption of large samples #193

Open cconcolato opened 6 months ago

cconcolato commented 6 months ago

The AV1-ISOBMFF specification indicates that each tile in an AV1 frame must be encrypted separately. When a sample contains many frames and each frame uses many tiles, this can lead to a large number of encrypted ranges and describing this with the saiz box may not be possible.

Per Common Encryption (3rd edition), the syntax of auxiliary information is:

aligned(8) class CencSampleAuxiliaryDataFormat
{
    unsigned int(Per_Sample_IV_Size*8) InitializationVector;
    if (sample_info_size > Per_Sample_IV_Size)
    {
        unsigned int(16) subsample_count;
        {
            unsigned int(16) BytesOfClearData;
            unsigned int(32) BytesOfProtectedData;
        } [subsample_count]
    }
}

So each tile costs 6 bytes: 2 bytes for BytesOfClearData plus 4 bytes for BytesOfProtectedData.
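To make the per-tile cost concrete, here is a minimal Python sketch that serializes one auxiliary information entry following the syntax above. The function name `pack_cenc_aux_info` is hypothetical and used only for illustration; the fields are assumed to be big-endian, as is usual in ISOBMFF.

```python
import struct

def pack_cenc_aux_info(iv, subsamples):
    """Serialize one CencSampleAuxiliaryDataFormat entry (illustrative helper).

    iv         -- per-sample initialization vector as bytes (may be empty)
    subsamples -- list of (BytesOfClearData, BytesOfProtectedData) pairs
    """
    out = bytearray(iv)
    if subsamples:
        # subsample_count is a 16-bit field
        out += struct.pack(">H", len(subsamples))
        for clear, protected in subsamples:
            # 16-bit clear count + 32-bit protected count = 6 bytes per entry
            out += struct.pack(">HI", clear, protected)
    return bytes(out)
```

With an empty IV, one subsample produces 8 bytes (2 for the count, 6 for the entry), and every additional subsample adds another 6.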

The saiz box syntax is as follows. Note that sample_info_size is coded on 8 bits, i.e. the auxiliary information for a sample cannot exceed 255 bytes.

aligned(8) class SampleAuxiliaryInformationSizesBox
    extends FullBox('saiz', version = 0, flags)
{
    if (flags & 1) {
        unsigned int(32) aux_info_type;
        unsigned int(32) aux_info_type_parameter;
    }
    unsigned int(8) default_sample_info_size;
    unsigned int(32) sample_count;
    if (default_sample_info_size == 0) {
        unsigned int(8) sample_info_size[ sample_count ];
    }
}
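For readers who want to see the 8-bit limit in a parser, the following is a rough Python sketch of reading a version 0 saiz box according to the syntax above. The function name `parse_saiz_payload` and its input convention (the bytes that follow the box size and type) are assumptions made for this example; it is not taken from any existing implementation.

```python
import struct

def parse_saiz_payload(payload):
    """Parse a saiz FullBox body (the bytes after the size and 'saiz' type).

    Returns a list with one auxiliary-information size per sample,
    or None when the box version is not 0.
    """
    version = payload[0]
    flags = int.from_bytes(payload[1:4], "big")
    if version != 0:
        return None  # unknown version: the caller should skip the box
    pos = 4
    if flags & 1:
        pos += 8  # skip aux_info_type and aux_info_type_parameter
    default_size = payload[pos]
    (sample_count,) = struct.unpack_from(">I", payload, pos + 1)
    pos += 5
    if default_size != 0:
        return [default_size] * sample_count
    # Each entry is a single byte, so no size can exceed 255.
    return list(payload[pos:pos + sample_count])
```

Because every entry is read as one byte, a sample whose auxiliary information is larger than 255 bytes simply has no valid representation in this box.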

For an MP4 sample with 6 AV1 frames, each with 8 tiles, the sample will need 48 encrypted ranges, and the size of the auxiliary information will be 48*6+2 = 290 bytes (before counting any per-sample IV), which is greater than the maximum of 255 that sample_info_size can indicate.
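The arithmetic can be checked with a small Python sketch. The helper names `aux_info_size` and `max_subsamples` are made up for this illustration; the formula is just the field widths from the CencSampleAuxiliaryDataFormat syntax quoted earlier.

```python
SAIZ_V0_MAX = 255  # largest value an 8-bit sample_info_size can hold

def aux_info_size(num_subsamples, iv_size=0):
    """Size in bytes of one sample's auxiliary information."""
    if num_subsamples == 0:
        return iv_size  # no subsample_count field when there are no subsamples
    return iv_size + 2 + 6 * num_subsamples

def max_subsamples(iv_size=0):
    """Largest subsample count whose auxiliary information fits in saiz v0."""
    return (SAIZ_V0_MAX - iv_size - 2) // 6

print(aux_info_size(48))        # 290, the example above (no per-sample IV)
print(aux_info_size(48, 8))     # 298 with an 8-byte per-sample IV
print(max_subsamples(0))        # 42
print(max_subsamples(8))        # 40
print(max_subsamples(16))       # 39
```

Under these assumptions, a sample can carry at most around 40 encrypted ranges before saiz v0 can no longer describe it.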

This was raised at MPEG with this contribution https://dms.mpeg.expert/doc_end_user/documents/142_Antalya/wg11/m62549-v1-m62549_isobmff_saiz16bit_extension.zip (MPEG Member only). The proposal at MPEG is to define a version 1 of saiz in which sample_info_size uses 16 or 32 bits.

An alternative proposal could be to stop using saiz/saio and only rely on senc which does not exhibit this problem.

Both changes seem to be breaking changes. It needs to be decided what is preferable.

cconcolato commented 5 months ago

We are wondering if we could allow encrypting the tile header (not the frame header).

cconcolato commented 5 months ago

@kqyang would you have an opinion on this issue?

kqyang commented 4 months ago

I prefer relying on senc since it is simpler and it is already supported on some clients, e.g. Chromium ignores saiz if senc is available.

On the other hand, v1 saiz with 16 or 32 bits aux size may break Chromium as it assumes the 8-bit aux info size right now (link).

Clients that rely on saiz need to be updated to use senc when it is available. (These clients need to be updated regardless.)

In summary, v1 saiz is a breaking change for all clients while relying on senc is only a breaking change for clients that do not do it yet.

Assuming that this is only a problem with the AV1 class of codecs, I think we can update the AVx encryption spec to make senc mandatory when the media is encrypted.

@joeyparrish FYI.

wantehchang commented 4 months ago

> On the other hand, v1 saiz with 16 or 32 bits aux size may break Chromium as it assumes the 8-bit aux info size right now (link).

@kqyang Should we change Chromium to ignore the SampleAuxiliaryInformationSizesBox if it is not version 0?

ISO BMFF (ISO/IEC 14496-12:2022) says in Section 4.2.2:

FullBoxes with an unrecognized version shall be ignored and skipped.

kqyang commented 4 months ago

> Should we change Chromium to ignore the SampleAuxiliaryInformationSizesBox if it is not version 0?

Yes, for spec compliance, but it may not give you the expected results.

Chromium relies on senc if it is available. However, it still parses all the known mp4 boxes and we don't want the parsing to fail.

The correct handling in Chromium would be to parse senc first and only parse saiz if senc is not available. That will allow Chromium to work even if the mp4 file contains a v1 saiz.
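The ordering described here (senc first, saiz only as a fallback, unknown versions skipped) could look roughly like the Python sketch below. This is not Chromium's code: the function `select_encryption_source` and the dict-of-boxes representation are hypothetical and exist only to show the decision order.

```python
def select_encryption_source(boxes):
    """Pick where a reader takes per-sample encryption info from.

    boxes -- dict mapping a box type (e.g. "senc") to a (version, payload)
             tuple; a hypothetical representation for this example.
    Returns "senc", "saiz/saio", or None if nothing usable is present.
    """
    # Prefer senc whenever it exists; saiz is then never inspected,
    # so a v1 saiz in the same file cannot cause a parse failure.
    if "senc" in boxes:
        return "senc"
    saiz = boxes.get("saiz")
    saio = boxes.get("saio")
    # Fall back to saiz/saio only when the saiz version is understood.
    if saiz is not None and saio is not None and saiz[0] == 0:
        return "saiz/saio"
    # Unknown saiz version (or missing boxes): skip rather than fail.
    return None
```

With this ordering, a file that carries both senc and a v1 saiz is still readable, while a file with only a v1 saiz yields no encryption metadata instead of a parsing error.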

cconcolato commented 4 months ago

A Pull Request was merged to the AV1-ISOBMFF specification to inform of the problem (see https://github.com/AOMediaCodec/av1-isobmff/pull/195/files).

At the moment we can think of the following courses of action:

  1. The group could add a normative reference that says:

    When encrypting a video stream that exceeds the encryption limit of saiz v0, the file writer SHALL omit both saiz and saio and readers SHALL support only the presence of senc.

  2. The group waits for MPEG to finalize its new version of ISOBMFF, which will define either saiz v1 (16 or 32 bits, TBD) or a new saz2 box, and then updates the specification to mandate support for the new mechanism.
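On the writer side, option 1 could be sketched as follows in Python. The function name `choose_aux_boxes` is hypothetical, and the behaviour below the limit (emitting senc together with saiz and saio) is an assumption of this example, since the proposed text only covers the case where the limit is exceeded.

```python
SAIZ_V0_MAX = 255  # 8-bit sample_info_size limit of saiz v0

def choose_aux_boxes(subsample_counts, iv_size=0):
    """Decide which boxes a writer following option 1 would emit.

    subsample_counts -- number of encrypted ranges for each sample
    iv_size          -- per-sample IV size in bytes (0 if a constant IV is used)
    """
    sizes = [
        iv_size + (2 + 6 * n if n else 0)  # IV + subsample_count + 6 bytes per range
        for n in subsample_counts
    ]
    if any(size > SAIZ_V0_MAX for size in sizes):
        # At least one sample cannot be described by saiz v0:
        # omit both saiz and saio and rely on senc alone.
        return {"senc"}
    # Assumption: below the limit the writer keeps all three boxes.
    return {"senc", "saiz", "saio"}
```

For example, `choose_aux_boxes([8, 48])` returns only `{"senc"}`, because the 48-range sample needs 290 bytes of auxiliary information.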