w3c / encrypted-media

Encrypted Media Extensions
https://w3c.github.io/encrypted-media/

"cenc" Initialization Data Format: 'pssh' box selection #440

Closed tinskip closed 1 year ago

tinskip commented 6 years ago

Section 1 of the "cenc" Initialization Data Format specification states that: "The format is one or more concatenated Protection System Specific Header ('pssh') boxes [CENC], each for a unique SystemID. One of the concatenated 'pssh' boxes should use the Common SystemID and PSSH Box Format."

Would it not be simpler for the application to just grab all the PSSH boxes and concatenate them into init_data without looking at their contents? What if there is more than one box for a specific system ID? If there are multiple PSSH boxes with the same SystemID, how should the application pick one? It would seem that this requirement adds complexity without value.

There are conditions under which multiple PSSH boxes for the same DRM system may be appropriate, for example when the boxes carry different versions of the data, each understood by a different version of the CDM and each potentially arbitrarily large. The CDM would select the box that matches its version or feature set and use it when requesting a license. The only alternative would be to assign multiple SystemIDs to the same DRM system, which is overkill.
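
To make the format concrete, here is a minimal sketch of walking concatenated 'pssh' boxes and grouping them by SystemID, which is the inspection an application would need in order to enforce "each for a unique SystemID". All helper names (`buildPsshV0`, `splitPsshBoxes`, `groupBySystemId`) and both SystemIDs are made up for illustration. It assumes the [CENC] box layout of 32-bit size, 'pssh' type, version and flags, 16-byte SystemID, then (for version 0) DataSize and Data, and it does not handle 64-bit box sizes or validate the payload.

```typescript
const PSSH_TYPE = 0x70737368; // ASCII "pssh"

interface PsshBox {
  systemId: string;  // 32 lowercase hex characters
  version: number;
  bytes: Uint8Array; // the complete box, header included
}

function toHex(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += (bytes[i] < 16 ? "0" : "") + bytes[i].toString(16);
  }
  return out;
}

// Build a version 0 box: size, type, version+flags, SystemID, DataSize, Data.
function buildPsshV0(systemIdHex: string, data: Uint8Array): Uint8Array {
  const box = new Uint8Array(32 + data.length);
  const view = new DataView(box.buffer);
  view.setUint32(0, box.length);
  view.setUint32(4, PSSH_TYPE);
  view.setUint32(8, 0); // version 0, flags 0
  for (let i = 0; i < 16; i++) {
    box[12 + i] = parseInt(systemIdHex.slice(i * 2, i * 2 + 2), 16);
  }
  view.setUint32(28, data.length);
  box.set(data, 32);
  return box;
}

// Split init data into boxes using only the size, type, version and SystemID fields.
function splitPsshBoxes(initData: Uint8Array): PsshBox[] {
  const view = new DataView(initData.buffer, initData.byteOffset, initData.byteLength);
  const boxes: PsshBox[] = [];
  let offset = 0;
  while (offset < initData.byteLength) {
    // Smallest possible box: 4 + 4 + 4 + 16 + 4 = 32 bytes.
    if (offset + 32 > initData.byteLength) throw new Error("truncated 'pssh' box");
    const size = view.getUint32(offset);
    if (view.getUint32(offset + 4) !== PSSH_TYPE) throw new Error("not a 'pssh' box");
    if (size < 32 || offset + size > initData.byteLength) throw new Error("invalid box size");
    boxes.push({
      systemId: toHex(initData.subarray(offset + 12, offset + 28)),
      version: view.getUint8(offset + 8),
      bytes: initData.subarray(offset, offset + size),
    });
    offset += size;
  }
  return boxes;
}

function groupBySystemId(boxes: PsshBox[]): { [systemId: string]: PsshBox[] } {
  const groups: { [systemId: string]: PsshBox[] } = {};
  for (let i = 0; i < boxes.length; i++) {
    const id = boxes[i].systemId;
    (groups[id] = groups[id] || []).push(boxes[i]);
  }
  return groups;
}

// Two boxes for one (made-up) DRM system, one box for another.
const SYSTEM_A = "00112233445566778899aabbccddeeff";
const SYSTEM_B = "ffeeddccbbaa99887766554433221100";
const a = buildPsshV0(SYSTEM_A, new Uint8Array([1]));
const b = buildPsshV0(SYSTEM_A, new Uint8Array([2, 3]));
const c = buildPsshV0(SYSTEM_B, new Uint8Array(0));
const initData = new Uint8Array(a.length + b.length + c.length);
initData.set(a, 0);
initData.set(b, a.length);
initData.set(c, a.length + b.length);
const groups = groupBySystemId(splitPsshBoxes(initData));
```

Here `groups[SYSTEM_A]` holds two boxes, which is exactly the case the current wording forbids and for which the application has no basis to pick one.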

joeyparrish commented 6 years ago

I believe that the behavior in Chrome is to concatenate all adjacent PSSH boxes together into one "encrypted" event without inspecting their system IDs. I don't know this for a fact, though. I'm sure someone could find the source and confirm.

I would suggest that the clause "each for a unique SystemID" be removed.

As for the application, apps should not need to "pick" one at all. The concatenated group of PSSHs is valid init data and can be provided to EME as-is. The user agent or CDM is responsible for making the choice of a supported PSSH from the init data.
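
A rough sketch of what that looks like from the application side, assuming the app has collected the PSSH boxes itself (for example from a manifest). `SessionLike`, `concatBoxes` and `requestLicense` are hypothetical names; `SessionLike` is only a stand-in for the `generateRequest()` method of `MediaKeySession` so that the sketch type-checks without DOM typings.

```typescript
// Stand-in for the one MediaKeySession method used here (an assumption made
// for this sketch; a real MediaKeySession accepts any BufferSource).
interface SessionLike {
  generateRequest(initDataType: string, initData: Uint8Array): Promise<void>;
}

// Concatenate the boxes in the order given, without looking inside them.
function concatBoxes(boxes: Uint8Array[]): Uint8Array {
  let total = 0;
  for (let i = 0; i < boxes.length; i++) {
    total += boxes[i].length;
  }
  const out = new Uint8Array(total);
  let offset = 0;
  for (let i = 0; i < boxes.length; i++) {
    out.set(boxes[i], offset);
    offset += boxes[i].length;
  }
  return out;
}

// Hand the whole group to EME as "cenc" init data; the choice of which box
// to use is left to the user agent or CDM.
function requestLicense(session: SessionLike, psshBoxes: Uint8Array[]): Promise<void> {
  return session.generateRequest("cenc", concatBoxes(psshBoxes));
}
```

When the init data comes from an "encrypted" event instead, no concatenation is needed: the event's `initDataType` and `initData` can be passed to `generateRequest()` unchanged.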

ddorwin commented 6 years ago

Yes, this language can probably be loosened a bit, especially to handle versioning. I don't recall the exact reason, but the purpose was probably to ensure consistent behavior for different Key Systems (i.e., not relying on multiple PSSH boxes for one but not others) and interoperability. UA implementations also need to be able to sanitize initData and "remove entries that are not needed by the CDM." It seems reasonable to say that only one of the PSSH boxes should be required to generate a license.

The Processing section probably handles this correctly, though it might be a good idea to provide more recommendations on the ordering of boxes when producing such content.
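
As an illustration of the sanitization step, here is a minimal sketch of a filter that drops boxes the CDM does not need while keeping every box for a supported SystemID, duplicates included, in their original order. This is not taken from any user agent; `filterInitData` and `systemIdAt` are hypothetical names, and the sketch assumes 32-bit box sizes and a caller-supplied list of supported SystemIDs as lowercase hex strings.

```typescript
// Hex string of the 16-byte SystemID of the box starting at `offset`.
function systemIdAt(data: Uint8Array, offset: number): string {
  let out = "";
  for (let i = offset + 12; i < offset + 28; i++) {
    out += (data[i] < 16 ? "0" : "") + data[i].toString(16);
  }
  return out;
}

// Keep only 'pssh' boxes whose SystemID is in `supported`.
function filterInitData(initData: Uint8Array, supported: string[]): Uint8Array {
  const view = new DataView(initData.buffer, initData.byteOffset, initData.byteLength);
  const kept: Uint8Array[] = [];
  let keptLength = 0;
  let offset = 0;
  while (offset < initData.byteLength) {
    // Smallest possible box: size, type, version+flags, SystemID, DataSize.
    if (offset + 32 > initData.byteLength) throw new Error("truncated 'pssh' box");
    const size = view.getUint32(offset);
    const isPssh = view.getUint32(offset + 4) === 0x70737368; // ASCII "pssh"
    if (!isPssh || size < 32 || offset + size > initData.byteLength) {
      throw new Error("malformed init data");
    }
    if (supported.indexOf(systemIdAt(initData, offset)) !== -1) {
      kept.push(initData.subarray(offset, offset + size));
      keptLength += size;
    }
    offset += size;
  }
  const out = new Uint8Array(keptLength);
  let pos = 0;
  for (let i = 0; i < kept.length; i++) {
    out.set(kept[i], pos);
    pos += kept[i].length;
  }
  return out;
}
```

Filtering by SystemID alone leaves the choice between several boxes for the same system (for example, different versions) to the CDM.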

joeyparrish commented 6 years ago

In a situation where a UA is very distinct from its CDM (such as Widevine on Firefox, where different parties are responsible for each), it might not be feasible for the UA to know which entries would be needed by the CDM. Only the CDM would be able to fully interpret the init data.

Would there be any harm in changing the language on that as well?

ddorwin commented 6 years ago

"remove entries that are not needed by the CDM" follows a "SHOULD," so I don't think that language needs to change.

Even when the UA and CDM are distinct, the UA is supposed to have sufficient information about the CDM. See, for example, https://w3c.github.io/encrypted-media/#cdm-security. As far as I know, Firefox integrates a specific version of the CDM and could thus obtain the necessary information. A more problematic case would be a user agent that just uses a platform CDM without such knowledge. As noted in the link above, that is generally problematic and discouraged.