Open jdsmith3000 opened 8 years ago
A few questions to make sure I understand the proposal.
initDataType and the format of the Initialization Data?
'cenc' only be used when the pssh boxes have version 0?
pssh box with version 1 would be 'cenc1' or something similar?
Some initial concerns:
initDataType in the encrypted event.
'cenc', 'cenc1', etc.
The concern is broader than just the initData format. Content will need to evolve, and we need a defined mechanism to support that. In the case of CENC, the pssh box version represents a useful indicator of broader content changes. The difference between 'cenc' and 'cenc1' would be meaningful to the implementation.
I'm not sure offhand that we need to support mixed 'pssh' versions. I'd be interested in what others think.
We do believe a change like this is necessary. We believe CENC will evolve, and we don't have a mechanism now to detect changes.
I thought that might be the case. However, initDataType is not the appropriate place to communicate these differences in the stream. (See https://github.com/w3c/encrypted-media/issues/105#issuecomment-189072229 .)
Differences in content need to be communicated via the contentType. I don't have a good solution for this at the moment. Perhaps we should change the title of this issue or open a new one to continue this discussion.
I'm not sure offhand that we need to support mixed 'pssh' versions. I'd be interested in what others think.
I do think this is necessary. Implementations and key systems will adopt new versions on different schedules. For example, the Common System requires version 1, but I suspect other systems are fine with version 0 in most cases.
I'm not proposing this for contentType changes, but for CENC ones. initData and what it represents will change and cannot currently be expressed to apps.
From the spec POV, Initialization Data and initDataType are (mostly) independent of the contentType or media data. As you noted, "The concern is broader than just the initData format."
You may be conflating a file format specification (ISO BMFF and CENC) with a mostly orthogonal data format (identified by "cenc"). It just so happens that the file format can contain a specific data format, but this is not strictly necessary. It may help to mentally replace "cenc" with "psshboxes" (and similarly, "webm" with "keyid"). The current names are historical accidents/mistakes.
I understand that we may need an additional way to describe the content, but initDataType is not it (even though it may initially seem related). I believe this probably needs to be handled in the MIME type somehow. It should be possible to describe the content in a MIME type independent of EME. To that point, it is odd that:
"video/mp4").
"video/mp4").
"video/mp4" uses the CENC protection scheme.
"video/mp4").
Other than item 3, these are independent of EME. All are independent of initDataType.
I believe the best way to signal the protection scheme is via the interface between the media stack and the content decryption module, as the signaling information is generally part of the media container, which the media stack has access to. It is also consistent with the handling of other encryption-related data, namely IVs and subsample encryption information.
- There is no MIME type distinction between clear ISO BMFF and encrypted ISO BMFF (both use "video/mp4").
This problem also exists for "video/webm", though items 2-4 do not.
We should also consider these issues when defining support for MPEG-2 TS CENC (#106).
Triaging per https://lists.w3.org/Archives/Public/public-html-media/2016Mar/0003.html. This is important, and we should continue discussing, but I don't believe it affects the spec.
I would be against splitting the 'cenc' initDataType or extending it with version information. This would really complicate things for me in a DASH client, and I don't see any benefit to either the application or to the CDM/browser.
The point wouldn't be to version initDataTypes arbitrarily, but to do so on meaningful changes that might affect CDM support. One way to do that is to tighten the definition of what we consider cenc now, so that if (or when) it changes in the future, apps have tools to detect the difference and gracefully handle the content.
We believe this issue should be considered V1 because it would better position EME for future change.
Can you give specific examples of such changes? How do you propose to tighten cenc? Also, my initial concerns remain unaddressed.
Some of your concerns discuss imprecisions in our use of MIME types, and aren't specifically issues that I was trying to resolve. I opened this issue to specifically discuss the evolution of cenc and compatibility over time with EME. If pssh box versions rev (as they did from V0 to V1), that should identify a meaningful difference that EME should recognize, and handle by checking the initDataType support from the CDM. That suggests we add identifying criteria to the registry for cenc beyond just the presence of pssh boxes. I've proposed trying to limit the pssh version to v0, and still think that is desirable. If it's not feasible, then recognizing either v0 or v1 would at least anchor the initDataType to a version, and provide a path for it to change.
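To make the v0/v1 distinction concrete, here is a minimal Python sketch of reading the version of each 'pssh' box in "cenc" Initialization Data. It assumes the usual CENC box layout (32-bit size, 'pssh' type, one version byte, three flag bytes, a 16-byte SystemID, and a KID list only when the version is greater than 0). The names PsshInfo, iter_pssh_boxes, and make_pssh are hypothetical, and the sketch does not handle 64-bit box sizes.

```python
import struct
from typing import Iterator, List, NamedTuple


class PsshInfo(NamedTuple):
    version: int
    system_id: bytes
    key_ids: List[bytes]  # always empty for version 0 boxes


def iter_pssh_boxes(init_data: bytes) -> Iterator[PsshInfo]:
    """Yield one PsshInfo per 'pssh' box in concatenated Initialization Data."""
    offset = 0
    while offset < len(init_data):
        if len(init_data) - offset < 32:
            raise ValueError("truncated 'pssh' box")
        size, box_type = struct.unpack_from(">I4s", init_data, offset)
        if box_type != b"pssh":
            raise ValueError(f"unexpected box type {box_type!r}")
        # Sketch limitation: only plain 32-bit sizes (no largesize, no size 0).
        if size < 32 or offset + size > len(init_data):
            raise ValueError("unsupported or inconsistent box size")
        version = init_data[offset + 8]  # FullBox version byte
        system_id = init_data[offset + 12:offset + 28]
        key_ids: List[bytes] = []
        if version > 0:
            # Version 1 adds a KID_count and KID list before the data field.
            (kid_count,) = struct.unpack_from(">I", init_data, offset + 28)
            if 36 + 16 * kid_count > size:
                raise ValueError("KID list overruns box")
            first = offset + 32
            key_ids = [init_data[first + 16 * i:first + 16 * (i + 1)]
                       for i in range(kid_count)]
        yield PsshInfo(version, system_id, key_ids)
        offset += size


def make_pssh(version: int, system_id: bytes,
              key_ids: List[bytes] = (), data: bytes = b"") -> bytes:
    """Build a 'pssh' box for experimentation (hypothetical helper)."""
    body = bytes([version]) + b"\x00\x00\x00" + system_id
    if version > 0:
        body += struct.pack(">I", len(key_ids)) + b"".join(key_ids)
    body += struct.pack(">I", len(data)) + data
    return struct.pack(">I", 8 + len(body)) + b"pssh" + body
```

Calling list(iter_pssh_boxes(...)) on data that concatenates a v0 and a v1 box shows that a single initData blob can carry mixed versions, which is the case a version-anchored initDataType would have to account for.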
I negotiate with MediaKeys before I append an init segment, and therefore before I know what initDataType I'm going to get from the encrypted event.
That means you know the possible initDataTypes in advance. You presumably control your content and what initDataType it implements. That wouldn't need to change, would it?
If we can limit this discussion to the Initialization Data format (i.e. PSSH box format), great. However, I'm concerned there may be other assumed implications - now or later - about the meaning and/or how it might be used. For example, [1]. Separately, I do think there are MIME type issues that need to be solved - perhaps we should file a separate issue or raise them in an appropriate forum.
As for the specific issue of "checking the initDataType support," I can see the theoretical potential usefulness of detecting PSSH version capabilities in requestMediaKeySystemAccess() to determine whether the client can support the media streams and/or specific behaviors. (I wonder whether it would really be useful since a content provider is likely to have fallback PSSH boxes to support older clients.) However, I'm not sure it makes sense in the other places where initDataType is used - MediaEncryptedEvent and generateRequest(). This is also where multiple PSSH box versions become an issue.
Thus, I think we have three questions:
requestMediaKeySystemAccess()?
MediaEncryptedEvent and generateRequest()).
[1] "The concern is broader than just the initData format. Content will need to evolve, and we need a defined mechanism to support that." -- https://github.com/w3c/encrypted-media/issues/149#issuecomment-189451188
I don't see the specific issues with using specific initDataTypes in MediaEncryptedEvent and generateRequest(). Please elaborate.
My primary concerns are with respect to initData format and how initData is used in content.
I don't see the specific issues with using specific initDataTypes in MediaEncryptedEvent and generateRequest(). Please elaborate.
These have been discussed above. For example, forcing applications to handle "cenc", "cenc2", etc. and the fact that the initData may contain various versions of PSSH boxes.
My primary concerns are with respect to initData format and how initData is used in content.
I'm not sure what you mean here, but this seems more about the content than the initData format. If a media format and/or initData format is changed such that the initData is used differently in that content, that seems like we need to change how the content is described more than the initData.
Again, it would really help this discussion to have specific examples, even hypothetical.
The hypothetical situation is when/if cenc changes in a way that requires CDM changes to work, how will that be detected and supported? Content standardization work is ongoing, and it is reasonable to expect that changes like this will happen. The MSE type accepts codec string versions, but we've not done anything to allow similar contentType evolution in EME.
I don't believe apps would be forced to deal with updated contentType strings. As a general rule, we would not expect these to rev unless they carried a necessary distinction for CDM support. Applications would pick up changes only if they cared about them.
Versioning the pssh box is an option we could check now that we believe would satisfy this need going forward. Alternatively, new CDM capabilities could be expressed through keySystem strings, though that is indirect and would require implicit app knowledge of content decryption requirements.
Correction: I meant the initDataType for the cenc versioning discussion above. contentType uses the MIME type and codec strings.
I propose that we be more specific about what we assign initDataType = “cenc” to. I’m not proposing that we create any new initDataType strings now, only that we create the potential to use them in the future.
Options are:
The first option sounds fine to me. Since v1 already exists, and we use it for the common format, I think we should include it. I think we could say that this currently includes v0 and v1. If/when there is a v2, we would still have the option to use this same type if appropriate, especially if the format is backwards compatible.
Regarding the second option, the location is not really a property of the type or format. For example, an application can derive the correctly-formatted data from anywhere and pass it to generateRequest(). However, we can specify under what conditions a user agent should fire an "encrypted" event and with what initDataType value. (These are currently combined in one section, but #105 should make this clearer.)
We might say that such events should only be fired when a 'pssh' is encountered within 'moov' and 'moof' if those are the only currently supported locations. We could add it to the existing sentence that starts "Each time one or more 'pssh' boxes are encountered..."
As for limiting the events to 'moov', I'd like to hear from others that are familiar with how VOD content is currently packaged and what applications currently expect. It's possible that some limitation would reduce the number of "encrypted" events when adapting, but I don't know whether that's practical (i.e. if that's the only time we'd get such events in some cases).
However, we should also consider that any limiting of when to fire an event, or varying the initDataType based on location/box hierarchy, could complicate implementations (i.e. those that simply walk the boxes).
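As a rough illustration of what "walking the boxes" with a location restriction might involve, the following Python sketch reports only those 'pssh' boxes that are direct children of a top-level 'moov' or 'moof'. The function names are hypothetical, the sketch handles only 32-bit box sizes, and it is not taken from any user agent implementation.

```python
import struct
from typing import List, Optional, Tuple

# Containers this sketch descends into, per the locations named above.
CONTAINERS = (b"moov", b"moof")


def box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size header (helper for experimentation)."""
    return struct.pack(">I", 8 + len(payload)) + box_type + payload


def find_pssh_locations(data: bytes) -> List[Tuple[str, int]]:
    """Return (container, byte offset) for each 'pssh' inside 'moov'/'moof'."""
    found: List[Tuple[str, int]] = []

    def walk(start: int, end: int, parent: Optional[bytes]) -> None:
        offset = start
        while offset + 8 <= end:
            size, box_type = struct.unpack_from(">I4s", data, offset)
            if size < 8 or offset + size > end:
                raise ValueError("unsupported or inconsistent box size")
            if box_type == b"pssh" and parent is not None:
                found.append((parent.decode("ascii"), offset))
            elif box_type in CONTAINERS and parent is None:
                # Descend one level only; 'pssh' is a direct child here.
                walk(offset + 8, offset + size, box_type)
            offset += size

    walk(0, len(data), None)
    return found


if __name__ == "__main__":
    pssh = box(b"pssh", bytes(24))
    stream = box(b"ftyp", b"isom") + box(b"moov", pssh) + box(b"moof", pssh)
    print(find_pssh_locations(stream))
```

The extra bookkeeping (tracking the parent and skipping top-level 'pssh' boxes) is small here, but it shows how a location rule adds state to an otherwise flat scan.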
@jdsmith3000, what do you want to do with this issue? Since it doesn't affect the main spec and we are unlikely to get to it this week, I'm going to move it to VNext. We can address it at any time, though.
The current EME and EME Registry specs allow apps to match initDataType support between CDMs and content. This matching works today, but is limited to a current snapshot of initDataType capabilities. These are very likely to evolve, and to present changes to EME that will not be detectable with our current initDataType matching. We need a defined way to handle changes, so that existing content continues to play on current implementations, AND new content can be detected by current implementations and played whenever possible.
For CENC, we specifically propose that the ‘pssh’ box version (e.g. ‘pssh’ version 0 now) be included in the ISO Common Encryption EME Stream Format and Initialization Data spec as a condition for the ‘cenc’ initDataType. This change would make ‘cenc’ initDataType matching more explicit, and would provide a mechanism for future content evolution.