Dash-Industry-Forum / DASH-IF-IOP

DASH-IF Interoperability Points issue tracker and document source code
[CPS] Uniform client workflow for multi-DRM scenarios #300

Open sandersaares opened 5 years ago

sandersaares commented 5 years ago

Following up on email threads, I make a proposal here to lay out a uniform model to handle basic multi-DRM interactions in DASH clients and services in a common way. This model would provide default behavior for various aspects but still allow a client to deviate from these defaults (if custom non-interoperable behavior is desired).

Benefit to the industry: avoid needless implementation-defined behavior, increasing interoperability without limiting implementation flexibility.

Benchmark for success: paste a protected video URL into dash.js and watch it acquire proof of authorization, attach it to a license request, know all the URLs of all the relevant services, and obtain the relevant licenses and content keys driven exclusively by data in the MPD.

I can organize test content together with conforming server-side APIs for authorization and license acquisition if DASH-IF decides to include this mechanism in its recommendations.

The initial draft proposal follows (to be restructured as a pull request if the initial discussions are positive).

Uniform client workflow for multi-DRM scenarios

This proposal specifies a content annotation and client processing model to enable common DASH multi-DRM scenarios to be implemented in a manner that is agnostic to a specific DRM system or to a specific DRM service vendor.

Part of this is already described by the security chapter of IOP v4.3. This proposal extends the scope and defines new DASH-IF specific constructs that cover aspects previously left implementation-defined.

Scope

A DASH client acquiring content and content keys over HTTP is assumed. The following aspects are in scope:

  1. How does a client determine which DRM system to activate and what initialization data to supply to it.
  2. How does a client determine the required set of keys and signal this to the DRM system.
  3. How does a client determine the URL of the license service.
  4. How does a client attach proof of authorization to the license request (if such proof is required).
  5. How does a client obtain proof of authorization (if such proof is required).
  6. How does a service provider signal defaults (preferences) in the MPD to govern client DRM behavior.
  7. How does a client interpret license acquisition feedback encountered in negative scenarios (e.g. authorization denied).

Existing coverage

IOP v4.3 makes some relevant recommendations (paraphrased):

  1. default_KID is the contract between the DASH client and the DRM system. The client requests access to one or more default_KIDs and the CDM initiates the DRM operations (license requests) required to enable this access.
  2. For W3C Clear Key a <laurl> element may be used to signal the URL of the license service.

This proposal builds upon these existing recommended practices.

License service URL

We define <dashif:laurl> that may be added under any ContentProtection descriptor. This provides the license service URL.

When both this element and a DRM-system-specific element are present, the client will use the latter.

Any license service URL provided by any MPD data is a default value that may be overridden by the DASH client.
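The precedence described above can be sketched as follows. This is only an illustration, not part of the proposal: the function name, its parameters and the `DASHIF_NS` namespace URI are assumptions made for the example, and extracting a DRM-system-specific URL is left to the caller because it differs per DRM system.

```python
import xml.etree.ElementTree as ET

# Assumption: the namespace URI bound to the "dashif" prefix. The proposal
# text does not state it, so adjust this to whatever the final text defines.
DASHIF_NS = "https://dashif.org/CPS"


def resolve_license_url(descriptor_xml, drm_specific_url=None, client_override=None):
    """Pick the license service URL for one ContentProtection descriptor.

    Precedence: client override, then DRM-system-specific URL, then
    <dashif:laurl>. Returns None when no URL is known.
    """
    if client_override:
        return client_override
    if drm_specific_url:
        return drm_specific_url
    descriptor = ET.fromstring(descriptor_xml)
    laurl = descriptor.find(f"{{{DASHIF_NS}}}laurl")
    if laurl is not None and laurl.text and laurl.text.strip():
        return laurl.text.strip()
    return None
```

A client that finds no URL at all (the function returns `None`) has to obtain one from the hosting app before it can request a license.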

Client and DRM system interactions

IOP already specifies this in sufficient detail, defining default_KID as the central contract. See chapter 7.7.9.

License requests must follow DRM system protocol

Each DRM system has its own request/response protocol. This is fine and the normal thing to do is use a DRM-system-specific protocol for license acquisition (at least as long as there is no standard one).

So, let's require that interoperable clients do the normal thing! In other words, clients shall forward the license request generated by the CDM to the license service as-is, and shall forward the license response generated by the license service to the CDM as-is, without modifying either.

Note: DRM technology providers tend to even require that implementations be compatible on a protocol level, so it could be said that this constraint also exists elsewhere already.

This constraint only applies to the body - request/response headers are always open to extensibility.
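A minimal sketch of this constraint, assuming the license request is sent as an HTTP POST (the proposal does not name the method; POST is an assumption here) and using a hypothetical helper name. The CDM message goes into the body untouched, and only headers are open to additions.

```python
import urllib.request


def build_license_request(license_url, cdm_message, extra_headers=None):
    """Wrap the opaque CDM message in an HTTP request without altering it.

    cdm_message is the exact byte string produced by the CDM. extra_headers
    is the only extensibility point (e.g. authorization or cookies).
    """
    # Assumption: POST is used; the body is passed through byte-for-byte.
    request = urllib.request.Request(license_url, data=cdm_message, method="POST")
    for name, value in (extra_headers or {}).items():
        request.add_header(name, value)
    return request
```

The response body would be handed to the CDM in the same way, as the raw bytes read from the HTTP response.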

License authorization model

A license service expects proof of authorization to be provided by the DASH client when performing a license request. It is the duty of the DASH client to provide that proof, which it obtains from an authorization service. The MPD can inform the client where it can get the proof of authorization.

Note: This is the primary model, which separates all the building blocks. Alternative models (merging them somewhat) are also described below and are technically equivalent in terms of the client processing model.

Obtaining proof of authorization

The authorization service URL is provided by a <dashif:authzurl> element that may be added under any ContentProtection descriptor. This is a default URL that may be overridden by the DASH client. If no URL is present in the MPD and no overriding URL is available to the DASH client, no proof of authorization is provided to the license server (this facilitates the alternative models described below).

The proof of authorization is obtained by performing an HTTP GET request to the authorization service URL. Relative URLs are resolved relative to the MPD URL. DASH clients may coalesce requests for proof of authorization of multiple default_KIDs if they use the same authorization service URL. The URL query string will contain a kids parameter listing the comma-separated default_KID values that authorization is requested for.

Example in MPD:

<dashif:authzurl>https://example.com/tenants/5341/authorize</dashif:authzurl>

After adding KIDs:

https://example.com/tenants/5341/authorize?kids=e34bd077-913f-455c-8173-de268d84eef8,ab7ef3c5-e1f9-4aa6-b6b7-ad2cfd51cdf1
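The URL construction can be sketched like this. The function name and the shape of its input (a mapping from default_KID to the authorization URL found in the MPD, or `None`) are assumptions made for illustration. The sketch resolves relative URLs against the MPD URL, coalesces KIDs that share a URL, and keeps any query parameters the MPD author already put into the URL.

```python
from urllib.parse import parse_qsl, urlencode, urljoin, urlsplit, urlunsplit


def build_authz_urls(mpd_url, authz_url_by_kid):
    """Return one authorization request URL per distinct authorization service URL."""
    kids_by_url = {}
    for kid, authz_url in authz_url_by_kid.items():
        if authz_url is None:
            continue  # no authorization service: no proof is requested for this KID
        absolute = urljoin(mpd_url, authz_url)
        kids_by_url.setdefault(absolute, []).append(kid)

    request_urls = []
    for absolute, kids in kids_by_url.items():
        parts = urlsplit(absolute)
        query = parse_qsl(parts.query, keep_blank_values=True)
        query.append(("kids", ",".join(kids)))
        # safe="," keeps the commas literal, matching the example above.
        request_urls.append(urlunsplit(parts._replace(query=urlencode(query, safe=","))))
    return request_urls
```

Coalescing is optional in the proposal; a client that prefers one request per default_KID can simply call the function once per KID.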

In order to execute authorization logic, the authorization service needs to know who the caller is. The mechanism for caller identification is implementation-defined but would presumably be based on HTTP headers (cookies, device ID, authorization headers or similar) that are defined by the app hosting the DASH client when establishing a session.

In case of 200 OK, the response body contains the proof of authorization as a JSON Web Token in compact encoding (aaaa.bbbb.cccc). The contents of the token are opaque and DRM vendor specific. Standard JWT header fields (in the aaaa part of the token) may be used by the DASH client to determine token expiration/validity (e.g. in case it decides to store the token for later use/reuse).
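A client that wants to reuse a stored token could check its expiration roughly as below. This is a sketch under assumptions: the token is a signed three-part JWT whose segments are readable (not encrypted), and the helper names and the 30 second safety margin are invented for the example. RFC 7519 defines `exp` as a registered claim that normally sits in the second (payload) segment, so the sketch checks the header segment first and then the payload. No signature verification is done, since the client treats the token as opaque.

```python
import base64
import json
import time


def _decode_segment(segment):
    """Decode one base64url JWT segment; return {} if it is not a JSON object."""
    try:
        padded = segment + "=" * (-len(segment) % 4)
        value = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:
        return {}
    return value if isinstance(value, dict) else {}


def token_expiry(token):
    """Return the numeric 'exp' value found in the header or payload, else None."""
    segments = token.split(".")
    if len(segments) < 2:
        return None
    for segment in segments[:2]:
        exp = _decode_segment(segment).get("exp")
        if isinstance(exp, (int, float)) and not isinstance(exp, bool):
            return exp
    return None


def is_token_reusable(token, now=None, margin_seconds=30):
    """True if the token declares an expiry that is still comfortably in the future."""
    exp = token_expiry(token)
    if exp is None:
        return False  # unknown validity: fetch a fresh token instead
    current = time.time() if now is None else now
    return current + margin_seconds < exp
```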

In case of an error response, the response code shall be appropriate (e.g. 403 for access denied) and the response body shall be structured as an RFC 7807 conforming JSON object providing details of the error, with Content-Type indicating application/problem+json.

Providing proof of authorization in a license request

If no authorization service URL is available for a given default_KID, no proof of authorization is provided by the DASH client.

License requests for default_KIDs that require separate proofs of authorization (i.e. that were not coalesced by the DASH client into a single proof of authorization request) must be made separately, attaching one proof to each license request. (Note: the flip side of this assumes the CDM provides some API to coalesce license requests, which is not the case with present day EME, although it may be the case in future versions).

Proof of authorization is attached as an Authorization HTTP request header with the Bearer type and the bearer token being the authorization JWT in compact form (without further encoding).
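The rules in this section can be sketched as a small planning step. The function name and the input shape (a mapping from default_KID to its proof token, or `None` when no authorization service URL was available) are assumptions for the example. Grouping the KIDs that have no proof into a single entry is a choice made here, not something the proposal prescribes.

```python
def plan_license_requests(proof_by_kid):
    """Group default_KIDs by proof of authorization.

    Returns a list of (kids, headers) pairs, one per license request.
    KIDs sharing a proof travel together; KIDs without a proof get no
    Authorization header.
    """
    kids_by_proof = {}
    for kid, proof in proof_by_kid.items():
        kids_by_proof.setdefault(proof, []).append(kid)

    plans = []
    for proof, kids in kids_by_proof.items():
        # The JWT is used in compact form, without further encoding.
        headers = {} if proof is None else {"Authorization": f"Bearer {proof}"}
        plans.append((kids, headers))
    return plans
```

With present day EME each entry may still turn into several license requests, since the CDM decides how requests are split; the same headers would then be attached to each of them.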

Alternative license authorization models

In some deployment models, a license proxy is used. It acts as a license service in terms of API but in reality performs the authorization checks and then forwards the license request to the actual license service, acting as a proxy.

In some deployment models, the license service itself also executes the authorization checks.

As far as a DASH client is concerned, in both of the above cases it is talking directly to a license service that simply requires no proof of authorization to be passed to it. The DASH client still needs to pass along any client-identifying data (e.g. HTTP cookies) so the service can execute its authorization logic.

Other models not described here

The above license authorization/acquisition models cover common scenarios seen on the internet. For sure there will exist DRM implementations that do not conform to these models (e.g. they use a custom protocol, do not even use HTTP for license acquisition, or require the proof of authorization to be passed in a different way). That is perfectly fine. Such implementations are simply not fully interoperable and will not benefit from the seamless integration that they would gain if they followed industry guidelines.

What keys become available is not always what keys were requested

The authorization token need not authorize the full set of default_KID values or, indeed, specify what set is authorized. After the license acquisition workflow completes, the DRM system will have a certain set of keys available. This set might not include everything that was requested - maybe some were not authorized, maybe some were authorized with conditions that cannot be satisfied by the CDM (e.g. HDCP is required but the CDM does not support enforcement).

Clients must accept the fact that the set of available content keys might not be what is desired by the client.
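A client could compare what it asked for with what the DRM system reports, for instance as sketched below. The function name is hypothetical, and the status strings are modelled on EME key statuses ("usable", "output-restricted" and so on), which is an assumption about the platform. KIDs are assumed to be normalized to the same textual form on both sides.

```python
def split_by_availability(requested_kids, key_statuses):
    """Separate requested default_KIDs into usable ones and the rest.

    key_statuses maps KID -> status string as reported by the DRM system.
    Returns (usable, unavailable), where unavailable maps each KID to the
    reported status, or to "missing" if the DRM system does not know the key.
    """
    usable = []
    unavailable = {}
    for kid in requested_kids:
        status = key_statuses.get(kid, "missing")
        if status == "usable":
            usable.append(kid)
        else:
            unavailable[kid] = status
    return usable, unavailable
```

The client can then restrict playback to representations whose default_KID is in the usable list instead of treating a partial result as a failure.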

Providing license acquisition error feedback

Existing license services all implement feedback using a custom mechanism. We define here a small step toward a standard feedback mechanism.

In case of an error response from the license service (as identified by a non-200 status code), the response body may be structured as an RFC 7807 conforming JSON object providing details of the error, with Content-Type indicating application/problem+json. Any other content-type indicates a custom implementation-specific feedback format. When RFC 7807 error data is returned, the HTTP status code shall accurately describe the error (e.g. 403 for access denied, 503 for availability issues, 500 for logic errors, 400 for bad input).
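Client-side interpretation of such a response might look like the sketch below. The function name and the shape of the returned dictionary are assumptions for illustration. The raw body is always kept so that the hosting application can still interpret custom formats itself.

```python
import json

PROBLEM_JSON = "application/problem+json"


def interpret_license_error(status, content_type, body):
    """Classify a license service response; returns None for 200 OK."""
    if status == 200:
        return None
    # Content-Type may carry parameters such as "; charset=utf-8".
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    result = {"status": status, "format": "custom", "problem": None, "raw_body": body}
    if media_type != PROBLEM_JSON:
        return result
    try:
        problem = json.loads(body)
    except ValueError:
        return result  # mislabelled body: fall back to custom handling
    if isinstance(problem, dict):
        # RFC 7807: an absent "type" member is treated as "about:blank".
        problem.setdefault("type", "about:blank")
        result["format"] = "rfc7807"
        result["problem"] = problem
    return result
```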

The MPD provides defaults

Anything in the MPD is just default values that may be overridden by the client at will. Indeed, the MPD may provide nothing at all (not even any DRM-specific ContentProtection descriptors), in which case the DASH client must come up with all the data on its own.

Client choices in DRM selection and initialization

First, what is required to activate a DRM system?

To explain the last bit - in practice, 95% of the PSSHs I see are just wrappers for the key ID and contain no other useful data, which is why I consider them also synthesizable for PlayReady and Widevine. Of course, this is not a universal rule.

Which systems are even candidates? I would consider the following:

Some DRM systems require initialization data to be provided. Where can initialization data come from?

  1. The MPD provides default initialization data.
  2. This can be overridden by the API if the app chooses to do so.

How the initialization data is provided (or whether it is at all) should have no impact on DRM system preference - it just determines if the DRM system remains a candidate or not.

I would only consider using init segment data as a last resort fallback (with an appropriate warning emitted to log output). It is not something that should happen with good content. I would only use that data if none of the candidate DRM systems can be activated based on API/MPD provided initialization data. When ClearKey is a candidate, this would never be the case as ClearKey can always be activated because its initialization data can be synthesized.

I would treat config like license server URLs the same as initialization data.

It may be that multiple DRM systems can be used (e.g. Chromecast supports both PlayReady and Widevine and I guess also ClearKey). In such a situation the app should be able to select a preferred DRM system. This choice may depend on different factors both technical (e.g. desired output resolution) and nontechnical (arbitrary clauses in contracts), so the mechanism of choice should be entirely up to the app (maybe give some callback a list of available systems and ask for order of preference). This mechanism could also be used to exclude certain DRM systems that the app does not wish to consider as valid candidates for activation.
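The selection logic described in this section could be sketched as follows. Everything named here is hypothetical: the system identifiers are plain strings chosen for readability, and `order_by_preference` stands in for the app callback suggested above. Initialization data only decides whether a system stays a candidate; the preference order comes entirely from the app.

```python
CLEARKEY = "clearkey"  # hypothetical identifier for the ClearKey system


def candidate_systems(platform_systems, mpd_init_data, api_init_data=None):
    """Return {system: init_data} for systems that can be activated.

    API-provided initialization data overrides MPD defaults. ClearKey stays
    a candidate without any data because its initialization data can be
    synthesized from the key IDs (represented here as None).
    """
    init_data = dict(mpd_init_data)
    init_data.update(api_init_data or {})
    candidates = {}
    for system in platform_systems:
        if system in init_data:
            candidates[system] = init_data[system]
        elif system == CLEARKEY:
            candidates[system] = None
    return candidates


def select_system(candidates, order_by_preference):
    """Ask the app to order (or exclude) candidates and return its first choice."""
    ordered = [s for s in order_by_preference(list(candidates)) if s in candidates]
    return ordered[0] if ordered else None
```

An app that returns an empty list from the callback excludes every system, in which case the sketch returns `None` and playback of protected content cannot start.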

Note that DASH-IF IOP guidelines forbid mixing of ClearKey signaling with real DRM system signaling in the same MPD. This is mostly a precaution against exposing content keys through accidental ClearKey inclusion.

sandersaares commented 5 years ago

Updated based on initial feedback received.

sandersaares commented 5 years ago

Merged last section from linked issue, for easier readability.

sandersaares commented 5 years ago

Adjusted text to make it a bit more clear that this provides default behavior for interoperable scenarios, without constraining custom use cases.

sandersaares commented 5 years ago

Proof of concept implementation has progressed a bit.

Axinom DRM now has endpoints accepting requests following this protocol: https://github.com/Axinom/Axinom.Drm.BearerAuthLicenseServerProxy

DASH-IF now has a test token provider service: https://github.com/Dash-Industry-Forum/test-vectors-drm-authz-token-provider (hardcoded tokens only for now)

Test video is available at: https://media.axprod.net/TestVectors/v7-MultiDRM-SingleKey/Manifest_1080p_Issue300Signaling.mpd

I am unable to contribute a client prototype, so the workflow cannot really be exercised at the moment due to lack of client support. In a client implementing this proposal, the above video would play without the need for any configuration of the player - all the DRM interactions would be MPD-driven.

Next action item for me is to format this as a pull request and go over the wording in detail, to align it with the main document. As always, comments & feedback most welcome!

sandersaares commented 5 years ago

PR created.

meladc commented 5 years ago

Thanks for putting this all together. A common interoperable client behavior would be a great achievement. One comment regarding authorization:

The URL query string will contain a kids parameter listing the comma-separated default_KID values that authorization is requested for.

I believe that the authorization should be based on contentID rather than on keyIDs.

Usually, authorization systems deal with content related entities, such as offers, packages, channelIDs, assetIDs, and the link between the content and keys is deferred to the DRM subsystem. The interface between the authorization subsystem and the DRM subsystem can be vendor specific, and does not need to be transparent to the DASH client. The client should get involved only when the license is returned. Then it can evaluate what keys are available compared with what keys are required. This step is needed anyway as there might be cases where not all authorized keys are returned in the license.

Authorization based on contentID also allows implementation of a stateless scheme in which keyIDs can be deterministically derived from contentID, hence sparing the need for managing a keystore DB to perform reverse mapping from keyID to contentID.

Authorization based on contentID can also assist with authorization of future keys, for which the exact keyIDs are not yet known. contentID has been mentioned in issue #338.

Based on the above, I would suggest adding an option to authorize content based on its contentID, and specifying that a DASH client needs to allow the application to configure it to use one of the options - authorization based on keyIDs or on contentID.

I wouldn't go for the naive option of flagging the authorization method in the manifest, as this is not a content property, but rather a system one. If more than one system uses this content, the authorization method can be different in each system.

sandersaares commented 5 years ago

Yes, the proposal does leave room to perform authorization based on the content ID. As the content ID is a mechanism opaque to the DASH client, this does not require action by the DASH client, rather it requires action by the MPD author to embed the content ID into the URL.

For example, the MPD could include an authorization service URL of the following form: https://example.com/authorize?contentid=foo-bar-123

The kids parameter would still be added but can be ignored by the authorization service.

Possibly I could better emphasize this capability in the text.

meladc commented 5 years ago

Thanks for this clarification. I also understand that specifying that the authorization token will be

a JSON Web Token in compact encoding (aaaa.bbbb.cccc).

is meant to let a standard DASH client use

Standard JWT header fields (in the aaaa part of the token) ... to determine token expiration/validity (e.g. in case it decides to store the token for later use/reuse).

However, since

The contents of the token are opaque and DRM vendor specific

and may also be encrypted for privacy reasons, I believe that it is important not to specify (as I've seen in other drafts) that the authorization information itself must include KIDs, or have a specific structure, unless there is a good reason for such a constraint.

As mentioned above, authorization services that use contentID may not even be aware of the KIDs used for the content, and will not include KIDs in their authorization token. I don't believe that this approach should be excluded from the IOP.

Hope this request makes sense.

sandersaares commented 5 years ago

I agree.

sandersaares commented 5 years ago

From the discussion around https://github.com/cta-wave/device-playback-task-force/issues/56 I understand there is a need to further outline the distinction between a DASH DRM system and an EME key system, which are not 1:1 equivalent. In effect, there can be multiple implementations of the same DRM system offered via EME, which the player must resolve during capability negotiations with the platform.

The EME robustness parameter is one of these capabilities. A platform may offer different implementations of the same key system at different robustness levels.

Furthermore, there is a "server certificate" concept used by some DRM systems (e.g. Widevine). We may benefit from some standard signaling in DASH for such a concept (in a DRM system agnostic manner).

KarlGallagher commented 4 years ago

@sandersaares I think it is worth considering / pointing out the impact of the proposed error handling protocol on embedded DRM clients or CE devices for which applications are only given access to a 'black box' DRM implementation.

In these cases, signalled errors (in HTTP header or body) are currently likely to be swallowed by the underlying implementation.

Maybe this means there need to be some explicit guidelines on expected compliant client behaviour defined in the IOP?

sandersaares commented 4 years ago

The DASH-IF implementation guidelines assume an EME-equivalent implementation, where any HTTP requests are performed by the app code, not the DRM system implementation. This is in line with what CTA WAVE is doing and what is also common on many Android based platforms (Android APIs are quite similar to EME/MSE in many regards).

That being said, I can certainly imagine alternative implementations existing in practice. What exactly would you propose as the appropriate guidance to publish in such a situation, though?

KarlGallagher commented 4 years ago

@sandersaares I fully understand your position here. However, there do exist some high-profile CE platforms for which the application has less control than anticipated here.

I think, at the least, it should be recommended that when the status code indicates an error, a compliant client should pass the body (if any) back to the consuming application.

That way, the parsing/interpretation of error response contents can still be handled in the application layer.

Thanks