video-dev / hls.js

HLS.js is a JavaScript library that plays HLS in browsers with support for MSE.
https://hlsjs.video-dev.org/demo

PlayReady EME "encrypted" PSSH initData payload ignored #6005

Closed jrivany closed 2 weeks ago

jrivany commented 9 months ago

Is your feature request related to a problem? Please describe.

I'm looking at supporting media encrypted with multiple DRM systems (e.g. PlayReady and Widevine). To do so, I've generated MP4 init segments containing two PSSH boxes.

Currently the PSSH parsing code that handles the initData payload is written to assume there is only one PSSH box being passed:

https://github.com/video-dev/hls.js/blob/master/src/utils/mp4-tools.ts#L1300

However the CENC initialization data spec states that this could be multiple boxes concatenated:

https://www.w3.org/TR/eme-initdata-cenc/#format
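
To illustrate what the CENC initData format allows, here is a minimal TypeScript sketch that walks a buffer of concatenated PSSH boxes and lists each box with its SystemID. The names `PsshBox`, `toHex`, and `splitPsshBoxes` are hypothetical and not part of hls.js; the sketch assumes 32-bit box sizes and stops at the first malformed box.

```typescript
// Hypothetical helper for illustration; not part of hls.js.
interface PsshBox {
  systemId: string; // 32 lowercase hex characters
  version: number;
  bytes: Uint8Array; // the complete box, including its header
}

const PSSH_FOURCC = 0x70737368; // 'pssh'

function toHex(bytes: Uint8Array): string {
  let hex = '';
  for (let i = 0; i < bytes.length; i++) {
    hex += (bytes[i] < 16 ? '0' : '') + bytes[i].toString(16);
  }
  return hex;
}

function splitPsshBoxes(initData: Uint8Array): PsshBox[] {
  const view = new DataView(initData.buffer, initData.byteOffset, initData.byteLength);
  const boxes: PsshBox[] = [];
  let offset = 0;
  // Smallest valid box: size(4) + type(4) + version/flags(4) + SystemID(16) + dataSize(4)
  while (offset + 32 <= initData.byteLength) {
    const size = view.getUint32(offset);
    const type = view.getUint32(offset + 4);
    if (size < 32 || offset + size > initData.byteLength) {
      break; // truncated, malformed, or 64-bit size; stop rather than guess
    }
    if (type === PSSH_FOURCC) {
      boxes.push({
        systemId: toHex(initData.subarray(offset + 12, offset + 28)),
        version: view.getUint8(offset + 8),
        bytes: initData.subarray(offset, offset + size),
      });
    }
    offset += size;
  }
  return boxes;
}
```

Called on initData that holds a Widevine box followed by a PlayReady box, this returns two entries, which a player could then filter by the SystemID of the selected key system.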

Describe the solution you'd like

I would like the EME controller to support media that contains multiple adjacent PSSH boxes for different key systems.

Additional context

No response

robwalch commented 9 months ago

Hi @jrivany,

Are you providing the correct KEY tags in your HLS playlists?

The parsePssh function you are referencing was added specifically for assets with clear-lead keys like https://storage.googleapis.com/shaka-demo-assets/angel-one-widevine-hls/hls.m3u8 where HLS.js receives an encrypted event from EME before encountering a fragment with a KEY tag in the HLS playlist. If it fails to extract a key ID, that shouldn't prevent the key system from being selected and key session started using playlist keys (assuming parsePssh returns null and not a bad key id).

That being said, I'd be happy to review and test changes that could support additional configurations.

jrivany commented 9 months ago

Hey Rob, thanks for the quick reply.

Are you providing the correct KEY tags in your HLS playlists?

In this instance no. I should provide more context: we're currently writing our own packaging layer, and were experimenting with the viability of different options.

In this particular experiment I was casually ignoring 4.4.4.4 in the spec, specifically:

If the Media Playlist file does not contain an EXT-X-KEY tag, then Media Segments are not encrypted.

In theory there was no technical reason we couldn't rely solely on the PSSH box in the content itself, which currently works fine when there's only a single PSSH box.

This is definitely a more fringe, off-spec use-case that would only be a mild simplification on the manifest generation side, which I'm now reconsidering the value of.

robwalch commented 9 months ago

In theory there was no technical reason we couldn't rely solely on the PSSH box in the content itself, which currently works fine when there's only a single PSSH box.

This is definitely a more fringe, off-spec use-case that would only be a mild simplification on the manifest generation side, which I'm now reconsidering the value of.

For clear-lead playlists where the pssh is in the init segment, but the first segment or two are not encrypted, CDM setup can begin in response to appending the init segment and EME dispatching the "encrypted" event.

In this example, the pssh is present in "init.mp4", but the first encrypted segment is "s3.mp4":

#EXTM3U
#EXT-X-VERSION:6
#EXT-X-TARGETDURATION:5
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-MAP:URI="init.mp4"
#EXTINF:4.000,
s1.mp4
#EXTINF:4.000,
s2.mp4
#EXT-X-DISCONTINUITY
#EXT-X-KEY:METHOD=SAMPLE-AES-CTR,URI="data:text/plain;base64,AAY=",KEYID=0x800000000,KEYFORMATVERSIONS="1",KEYFORMAT="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed"
#EXTINF:4.000,
s3.mp4

If HLS.js didn't begin the key session on the "encrypted" event, it would not begin until just before loading the third segment - something to consider if you are packaging media this way with more than one key system or multiple keys in the init segment.
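
The timing difference can be made concrete with a small sketch that scans a media playlist for the first segment covered by an EXT-X-KEY tag. `firstEncryptedSegment` is a hypothetical helper for illustration, not an hls.js API, and it only handles the simple line-based layout shown above.

```typescript
// Hypothetical helper for illustration; not an hls.js API.
function firstEncryptedSegment(playlist: string): string | null {
  let encrypted = false;
  const lines = playlist.split('\n');
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line.indexOf('#EXT-X-KEY:') === 0) {
      // METHOD=NONE switches encryption off for the segments that follow
      const method = /METHOD=([^,]+)/.exec(line);
      encrypted = method !== null && method[1] !== 'NONE';
    } else if (line !== '' && line.charAt(0) !== '#' && encrypted) {
      return line; // first segment URI that follows an active KEY tag
    }
  }
  return null;
}
```

For the playlist above this returns "s3.mp4", which is the point at which a playlist-driven key session would start if the "encrypted" event were ignored.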

Thanks for sharing the eme-initdata-cenc/#format details. I don't implement features without sample assets which is why this one only made it so far. I look forward to hearing which way you go.

cjpillsbury commented 1 month ago

Hey there @robwalch this actually showed up as an issue for us with multi-drm (widevine + playready pssh'es). We have an "in-app" (aka in the core of Mux Player) solution that accounts for it here https://github.com/muxinc/elements/pull/957, but I'm hoping to dig in and see if there might be some assumptions to unwind in e.g. the mp4 -> mp4 transmuxing. Happy to share test content out of band if you wanna dig in as well.

robwalch commented 1 month ago

Have a look at generateRequestWithPreferredKeySession. Would it help if hls.js included the "reason" for the key-session request in the generateRequest callback? If this is called more than once, or with different reasons prior to renewal, that would be because the request is being generated for a playlist key or for an encrypted media element event:

https://github.com/video-dev/hls.js/blob/cedf96d8ca40ada435dd985f64307261b5c4fcc0/src/controller/eme-controller.ts#L675-L718

It would help to know which of these you are encountering.

Please also have a look at the context argument for the generateRequest callback. initData and type are passed through from the "encrypted" event. It is the context: MediaKeySessionContext argument that carries decryptdata: LevelKey with parsed pssh which will be used as the pssh response if a generateRequest callback filter is not supplied.

robwalch commented 1 month ago

For ArrayBuffer/Uint8Array type issues please thumbs up or comment on https://github.com/video-dev/hls.js/pull/5849.

cjpillsbury commented 1 month ago

will follow up in more detail but the tl;dr:

  1. our init segments contain two pssh boxes, one for playready and one for widevine
  2. in a "hack environment" I've set up to do DRM + segmented media testing, the MediaEncryptedEvent will signal an initDataType: 'cenc' with both pssh'es as the initData. Passing these along to the CDM via generateKey() seems to work fine without modification (though I definitely wouldn't claim that will work for every permutation). aka the CDM appears to "figure out" which pssh to use (assuming minimally via its system id) and just tosses the other
  3. I'm almost positive (but will confirm), we're hitting the hls.js generateRequest() config function from the EME signalling via the MediaEncryptedEvent.
  4. in hls.js, by the time generateRequest() is invoked, the initData is actually a pssh box inside another pssh box. This doesn't appear to reflect either EME/CDM behavior or what the actual box structure looks like in the ISO-BMFF init segment. This is why I was assuming it might actually be a breakdown in the mp4 -> mp4 demux + remux. I have yet to dig in on that front though, so this is very much just a tentative theory.
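
One way to check the nesting described in item 4 is to test whether the Data payload of a PSSH box is itself a complete PSSH box for the same SystemID. The sketch below does that under simple assumptions: a single box in the buffer, 32-bit sizes, and hypothetical helper names (`psshDataOffset`, `unwrapNestedPssh`) that do not exist in hls.js.

```typescript
// Hypothetical helpers for illustration; not part of hls.js.
const PSSH = 0x70737368; // 'pssh'

function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  if (a.byteLength !== b.byteLength) return false;
  for (let i = 0; i < a.byteLength; i++) {
    if (a[i] !== b[i]) return false;
  }
  return true;
}

// Returns the byte offset of the Data payload inside a single pssh box, or -1.
function psshDataOffset(box: Uint8Array): number {
  if (box.byteLength < 32) return -1;
  const view = new DataView(box.buffer, box.byteOffset, box.byteLength);
  if (view.getUint32(0) !== box.byteLength || view.getUint32(4) !== PSSH) return -1;
  let offset = 28; // past size, type, version/flags, and SystemID
  if (view.getUint8(8) > 0) {
    // version 1 boxes carry a KID count and KID list before the data
    offset += 4 + view.getUint32(offset) * 16;
  }
  if (offset + 4 > box.byteLength) return -1;
  const dataSize = view.getUint32(offset);
  return offset + 4 + dataSize === box.byteLength ? offset + 4 : -1;
}

// If the payload is itself a pssh box with the same SystemID, return the inner box.
function unwrapNestedPssh(initData: Uint8Array): Uint8Array {
  const start = psshDataOffset(initData);
  if (start < 0) return initData; // not a single well-formed box; leave untouched
  const inner = initData.subarray(start);
  const nested =
    psshDataOffset(inner) >= 0 &&
    sameBytes(inner.subarray(12, 28), initData.subarray(12, 28));
  return nested ? inner : initData;
}
```

Buffers that hold several concatenated boxes are returned unchanged, so this only addresses the box-inside-a-box case.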

robwalch commented 1 month ago

This is why I was assuming it might actually be a breakdown in the mp4 -> mp4 demux + remux

For pssh extraction (from playlist KEY data URI) see getDecryptData in level-key. For pssh parsing of init data in "encrypted" media events see onMediaEncrypted > parsePssh. I don't think the mp4 pssh is modified (or read from fragment response data) in either case, or in any other part of hls.js.

I'm almost positive (but will confirm), we're hitting the hls.js generateRequest() config function from the EME signalling via the MediaEncryptedEvent.

The [eme] log lines will contain messages from generateRequestWithPreferredKeySession indicating which path is being taken.

cjpillsbury commented 1 month ago

perfect will pull these threads. Thanks, @robwalch!

robwalch commented 1 month ago

Currently the PSSH parsing code that handles the initData payload is written to assume there is only one PSSH box being passed

The description of this issue points to code that only handles PSSH found in Widevine and PlayReady KEY tags (which should not include PSSH data from other key-systems).

Passing these along to the CDM via generateKey() seems to work fine without modification

That is the expected behavior.

robwalch commented 1 month ago

I think what's going on (@jrivany and maybe @cjpillsbury although not sure about the nesting issue) is you want parsePssh to return only the PSSH for the selected key-system (or return a dictionary for selection once a system is selected).

Is this required on a specific UA or by your license server?

cjpillsbury commented 1 month ago

@robwalch We own all of the pssh + manifest generation, so we can make changes as appropriate. I am pretty confident we're spec-compliant with ISO/IEC 23001-7 PSSH box signaling (plus the underlying widevine + playready specs/expectations) and afaik there isn't any official formalization of EXT-X-KEY beyond clearkey + fairplay usage, though we'd happily conform to whatever the norm is for players/playback engines (assuming there is one), including hls.js.

I'm going to be diving into where the breakdown occurs under the hood in hls.js.

Re: UA question - With our current multi-drm setup (which includes EXT-X-KEYs for fairplay, widevine, and playready, pssh boxes for widevine and playready, and a sinf box that can be used for fairplay + MSE signaling), we're only seeing an issue on windows ("Edgeium" with widevine disabled for testing purposes). Also, just to reiterate, this worked cleanly using a bare bones impl directly integrating with MSE + EME and simply relying on MediaEncryptedEvent signaling for PlayReady.

cjpillsbury commented 1 month ago

ah actually looks like this is, in fact, from the pssh generated by the EXT-X-KEY URI. We're b64 encoding the full pssh, and it looks like mp4pssh() in hls.js is assuming just the pssh data. If this is an industry standard, we should change our playlists. If this is a bit more "loosey goosey" (which was my understanding, though I could be mistaken), there's probably room for improvement here on the hls.js side, including:

  1. configurable processing of the EXT-X-KEY translation
  2. some dum dum checks on the URI value (including potentially checking if it's the pssh)
  3. better resilience on EME failures that are caused by (presumptuous) URI parsing, since the pssh signaling is more explicitly formalized.

cjpillsbury commented 1 month ago

Would it help if hls.js included the "reason" for the key-session request in the generateRequest callback?

Yeah being able to effectively filter in/out based on where the key is sourced from, I think that would be a solid improvement (now that I've gotten my bearings on root cause in this case)

cjpillsbury commented 1 month ago

fwiw just confirmed this minor change to mp4pssh would resolve our issue without changing our playlists:

  if (keyids) {
    version = 1;
    kids = new Uint8Array(keyids.length * 16);
    for (let ix = 0; ix < keyids.length; ix++) {
      const k = keyids[ix]; // uint8array
      if (k.byteLength !== 16) {
        throw new RangeError('Invalid key');
      }
      kids.set(k, ix * 16);
    }
  // Code changes start here. Including above for context
  } else {
    // Honor byteOffset/byteLength in case data is a view into a larger buffer
    const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
    if (
      // Long enough to hold a pssh header (guards the reads below)
      data.byteLength >= 28 &&
      // Looks like the data is a pssh box (size field, then 'pssh' fourcc)
      view.getUint32(0) === data.byteLength &&
      view.getUint32(4) === 0x70737368 &&
      // with a matching System ID (16 bytes after size, type, version/flags)
      data.subarray(12, 28).every((idByte, idx) => idByte === systemId[idx])
    ) {
      // So just return the data as the pssh in this case
      return data;
    // Code changes end here (bracketing newly nested if)
    } else {
      version = 0;
      kids = new Uint8Array();
    }
  }

current code, for reference: https://github.com/video-dev/hls.js/blob/master/src/utils/mp4-tools.ts#L1306-L1324

robwalch commented 1 month ago

ah actually looks like this is, in fact, from the pssh generated by the EXT-X-KEY URI.

Right. So you should have a log message that looks like: [eme]: – Generating key-session request for "playlist-key" ...

We're b64 encoding the full pssh.

Base64 encoding is expected for PlayReady. The encoded data is expected to be a PlayReady Object.

and it looks like mp4pssh() in hls.js is assuming just the pssh data.

There was a patch (#5699) that went in that perhaps belonged in getDecryptData (parse the PlayReady Object "Challenge" out to pass to mp4pssh), but instead went into eme-controller in unpackPlayReadyKeyMessage: https://github.com/video-dev/hls.js/blob/6ef8363ff0705458ce016e080f3f0dd664684a47/src/controller/eme-controller.ts#L974-L1012

The "Challenge" (the pssh payload) must be parsed from the PlayReady Object to create a valid PSSH that will then be supplied as initData. It looks like (#5699) worked around this by detecting the xml content and parsing the challenge out just before sending the data to the license server. The correct place to do this would be in getDecryptData where the KEY tag data is parsed to create the Widevine/PlayReady PSSH with mp4pssh.
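
For readers unfamiliar with the box layout, the following sketch shows how a version 0 or version 1 PSSH box can be assembled around a payload, following ISO/IEC 23001-7. `buildPssh` is a hypothetical stand-in written for this illustration; it is not the hls.js mp4pssh implementation and may differ from it in details.

```typescript
// Hypothetical illustration of the pssh box layout; not the hls.js mp4pssh code.
function buildPssh(
  systemId: Uint8Array,
  data: Uint8Array,
  keyIds: Uint8Array[] = []
): Uint8Array {
  if (systemId.byteLength !== 16) {
    throw new RangeError('Invalid system id');
  }
  // Version 1 adds a KID count and KID list; version 0 has neither
  const version = keyIds.length > 0 ? 1 : 0;
  const kidBytes = version === 1 ? 4 + keyIds.length * 16 : 0;
  const size = 4 + 4 + 4 + 16 + kidBytes + 4 + data.byteLength;
  const box = new Uint8Array(size);
  const view = new DataView(box.buffer);
  let offset = 0;
  view.setUint32(offset, size); // box size, big-endian
  offset += 4;
  view.setUint32(offset, 0x70737368); // 'pssh'
  offset += 4;
  view.setUint32(offset, version * 0x1000000); // version (8 bits) + flags (24 bits)
  offset += 4;
  box.set(systemId, offset);
  offset += 16;
  if (version === 1) {
    view.setUint32(offset, keyIds.length);
    offset += 4;
    for (let i = 0; i < keyIds.length; i++) {
      if (keyIds[i].byteLength !== 16) {
        throw new RangeError('Invalid key');
      }
      box.set(keyIds[i], offset);
      offset += 16;
    }
  }
  view.setUint32(offset, data.byteLength); // size of the system-specific payload
  offset += 4;
  box.set(data, offset);
  return box;
}
```

The `data` argument is the system-specific payload only. Passing an already complete PSSH box as `data` produces the box-inside-a-box structure discussed earlier in this thread.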

cjpillsbury commented 1 month ago

We went ahead and updated our #EXT-X-KEY URI value to conform with the expectations in hls.js for PlayReady, but just to level set on the issue, I'm going to describe everything we had in our prior setup:

  1. Spec-compliant PSSH boxes (per ISO/IEC 23001-7) in our init segments for both Widevine and PlayReady:
[Screenshot 2024-07-24 at 8 05 17 AM]

As expected, the PlayReady (identified via its SystemId field) PSSH contains a PlayReady Object (PRO) as its Data field, with an enclosed, UTF-16 little endian PlayReady Header v4.3.0.0 (PRH) as its only record value.
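
As a rough illustration of that structure, here is a sketch that reads the records of a PlayReady Object and decodes the header XML. It assumes the commonly documented PRO layout (little-endian 32-bit total length, 16-bit record count, then records of 16-bit type, 16-bit length, and value, with type 1 holding the PlayReady Header). The function names are hypothetical.

```typescript
// Hypothetical helpers for illustration; layout assumed from the PlayReady Header docs.
interface PlayReadyRecord {
  type: number;
  value: Uint8Array;
}

function parsePlayReadyObject(pro: Uint8Array): PlayReadyRecord[] | null {
  if (pro.byteLength < 6) return null;
  const view = new DataView(pro.buffer, pro.byteOffset, pro.byteLength);
  // All PRO fields are little-endian: total length (4 bytes), record count (2 bytes)
  if (view.getUint32(0, true) !== pro.byteLength) return null;
  const count = view.getUint16(4, true);
  const records: PlayReadyRecord[] = [];
  let offset = 6;
  for (let i = 0; i < count; i++) {
    if (offset + 4 > pro.byteLength) return null;
    const type = view.getUint16(offset, true);
    const length = view.getUint16(offset + 2, true);
    offset += 4;
    if (offset + length > pro.byteLength) return null;
    records.push({ type, value: pro.subarray(offset, offset + length) });
    offset += length;
  }
  return records;
}

// Decodes UTF-16 little-endian bytes without relying on TextDecoder.
function decodeUtf16le(bytes: Uint8Array): string {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  let text = '';
  for (let i = 0; i + 1 < bytes.byteLength; i += 2) {
    text += String.fromCharCode(view.getUint16(i, true));
  }
  return text;
}

function playReadyHeaderXml(pro: Uint8Array): string | null {
  const records = parsePlayReadyObject(pro);
  if (!records) return null;
  for (let i = 0; i < records.length; i++) {
    if (records[i].type === 1) {
      return decodeUtf16le(records[i].value); // type 1 = PlayReady Header (PRH)
    }
  }
  return null;
}
```

The total-length check also gives a cheap way to tell a bare PRO apart from a full PSSH box, whose leading size field is big-endian.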

Widevine also conforms in relevant ways, but it did not have any issues, so I'll not dive in on that one.

  2. Base64-encoded versions of those exact PSSH boxes described above for the #EXT-X-KEY URI values for both Widevine and PlayReady:

(Real world example no longer available bc we updated our servers. Below is the updated implementation that is consistent with hls.js's assumptions)

# NOTE: THE WIDEVINE B64 VALUE IN THE URI IS AND ALWAYS HAS REPRESENTED THE ENTIRE PSSH IN OUR IMPL AND
# WE NEVER HAD PLAYBACK ISSUES WITH WIDEVINE IN HLS.JS
#EXT-X-KEY:METHOD=SAMPLE-AES,URI="data:text/plain;base64,AAAAknBzc2gAAAAA7e+LqXnWSs6jyCfc1R0h7QAAAHISEJzpiZ1pGCYl7bOsZcDYXhciWGV5SmhjM05sZEVsa0lqb2lPVGc0TXpBME9EWXdNamczTURBNE56YzJJaXdpZG1GeWFXRnVkRWxrSWpvaU9UZzRNekEwT0RZd05ERXpNall6T0Rnd0luMD1I88aJmwY=",KEYID=0x9ce9899d69182625edb3ac65c0d85e17,KEYFORMAT="urn:uuid:edef8ba9-79d6-4ace-a3c8-27dcd51d21ed",KEYFORMATVERSION="1"

# NOTE: THE PLAYREADY B64 VALUE IN THE URI IS *NO LONGER* THE ENTIRE PSSH AND IS INSTEAD THE PSSH'S 
# DATA FIELD/THE PRH, BECAUSE THAT IS ASSUMED BY HLS.JS'S IMPL FOR PROCESSING ITS VALUE
#EXT-X-KEY:METHOD=SAMPLE-AES,URI="data:text/plain;charset=UTF-16;base64,vgEAAAEAAQC0ATwAVwBSAE0ASABFAEEARABFAFIAIAB4AG0AbABuAHMAPQAiAGgAdAB0AHAAOgAvAC8AcwBjAGgAZQBtAGEAcwAuAG0AaQBjAHIAbwBzAG8AZgB0AC4AYwBvAG0ALwBEAFIATQAvADIAMAAwADcALwAwADMALwBQAGwAYQB5AFIAZQBhAGQAeQBIAGUAYQBkAGUAcgAiACAAdgBlAHIAcwBpAG8AbgA9ACIANAAuADMALgAwAC4AMAAiAD4APABEAEEAVABBAD4APABQAFIATwBUAEUAQwBUAEkATgBGAE8APgA8AEsASQBEAFMAPgA8AEsASQBEACAAQQBMAEcASQBEAD0AIgBBAEUAUwBDAEIAQwAiACAAVgBBAEwAVQBFAD0AIgBuAFkAbgBwAG4AQgBoAHAASgBTAGIAdABzADYAeABsAHcATgBoAGUARgB3AD0APQAiAD4APAAvAEsASQBEAD4APAAvAEsASQBEAFMAPgA8AC8AUABSAE8AVABFAEMAVABJAE4ARgBPAD4APAAvAEQAQQBUAEEAPgA8AC8AVwBSAE0ASABFAEEARABFAFIAPgA=",KEYFORMAT="com.microsoft.playready",KEYFORMATVERSION="1"

  3. Some comments in relation to the situation

Since we've changed our implementation server-side (and since we also identified a workaround client side via config), none of this is urgent for me/us. I'm just noting a few places that resulted in some pain on our side that could plausibly be improved by unwinding some in-code assumptions that should arguably be loosened (or at least conditionalized, per my quick/hack example code change ☝️).

robwalch commented 1 month ago

What the #EXT-X-KEY's URI value is "supposed to be/represent" for PlayReady and Widevine is not formally defined in any official specification afaik.

It should be a PlayReady Object. The fact that getDecryptData passes the entire decoded data URI to mp4pssh as pssh data (making an incorrect assumption as you've pointed out) appears to be a bug in hls.js.

The proposed solution for the bug in hls.js is to extract the pssh data (the 'Challenge' element) from the PRO. This can be achieved by moving the code from unpackPlayReadyKeyMessage to getDecryptData, where it actually belongs.

I suggest using a generateRequest callback to extract the pssh before a license request is made, as in https://github.com/muxinc/elements/pull/957, as a workaround until a fix is released.

How does that sound? Would you be willing to contribute a fix for this, or would you prefer I make a PR? If so, would you help test the fix? We would be careful to maintain compatibility with either type of data in the KEY tag based on whether or not the base64 decoded data includes PRO XML or the expected system ID bytes.
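
A compatibility check along those lines could look something like the sketch below. It is only an assumption about how such detection might work, not the fix that was eventually made: `classifyKeyUriPayload` and its return values are hypothetical, and the PRO test is a light structural check rather than full XML validation.

```typescript
// Hypothetical helper for illustration; not part of hls.js.
type KeyUriPayloadKind = 'pssh-box' | 'playready-object' | 'unknown';

// PlayReady SystemID 9a04f079-9840-4286-ab92-e65be0885f95
const PLAYREADY_SYSTEM_ID: number[] = [
  0x9a, 0x04, 0xf0, 0x79, 0x98, 0x40, 0x42, 0x86,
  0xab, 0x92, 0xe6, 0x5b, 0xe0, 0x88, 0x5f, 0x95,
];

function classifyKeyUriPayload(data: Uint8Array): KeyUriPayloadKind {
  const view = new DataView(data.buffer, data.byteOffset, data.byteLength);
  // A full pssh box: big-endian size equal to the payload length, then 'pssh'
  if (
    data.byteLength >= 32 &&
    view.getUint32(0) === data.byteLength &&
    view.getUint32(4) === 0x70737368
  ) {
    let matches = true;
    for (let i = 0; i < 16; i++) {
      if (data[12 + i] !== PLAYREADY_SYSTEM_ID[i]) {
        matches = false;
        break;
      }
    }
    if (matches) return 'pssh-box';
  }
  // A PlayReady Object: little-endian length equal to the payload length,
  // followed by a non-zero record count
  if (
    data.byteLength >= 10 &&
    view.getUint32(0, true) === data.byteLength &&
    view.getUint16(4, true) > 0
  ) {
    return 'playready-object';
  }
  return 'unknown';
}
```

With a result like this, the KEY tag handling could pass a full PSSH box through as initData, wrap a PRO in a new PSSH box, and fall back to the current behavior for anything else.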

cjpillsbury commented 1 month ago

I'm 💯 cool with working on a PR here, but I want to make sure there's alignment first.

It should be a PlayReady Object

I'm arguing that hls.js probably shouldn't assume (or at least should reduce the number of assumptions) that the URI value is anything in particular because it's not part of any specification. I don't think the current code makes a bad guess about the format; I think the problem is that it assumes a single format at all. In other words, accounting for the fact that the value may be a b64 encoded PRO isn't bad, but also accounting for the fact that it may be a b64 encoded PSSH would be good. I've explicitly seen #EXT-X-KEYs with URIs generated from 3rd party DRM providers that exactly conform to your current implementation. The thing is, this isn't part of any actual specification (afaik), so swapping one assumption for another doesn't feel ideal to me either.

robwalch commented 1 month ago

I think we're in alignment. What I am saying is HLS.js should expect a PRO but account for when it is not.

robwalch commented 1 month ago

There are a couple of other points we should align on:

  1. The issue is in getDecryptData extracting (or identifying) pssh data in PlayReady KEY tags only
  2. mp4pssh does not require changes as the problem you identified is not related to embedding multiple PSSH boxes in KEY tags (this should have been filed as a new issue)

cjpillsbury commented 1 month ago

Aligned on both. I'll open another issue. Sorry for misappropriating/coopting your issue, @jrivany !

robwalch commented 1 month ago

No worries - It could be the same root cause with a similar conclusion. (Although I think there is a point to be made with this issue about how to handle and document initData from "encrypted" events vs KEY tags.)

If you need this as a patch please make changes against patch/v1.5.x. I'm happy to cut a patch and merge the fix into dev as a follow up.

robwalch commented 3 weeks ago

Related, with workarounds suggested to filter session generation based on playlist or "encrypted" event keys: #6636

robwalch commented 3 weeks ago

Renamed this issue for comments above: https://github.com/video-dev/hls.js/issues/6005#issuecomment-1833764908

PSSH parsing of initData was part of the problem, but that parsing was only for Widevine so that hls.js could initialize a session on clear segments with pssh payloads prior to requesting segments with KEY URIs in the playlist (see shaka-packager "clear-lead" example). #6640 will fix PSSH parsing with multi-key-system assets, but will continue to ignore PlayReady keys in the media.