waitingforkey as spec'd could fire at different times in different browsers

cpearce commented 7 years ago

The "waitingforkey" event is spec'd in "7.3.4 Wait for Key" to be fired immediately once the algorithm is called. This algorithm is called from inside the "Attempt to decrypt" algorithm. I know that both Firefox and Chrome decode ahead of the current playback position in order to smooth over any irregularities in decoding timings. However, I note that Firefox (when decoding in software) decodes 10 frames ahead but Chrome decodes 4 ahead (IIRC). So if we fire "waitingforkey" as soon as we hit a frame that we can't decode, we may end up still playing for 10 frames or so in Firefox and 4 frames in Chrome. This means we could also dispatch events such as "timeupdate" after the "waitingforkey". Confusingly, we'd be advancing the current playback position while the readyState is HAVE_METADATA.

I recall that the "waitingforkey" event is primarily as an indicator for developers that playback has stalled.

So I propose we change the "Wait for Key" algorithm to have an extra step, inserted before step 4 which enqueues a task to dispatch the event:

Wait until all decrypted and decoded media data has been played.

NOTE: As a result of this step, the "waitingforkey" event will not fire until all decryptable media data has been played, and playback has stalled.

This would enforce that browsers fire "waitingforkey" at a consistent time.

cpearce commented 7 years ago

That is, Firefox and Chrome could end up firing "timeupdate" events after firing "waitingforkey", as we'd still be advancing the current playback position, and we won't be playing for the same duration of time as our decoded frame queues are different lengths. So we could end up firing different numbers of "timeupdate" events after the "waitingforkey" event.

ddorwin commented 7 years ago

I agree that it doesn't make sense to "be advancing the current playback position while the readyState is HAVE_METADATA." "Suspend[ing] playback" probably doesn't make sense either if there are available frames.

Perhaps we should reposition/refactor this algorithm to be based on decrypted frames being exhausted.

I'm not sure we can/want to block as suggested in this algorithm. Maybe that's okay since this is within the (effectively) decoding algorithms and thus not blocking the main thread.

/cc @foolip for comments

ddorwin commented 7 years ago

FWIW, there is also this NOTE after the text that specifies when to call the Encrypted Block Encountered algorithm:

The above step provides flexibility for user agent implementations to perform decryption at any time after an encrypted block is encountered before it is needed for playback.

mwatson2 commented 7 years ago

I agree a change is required to make the intended timing of the "waitingforkey" event unambiguous. Perhaps there are steps that should be executed "when media from an encrypted block is required for presentation", and these steps would check whether the block has been decrypted and fire the event / suspend playback if it has not.

foolip commented 7 years ago

Is the timing of the "waitingforkey" event already close to when playback will stall, such that it can't be used to trigger key refresh and have smooth playback? If so, how about just firing when playback has actually stalled?

cpearce commented 7 years ago

@foolip I'd guess "waitingforkey" is close enough to the end of playable ranges that the network round trip time for a new license would be long enough that playback would be interrupted. I also expect that it's unlikely that script could figure out what key they need to request a license for without outside information, as we don't expose information about keys the CDM knows it needs.

If we tried to piggyback on the logic for firing "waiting" that's already in the HTMLMediaElement spec, we'd potentially not fire "waitingforkeys" in the case where we don't have a key for the first sample. Because we only fire the "waiting" event "if the readyState was HAVE_FUTURE_DATA or more, and the new ready state is HAVE_CURRENT_DATA or less".

ddorwin commented 7 years ago

Yes, waitingforkey is not intended to enable smooth playback. It is an indicator that something has gone wrong.

ddorwin commented 7 years ago

Is there a proposed fix for this issue?

cpearce commented 7 years ago

We could add a new behaviour change to the HTMLMediaElement in the "The following modifications are made to the behaviour of the HTMLMediaElement" section, something like:

When the user agent cannot advance the current playback position because the CDM does not have a usable key to decrypt the media data ahead of the current playback position, or if the user seeks to a position in the media data for which the CDM does not have a usable key, the user agent SHALL run the Waiting for Key algorithm.

We could then remove the run waiting for key step in the Attempt to Decrypt algorithm.

foolip commented 7 years ago

https://github.com/w3c/encrypted-media/issues/336#issuecomment-255912199 makes sense to me. Be sure to also define and test the order compared to other events that are fired when reaching the end of buffered/playable data.

ddorwin commented 7 years ago

Does @cpearce's proposal cover the initial playback case (load algorithm)? That would seem to be neither advancing (time 0 is the current time) nor seeking.

ddorwin commented 7 years ago

PR #346. I have attempted to cover the initial playback case as well.

We could probably handle the two conditions via some "is blocked on a key" state, but then we'd need to reset it on seeking... which led me to realize we don't set other variables on seeking (#347).

Once we fix #347, we might be able to replace the "because the CDM does not have a key that is usable to decrypt the media data" text in the PR with "and the internal waiting for key value is true." Does that miss any cases?

@foolip, what else do we need to define regarding reaching the end of buffered/playable data? Isn't that defined somehow by the algorithms? We should never reach the point of decrypting blocks if we're at the end, right?

ddorwin commented 7 years ago

I'll create another PR that uses the waiting for key value (after fixing #347). Any concerns about that approach?

foolip commented 7 years ago

@ddorwin, you'll need use enough words so that pedantic tests can be written based on that to cover the order and timing of at least "timeupdate", "waiting" and "waitingforkey" events, and if they are fired in the same task or in separate tasks. (You can tell the difference at microtask checkpoints, most easily tested with Promise.resolve().then(/* at a microtask checkpoint */) I think.)

foolip commented 7 years ago

In other words, saying "queue a task for fire x and y" vs. "queue a task to fire x and queue a task to fire y" is observably different.

ddorwin commented 7 years ago

PR #349 is the alternate PR that uses an internal variable. I don't think reusing waiting for key makes sense since it controls whether the event is fired, so I used a new variable.

ddorwin commented 7 years ago

@foolip, I think we have been careful to be mindful of tasks, though I have not thought about the implications for all such events. Both proposed PRs probably give some leeway in when exactly the Wait for Key algorithm is run (similar to the HTML spec), but what happens after that should be clearly defined.

One item that may be of interest: The Wait for Key algorithm is not run in a task and and runs the following steps synchronously in order: sets readyState, queues a task to fire the "waitingforkey" event, then suspends playback.

w3c / encrypted-media

waitingforkey as spec'd could fire at different times in different browsers #336