Closed cpearce closed 7 years ago
That is, Firefox and Chrome could end up firing "timeupdate" events after firing "waitingforkey", as we'd still be advancing the current playback position, and we won't be playing for the same duration of time as our decoded frame queues are different lengths. So we could end up firing different numbers of "timeupdate" events after the "waitingforkey" event.
I agree that it doesn't make sense to "be advancing the current playback position while the readyState is HAVE_METADATA." "Suspend[ing] playback" probably doesn't make sense either if there are available frames.
Perhaps we should reposition/refactor this algorithm to be based on decrypted frames being exhausted.
I'm not sure we can/want to block as suggested in this algorithm. Maybe that's okay since this is within the (effectively) decoding algorithms and thus not blocking the main thread.
/cc @foolip for comments
FWIW, there is also this NOTE after the text that specifies when to call the Encrypted Block Encountered algorithm:
The above step provides flexibility for user agent implementations to perform decryption at any time after an encrypted block is encountered before it is needed for playback.
I agree a change is required to make the intended timing of the "waitingforkey" event unambiguous. Perhaps there are steps that should be executed "when media from an encrypted block is required for presentation", and these steps would check whether the block has been decrypted and fire the event / suspend playback if it has not.
Is the timing of the "waitingforkey" event already close to when playback will stall, such that it can't be used to trigger key refresh and have smooth playback? If so, how about just firing when playback has actually stalled?
@foolip I'd guess "waitingforkey" is close enough to the end of playable ranges that the network round trip time for a new license would be long enough that playback would be interrupted. I also expect that it's unlikely that script could figure out what key they need to request a license for without outside information, as we don't expose information about keys the CDM knows it needs.
If we tried to piggyback on the logic for firing "waiting" that's already in the HTMLMediaElement spec, we'd potentially not fire "waitingforkeys" in the case where we don't have a key for the first sample. Because we only fire the "waiting" event "if the readyState was HAVE_FUTURE_DATA or more, and the new ready state is HAVE_CURRENT_DATA or less".
Yes, waitingforkey
is not intended to enable smooth playback. It is an indicator that something has gone wrong.
Is there a proposed fix for this issue?
We could add a new behaviour change to the HTMLMediaElement in the "The following modifications are made to the behaviour of the HTMLMediaElement" section, something like:
When the user agent cannot advance the current playback position because the CDM does not have a usable key to decrypt the media data ahead of the current playback position, or if the user seeks to a position in the media data for which the CDM does not have a usable key, the user agent SHALL run the Waiting for Key algorithm.
We could then remove the run waiting for key step in the Attempt to Decrypt algorithm.
https://github.com/w3c/encrypted-media/issues/336#issuecomment-255912199 makes sense to me. Be sure to also define and test the order compared to other events that are fired when reaching the end of buffered/playable data.
Does @cpearce's proposal cover the initial playback case (load algorithm)? That would seem to be neither advancing (time 0 is the current time) nor seeking.
PR #346. I have attempted to cover the initial playback case as well.
We could probably handle the two conditions via some "is blocked on a key" state, but then we'd need to reset it on seeking... which led me to realize we don't set other variables on seeking (#347).
Once we fix #347, we might be able to replace the "because the CDM does not have a key that is usable to decrypt the media data" text in the PR with "and the internal waiting for key value is true." Does that miss any cases?
@foolip, what else do we need to define regarding reaching the end of buffered/playable data? Isn't that defined somehow by the algorithms? We should never reach the point of decrypting blocks if we're at the end, right?
I'll create another PR that uses the waiting for key value (after fixing #347). Any concerns about that approach?
@ddorwin, you'll need use enough words so that pedantic tests can be written based on that to cover the order and timing of at least "timeupdate", "waiting" and "waitingforkey" events, and if they are fired in the same task or in separate tasks. (You can tell the difference at microtask checkpoints, most easily tested with Promise.resolve().then(/* at a microtask checkpoint */)
I think.)
In other words, saying "queue a task for fire x and y" vs. "queue a task to fire x and queue a task to fire y" is observably different.
PR #349 is the alternate PR that uses an internal variable. I don't think reusing waiting for key makes sense since it controls whether the event is fired, so I used a new variable.
@foolip, I think we have been careful to be mindful of tasks, though I have not thought about the implications for all such events. Both proposed PRs probably give some leeway in when exactly the Wait for Key algorithm is run (similar to the HTML spec), but what happens after that should be clearly defined.
One item that may be of interest: The Wait for Key algorithm is not run in a task and and runs the following steps synchronously in order: sets readyState
, queues a task to fire the "waitingforkey"
event, then suspends playback.
The "waitingforkey" event is spec'd in "7.3.4 Wait for Key" to be fired immediately once the algorithm is called. This algorithm is called from inside the "Attempt to decrypt" algorithm. I know that both Firefox and Chrome decode ahead of the current playback position in order to smooth over any irregularities in decoding timings. However, I note that Firefox (when decoding in software) decodes 10 frames ahead but Chrome decodes 4 ahead (IIRC). So if we fire "waitingforkey" as soon as we hit a frame that we can't decode, we may end up still playing for 10 frames or so in Firefox and 4 frames in Chrome. This means we could also dispatch events such as "timeupdate" after the "waitingforkey". Confusingly, we'd be advancing the current playback position while the readyState is HAVE_METADATA.
I recall that the "waitingforkey" event is primarily as an indicator for developers that playback has stalled.
So I propose we change the "Wait for Key" algorithm to have an extra step, inserted before step 4 which enqueues a task to dispatch the event:
This would enforce that browsers fire "waitingforkey" at a consistent time.