Closed cjpillsbury closed 1 year ago
This is a great writeup. 👍 Love the additions to HLS/DASH to support this cleanly. I don't understand all the implications of those specifically, but I'm sure others can weigh in.
Since this is the media-ui-extensions I think it's worth laying out the UI problems that are being solved more. It's gonna be hard for others to really evaluate the API without that clear context. As a summary pass:
seekable.end(0) should be seekable.end(seekable.length-1), right? Not sure if it's actually possible for seeking but it definitely matters for buffered.
liveWindowOffset
Definition and name feel good to me. 👍
The most accurate/verbose name might be liveEdgeWindowOffset, right? In order to avoid confusion with anything else that might be considered a "live window". I feel like we should never refer to the DVR window specifically as "live". It's intentionally "(R)ecorded", not live, once you get behind the live edge. i.e. "Live + DVR" feels more accurate than "Live DVR".
Is this repurposing of the event acceptable?
If we can't point to any real reasons why this number might change otherwise, it feels good to try to bundle it as a starting point at least. Either that or we just say that every new state gets its own change event, and be done with it. I could go either way. The latter would remove friction in any specific independent proposal.
I feel like we should never refer to the DVR window specifically as "live". It's intentionally "(R)ecorded", not live, once you get behind the live edge. i.e. "Live + DVR" feels more accurate than "Live DVR".
Not sure what you mean here. Per your suggested name here https://github.com/video-dev/media-ui-extensions/issues/4#issuecomment-1344924246, our current mostly settled proposal on modeling DVR will rely on a property called targetLiveWindow. Is this you changing your mind? Am I misunderstanding something here?
seekable.end(0) should be seekable.end(seekable.length-1), right? Not sure if it's actually possible for seeking but it definitely matters for buffered.
Yes, it should be seekable.end(seekable.length-1), though, in practice they're going to be equivalent in most implementations for live streams.
seekable.end(seekable.length-1)
@heff - Per @gkatsev's callout, I believe this will always be identical to seekable.end(0) in browser implementations, but you're right, we might as well avoid unnecessary presumptions here and always refer to it as seekable.end(seekable.length-1)
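For reference, a minimal sketch of the "always use the last range" convention agreed on above. The helper name `getSeekableEnd` and the `fakeTimeRanges` stand-in are hypothetical and only here for illustration. Returning NaN for an empty `seekable` is an assumption, not something the proposal specifies.

```javascript
// Returns the end of the LAST seekable range rather than assuming index 0.
// `media` is anything shaped like an HTMLMediaElement; only `seekable` is read.
// Returning NaN when nothing is seekable is an assumption made for this sketch.
function getSeekableEnd(media) {
  const { seekable } = media;
  if (!seekable || seekable.length === 0) return NaN;
  return seekable.end(seekable.length - 1);
}

// Hypothetical stand-in for a TimeRanges object so the sketch runs outside a browser.
function fakeTimeRanges(ranges) {
  return {
    length: ranges.length,
    start: (i) => ranges[i][0],
    end: (i) => ranges[i][1],
  };
}

const media = { seekable: fakeTimeRanges([[0, 30], [40, 95]]) };
console.log(getSeekableEnd(media)); // 95
```

With a single seekable range (the common case for live streams) this returns the same value as `seekable.end(0)`.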
Is this you changing your mind? Am I misunderstanding something here?
No, you're right to be confused. :) From this context targetLiveWindow now sounds more misinterpretable. We're at least clearly using 'liveWindow' to mean two different things between the proposals now, and that's not great. I don't think we have to go change targetLiveWindow, but if we don't I'd lean towards something like liveEdgeOffset here instead. An alt for targetLiveWindow otherwise might be targetSeekableWindow. Open to either path, we should just avoid the double meaning.
we should just avoid the double meaning.
Agree 💯. I'm going back and forth on your rename proposals. As each hints at, the problem is both scenarios are about "windows" and both are related to live. One is the "live seekable window;" the other is the "live edge window." They're also both offsets. Since Names Are Hard™️, I'm leaning towards liveEdgeOffset. It unfortunately loses some context by dropping "window" that may introduce some ambiguity/confusion, but I think that'll be true for any renaming.
A couple of additional thoughts:
targetLiveSeekableWindow is pretty darn clear(er).
With that I like targetLiveSeekableDuration. Also open to Window.
For the live edge, would it be better to do liveEdgeStart? Feels like the most common operation is going to be:
if (currentTime > seekableEnd - liveEdgeWindowOffset) {
// show red light
}
When it could just be:
if (currentTime > liveEdgeStart) {
// show red light
}
With that we could lean on progress or durationchange events for updates. Or just timeupdate would even be fine.
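To make the comparison concrete, here is a rough sketch of the two checks side by side. It assumes `liveEdgeStart` would simply be the seekable end minus the offset, which is one reading of the suggestion and not something the proposal states. `player` is a hypothetical plain object, not a real media element, and the numbers are made up.

```javascript
// Check written against an offset-style property.
function isLiveWithOffset({ currentTime, seekableEnd, liveEdgeWindowOffset }) {
  return currentTime > seekableEnd - liveEdgeWindowOffset;
}

// Check written against a precomputed start-of-live-edge time.
function isLiveWithStart({ currentTime, liveEdgeStart }) {
  return currentTime > liveEdgeStart;
}

// Hypothetical state: 120s seekable end, 18s offset, playhead at 112s.
const player = { currentTime: 112, seekableEnd: 120, liveEdgeWindowOffset: 18 };

// Assumption: liveEdgeStart is derived from the same two values.
player.liveEdgeStart = player.seekableEnd - player.liveEdgeWindowOffset;

console.log(isLiveWithOffset(player), isLiveWithStart(player)); // true true
```

Under that assumption the two forms always agree. The difference is only whether the subtraction happens in the media element or in every consumer.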
I might be missing something but why not just HTMLMediaElement.getLiveSeekableRange()?
that covers both #4 and #6 in one familiar API.
it's also a bit similar to how HTMLMediaElement and MediaSource both have duration
I def am also on board with the naming being close to that LiveSeekable naming
I might be missing something but why not just HTMLMediaElement.getLiveSeekableRange() ?
@luwes responded in the PR to keep the conversation there, both in comments and by updating the proposal to hopefully add some clarity. The short version:
There are actually two distinct "live windows" we're modeling in #4 vs. here.
okay I thought somehow that this is true
start livestream time -> liveSeekableStart -> liveSeekableEnd -> real seekable.end(seekable.length-1)
and the proposed liveEdgeOffset = seekable.end(seekable.length-1) - liveSeekableEnd
is this not the case?
@luwes No that's not quite right. Check out the diagram I added here https://github.com/video-dev/media-ui-extensions/pull/7/files?short_path=6415912#diff-6415912cbdb551127eb5975514c274cb87904befd9ca77ec25808f682ab492d7 ("Diagram with HLS reference values for context") and let me know if that clears things up. Also, if you could, let's move the conversation to the PR to try to follow Gary's process.
Closing this Issue per our discussed process to avoid multi-channel conversations. Can re-open if corresponding proposal PR is rejected.
Overview & Purpose
The Problem
For live/"DVR" content, it's common to have some indication as to whether or not the viewer is currently playing "at the live edge". However, due to the nature of HTTP Adaptive Streaming (HAS), the live edge cannot be represented as a simple point/moment in the media's timeline. This is for a few reasons:
seekable.end(0) of a media element, which can then be used as a reference for any other live edge window computation.
(Visual representation may help here)
A concrete sub-optimal (not worst case) but in spec example - HLS:
Let's say a client player fetches a live HLS media playlist just before the server is about to update it with the following values:
The server then ends up updating the playlist with two larger-duration segments (in spec and happens under sub-optimal but not unheard of conditions) before the client re-requests the playlist after 4.99 seconds (the minimum amount of time the player must wait) and continues re-fetching the available segments, with an updated playlist of:
In this example, playback started 5.46 seconds behind the computed "LIVE EDGE" and, after a single reload of the playlist, ended up 11.45 seconds behind the next computed "LIVE EDGE" without any stalls/rebuffering. Note that, even in this example, we do not account for round trip times (RTT) for fetches, time to parse playlists, times to buffer segments, initial seeking of the player's playhead/currentTime, and the like. Note also that, even without those considerations, the playhead still ends up > 2 * TARGETDURATION behind the "LIVE EDGE".
The solution
Since this information can be derived from a media element's "playback engine"/by parsing the relevant playlists or manifest, the extended media element should have an API to advertise what the live edge window is for a given live HAS media source. Call this the "live window offset".
Additionally, due to consideration (3), above, we should treat the seekable.end(0) as the end time of a live stream accounting for the per-specification "holdback" or "delay".
Proposed API
Constrained meaning of seekable.end(0) as "live edge" (with HOLD-BACK/etc) for HAS
To account for the distinction between the live edge duration of the media stream as advertised by the playlist or manifest vs. the latest time a client player should try to play, based on per-specification rules and additional information also provided in the playlist or manifest, extended media elements SHOULD set the seekable.end(0) value to account for this offset. This shall be assumed for all computations of the "live edge window", where seekable.end(0) will be the presumed "end" of the window/range, already taking into account the aforementioned offset. With these offsets presumed, seekable.end(0) may be treated as synonymous with a client player's "live edge" and these terms should be treated as interchangeable in this initial proposal.
For RFC8216bis12 (aka HLS)
seekable.end(0) should be based on the inferred or explicit HOLD-BACK attribute value, where:
seekable.end(0) should be based on the explicit PART-HOLD-BACK (REQUIRED) attribute value, where:
For ISO/IEC 23009-1 (aka "MPEG-DASH")
seekable.end(0) should be based on the explicit MPD@suggestedPresentationDelay (OPTIONAL) attribute, when present, otherwise it may be whatever the client chooses based on its implementation rules. Per the spec:
(NOTE: there may be additional suggestions/recommendations available via the DASH IOP)
seekable.end(0) should be based on the ServiceDescription -> Latency@target attribute. Note that this value is an offset not of the manifest timeline, but rather of the (presumed NTP or similarly synchronized) wallclock time. Per the spec:
(NOTE: This implies that the value could change marginally over time based on precision and other wallclock time updates based on the runtime environment. However, since these differences should be minor, it's likely fine to treat this value as static for the case of this document and can likely be implemented as such in an extended media element)
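For the HLS case above, a minimal sketch of how a playback engine might derive the constrained seekable end. It assumes the live edge is the end time of the last segment in the playlist minus the hold-back value, which is an interpretation of "based on" and not wording from the proposal. The field names (`targetDuration`, `holdBack`, `partHoldBack`, `lowLatency`) are hypothetical names for already-parsed playlist values. The only spec fact used is that an absent HOLD-BACK implies three times the Target Duration.

```javascript
// Picks the hold-back (in seconds) that applies to a parsed HLS media playlist.
// All property names on `playlist` are hypothetical, not from any real parser.
function hlsHoldBack({ targetDuration, holdBack, partHoldBack, lowLatency }) {
  if (lowLatency) {
    // PART-HOLD-BACK is REQUIRED for low latency playlists, so there is no
    // inferred default here; NaN signals "unknown" in this sketch.
    return partHoldBack ?? NaN;
  }
  // An absent HOLD-BACK implies 3 * Target Duration.
  return holdBack ?? 3 * targetDuration;
}

// Assumption: seekable end = end of the playlist timeline minus the hold-back.
function hlsSeekableEnd(playlistEndTime, playlist) {
  return playlistEndTime - hlsHoldBack(playlist);
}

console.log(hlsSeekableEnd(60, { targetDuration: 6 })); // 42
console.log(hlsSeekableEnd(60, { targetDuration: 6, holdBack: 20 })); // 40
console.log(hlsSeekableEnd(60, { lowLatency: true, partHoldBack: 3 })); // 57
```

A DASH equivalent would swap in MPD@suggestedPresentationDelay or Latency@target, with the wallclock caveat noted above.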
liveWindowOffset
Definition
An offset or delta from the "live edge"/seekable.end(0). An extended media element is playing "in the live window" iff: mediaEl.currentTime > (mediaEl.seekable.end(0) - mediaEl.liveWindowOffset).
Possible values
undefined - Unimplemented
NaN - "unknown" or "inapplicable" (e.g. for streamType = "on-demand")
0 <= x <= Number.MAX_SAFE_INTEGER - known stable value for current stream
Recommended computation for RFC8216bis12 (aka HLS)
liveWindowOffset = 3 * EXT-X-TARGETDURATION
Note that this is a cautious computation. In many stream + playback scenarios, 2 * EXT-X-TARGETDURATION will likely be sufficient. However, with this less cautious value, there may be edge cases where standard playback will "hop in and out of the live edge," so recommending the more cautious value here.
liveWindowOffset = 2 * PART-TARGET
Unlike "standard" segments (#EXTINFs), parts' durations must be <= #EXT-X-PART-INF:PART-TARGET (without rounding). Also unlike "standard," HLS servers must add new partial segments to playlists within 1 (instead of 1.5) Part Target Duration after it added the previous Partial Segment. This means that, even under sub-optimal conditions, low latency HLS should end up with a much smaller liveWindowOffset.
Recommended computation for ISO/IEC 23009-1 (aka "MPEG-DASH")
TBD
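Pulling the definition, possible values, and recommended HLS computations together, a rough sketch might look like the following. The function names and the `targetDuration`/`partTarget` fields are hypothetical. Treating the presence of a PART-TARGET as "this is low latency HLS" and reporting "not in the live window" for undefined or NaN offsets are both assumptions of this sketch, not part of the proposal.

```javascript
// Recommended liveWindowOffset for HLS, per the computations above.
// `targetDuration` and `partTarget` are hypothetical names for parsed tag values.
function recommendedLiveWindowOffset({ targetDuration, partTarget }) {
  // Assumption: a finite PART-TARGET means the stream is low latency HLS.
  if (Number.isFinite(partTarget)) return 2 * partTarget;
  if (Number.isFinite(targetDuration)) return 3 * targetDuration;
  return NaN; // "unknown" or "inapplicable"
}

// Implements: currentTime > seekable.end(0) - liveWindowOffset.
function isInLiveWindow(mediaEl) {
  const offset = mediaEl.liveWindowOffset;
  // undefined (unimplemented) and NaN (unknown/inapplicable) are treated as
  // "not in the live window" here; that choice is an assumption.
  if (offset === undefined || Number.isNaN(offset)) return false;
  if (mediaEl.seekable.length === 0) return false;
  return mediaEl.currentTime > mediaEl.seekable.end(0) - offset;
}

// Hypothetical element-like object with one seekable range ending at 120s.
const mediaEl = {
  currentTime: 110,
  seekable: { length: 1, start: () => 0, end: () => 120 },
  liveWindowOffset: recommendedLiveWindowOffset({ targetDuration: 6 }), // 18
};

console.log(isInLiveWindow(mediaEl)); // true, since 110 > 120 - 18
```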
Open Questions
targetLiveWindow. Since this value represents a window for the "live edge" and not for "available live content to seek through/play", having both refer to the "live window" will likely be confusing. In the current related preliminary implementation in Media Chrome, we refer to the related attribute as the livethreshold. Should that be the name here as well? Do we want the name to try to capture the fact that this is an "offset" value from the "live edge"/seekable.end(0)?
livewindowoffsetchange event. While we cannot likely rely on any of the built in HTMLMediaElement events, we should be able to guarantee computation of the relevant values before dispatching the streamtypechange event, as documented in https://github.com/video-dev/media-ui-extensions/issues/3. Is this repurposing of the event acceptable? Should we consider a more generic event name that more clearly relates to states announced for stream type, DVR, live edge window offset, and potentially additional future properties/state?