Renditions List proposal

gkatsev commented 1 year ago

Closes #1

gkatsev commented 1 year ago

just realized this is missing recommendations on how it should get populated from hls and dash content, like the other proposals do.

luwes commented 1 year ago

Sorry for the late review. I'd scanned it over before, but this all looks great to me! I don't see any issues that would block starting development on this.

heff commented 1 year ago

I've had a few conversations IRL on this, including philosophical ones about this project's purpose in general. Starting with the tl;dr:

Let's raise the prop level to videoEl.videoRenditions, instead of building on videoTracks. And have that list represent only the current videoTrack's available renditions.
Let's not introduce any changes to videoTracks, at least not yet
Let's not introduce an API for enabling/disabling tracks as options in the adaptive algorithm, at least not yet
Let's not introduce an API for the "active" rendition (currently playing vs. selected by user), at least not yet
Let's not introduce an API for externally adding/removing tracks. It should be a read-only list, at least to start.
Everything else stays the same (selectedIndex, onaddrendition, etc.) - @luwes tell me if I missed anything

The result is then a simple videoEl.videoRenditions list, where selectedIndex lets you control which rendition is selected (or -1 for auto), and events let you know when the renditions have changed so you can update the select menu appropriately.
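To make that shape concrete, here is a minimal sketch of such a list. Only videoRenditions, selectedIndex (with -1 for auto) and onaddrendition come from this thread; the rendition fields, onremoverendition, onchange, the add/remove player hooks, and the fall-back-to-auto behavior on removal are assumptions made for illustration, not the proposal's actual API.

```typescript
// Hypothetical rendition shape; these field names are assumptions.
interface VideoRendition {
  readonly id: string;
  readonly width: number;
  readonly height: number;
  readonly bitrate: number;
}

class VideoRenditionList {
  private items: VideoRendition[] = [];
  private selected = -1; // -1 means "auto"

  // Callback-style events. Only onaddrendition is named in the thread;
  // the other two are assumed counterparts.
  onaddrendition: ((rendition: VideoRendition) => void) | null = null;
  onremoverendition: ((rendition: VideoRendition) => void) | null = null;
  onchange: ((selectedIndex: number) => void) | null = null;

  get length(): number {
    return this.items.length;
  }

  item(index: number): VideoRendition | undefined {
    return this.items[index];
  }

  get selectedIndex(): number {
    return this.selected;
  }

  set selectedIndex(index: number) {
    const isInteger = Math.floor(index) === index;
    if (!isInteger || index < -1 || index >= this.items.length) {
      throw new RangeError(`selectedIndex out of range: ${index}`);
    }
    if (index === this.selected) return;
    this.selected = index;
    if (this.onchange) this.onchange(index);
  }

  // Player-side hooks: the UI treats the list as read-only and never calls these.
  add(rendition: VideoRendition): void {
    this.items.push(rendition);
    if (this.onaddrendition) this.onaddrendition(rendition);
  }

  remove(id: string): void {
    let index = -1;
    for (let i = 0; i < this.items.length; i++) {
      if (this.items[i].id === id) { index = i; break; }
    }
    if (index === -1) return;
    const removed = this.items.splice(index, 1)[0];
    if (index === this.selected) {
      // Assumption: removing the selected rendition falls back to auto.
      this.selected = -1;
      if (this.onchange) this.onchange(-1);
    } else if (index < this.selected) {
      this.selected -= 1; // keep pointing at the same rendition
    }
    if (this.onremoverendition) this.onremoverendition(removed);
  }
}
```

A quality menu would re-render its options from onaddrendition/onremoverendition and write the viewer's choice back through selectedIndex.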

On to the broader conversations...

There's one related topic about the sometimes competing goals of "solve APIs only for UI purposes (intentionally avoiding the deeper complexities of video)" and "create APIs that the W3C might some day adopt". They compete because the video element APIs sometimes support more than what the UI might require.

In this case, building on videoTracks is intuitive because of the natural data structure of video tracks and renditions; however, quality selection UIs don't need that level of complexity. The UI doesn't need to know the available renditions of a non-active video track, or even what track they belong to. It just needs to know what renditions the viewer should be able to choose from, and what's currently selected. If it made no difference to the design, great! But stacking events/listeners on videoTracks and videoRenditions makes the existing shape complicated to build on and implement.

After this conversation I think the W3C adoption of these APIs still needs to be a consideration, but only a secondary concern. Otherwise it can distract from the UI problems we're specifically trying to solve for, and open the APIs back up to video technology complexity we're trying to avoid. Ultimately meaning these specs take a lot longer to ship, and [I think] puts them at risk of not actually being adopted by other video player projects.

The other related topic was about where in the stack this API lives. Is the list a direct representation of the renditions available in the manifest, or does the list only represent what renditions are available after configuration decisions have been applied? That is, is this API seeing what HLS.js sees and in a place to tell HLS.js what renditions it should choose from, or is HLS.js telling this API what to list after it has imposed a resolution cap (for example)? This is related to the feature of enabling/disabling renditions within the list, which changes whether a rendition can be chosen.

This is another situation where the UI doesn't need the feature of enabling/disabling tracks, so it's going deeper than the first goal and only there for the second goal. The main issue I see with this feature is it's imposing a configuration interface on players that they may not be able to support easily. For that reason alone I think we shouldn't include the feature in the first version of this spec at least. But I also think configuration of which renditions are available should just be up to players to provide.
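As a rough illustration of the second option, where the player decides what to list, the sketch below filters manifest renditions through the player's own configuration before anything reaches the UI-facing list. The names ManifestRendition, PlayerConfig, maxHeight and listedRenditions are hypothetical and do not correspond to any real HLS.js or dash.js API.

```typescript
// Hypothetical types; not taken from any real player library.
interface ManifestRendition {
  id: string;
  width: number;
  height: number;
  bitrate: number;
}

interface PlayerConfig {
  maxHeight?: number; // assumed player-specific resolution cap
}

// Returns the renditions the player is willing to expose to the UI,
// sorted from highest to lowest quality. The input array is not mutated.
function listedRenditions(
  manifest: ManifestRendition[],
  config: PlayerConfig
): ManifestRendition[] {
  const cap = config.maxHeight;
  const allowed =
    cap === undefined
      ? manifest.slice()
      : manifest.filter((r) => r.height <= cap);
  return allowed.sort((a, b) => b.height - a.height || b.bitrate - a.bitrate);
}
```

Under this arrangement the cap stays a player concern, and the rendition list needs no enable/disable surface of its own.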

Feel free to push back on any of that. But the general conclusions I'm coming to are:

We really need to focus on UI problems specifically, and sometimes that will mean abandoning our goal of W3C adoption. If we don't, I think shipping these specs will [continue to] take forever, and there will be real risk that no other players actually adopt them because of complexity.
We could do a better job of documenting the UI problems we're specifically solving for in these specs, to constrain the design and keep conversations focused.

mihar-22 commented 1 year ago

EDIT: Apologies, I know this isn't the place for this conversation. Happy to delete and move where appropriate.

Steve: We really need to focus on UI problems specifically, and sometimes that will mean abandoning our goal of W3C adoption. If we don't, I think shipping these specs will [continue to] take forever, and there will be real risk that no other players actually adopt them because of complexity.

I'm completely with you on this one Steve. W3C adoption is terribly slow (with good reason) but it gets in the way of innovation and exploration. I also think it's hard to standardize APIs we haven't explored practically, things we just won't know without real world experimentation and user feedback.

A simple process in my mind would be to come up with a list of media player UI/UX requirements which we can have conversations around, design APIs solely around that with a backing spec around the intricate/important video details, and then ship solutions to these problems in some package that we can all install and start using (e.g., npm i media-extensions).

import { VideoRenditionList, ... } from 'media-extensions'

Note: Ideally the API should speak for itself and I think this will dramatically help with adoption. Do note, preferably any API shipped is based on contracts and doesn't require the HTMLMediaElement itself.
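One way to read "based on contracts" is structural typing: the UI code depends only on an interface, so any object with the right shape works, whether or not it is an HTMLMediaElement. The sketch below assumes hypothetical interface names (RenditionLike, RenditionListLike, RenditionSource) and a hypothetical buildQualityMenu helper; none of these are defined by the proposal.

```typescript
// Hypothetical contracts: the minimum a quality menu needs to know.
interface RenditionLike {
  readonly height: number;
  readonly bitrate: number;
}

interface RenditionListLike {
  readonly length: number;
  item(index: number): RenditionLike | undefined;
  selectedIndex: number; // -1 means auto
}

interface RenditionSource {
  readonly videoRenditions: RenditionListLike;
}

interface MenuOption {
  label: string;
  index: number;
  selected: boolean;
}

// Builds menu options from anything that satisfies the contract.
function buildQualityMenu(source: RenditionSource): MenuOption[] {
  const list = source.videoRenditions;
  const options: MenuOption[] = [
    { label: "Auto", index: -1, selected: list.selectedIndex === -1 },
  ];
  for (let i = 0; i < list.length; i++) {
    const rendition = list.item(i);
    if (!rendition) continue;
    options.push({
      label: `${rendition.height}p`,
      index: i,
      selected: list.selectedIndex === i,
    });
  }
  return options;
}

// A plain object is enough; no media element is involved.
const fakeItems: RenditionLike[] = [
  { height: 1080, bitrate: 5000000 },
  { height: 720, bitrate: 2500000 },
];
const fakePlayer: RenditionSource = {
  videoRenditions: {
    length: fakeItems.length,
    item: (index: number) => fakeItems[index],
    selectedIndex: 1,
  },
};
const menu = buildQualityMenu(fakePlayer);
```

Because the helper never touches the DOM, the same code could be exercised against a custom element, a player wrapper, or a test stub.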

You're also on point about the goal difference. At the UI-layer we just want what gets the job done and leads to best UX, don't care about intricate details at that level. Extremely likely we won't expose the same primitives to our users. Browser APIs are generally shaped in the form of authoring, and as library authors we adapt them for consuming. There's always a gap. If we can just figure out what library authors need to build great UI/UX first, ship the code to start experimenting, we can then work backwards to primitives that might make sense internally. I think W3C should be the last step in the proposal process/stages.

Steve: Otherwise it can distract from the UI problems we're specifically trying to solve for, and open the APIs back up to video technology complexity we're trying to avoid. Ultimately meaning these specs take a lot longer to ship, and [I think] puts them at risk of not actually being adopted by other video player projects.

Agreed. My personal feedback is technical video conversations/specs are amazing and clearly required as they form the basis on why certain higher-level API decisions were made, but ultimately the main deliverable should be simplified and answer, "what player UI/UX challenge is this solving and how?" Preferably with code, docs, and examples. Priority is getting something we can start experimenting with out the door fast. I'm personally okay with mistakes, semver and proposal stages can iron that out.

Steve: We could do a better job of documenting the UI problems we're specifically solving for in these specs, to constrain the design and keep conversations focused.

Agreed. Top-level conversations should be centered around UI/UX. Easier for people to chime in and easier/faster adoption.

heff commented 1 year ago

Thanks for weighing in @mihar-22! And good reframing of the situation. I like the idea of W3C adoption being the last step, with real world implementation coming first, rather than projecting ahead.

We have a PR in media chrome exploring an implementation of this. Want to try it out in Vidstack and see if it works in that context?

video-dev / media-ui-extensions

Renditions List proposal #11