w3c / media-source

Media Source Extensions
https://w3c.github.io/media-source/

Delayed enabling or adding of tracks/streams #210

Closed DanielBaulig closed 6 years ago

DanielBaulig commented 6 years ago

We are currently exploring starting video playback without having to load audio when the video is muted and only adding an audio track / stream to the video later when it is unmuted. The goal is to use less data and provide a quicker and smoother playback experience especially in low bandwidth and high latency environments.

We explored three potential options, but all of them run into restrictions imposed by the specification or implementations. Let me outline the three options that we explored first and what problems they have and then ask some questions:

1) Start playing with a single video SourceBuffer and add an additional audio SourceBuffer later when needed. This is possible in theory, but the specification very clearly states that the UA may throw an exception if the media element has reached a HAVE_METADATA state or if the UA does not support adding additional tracks during playback. In practice, relevant UAs will throw an exception if we attempt this.

2) Create both an audio and a video SourceBuffer, but set the audio SourceBuffer's mode to 'sequence' and repeatedly append a silent dummy audio segment to fill the audio buffer with inaudible audio data. The idea was to fetch the actual audio data once the video is unmuted, switch the audio SourceBuffer back to 'segments', and append the actual audio data. However, switching from 'sequence' to 'segments' will throw an exception according to the spec.

3) Use the enabled attribute specified as part of AudioTracks to disable the only audio track in the audio SourceBuffer and re-enable it only once the video is unmuted. To our understanding, this should in theory be a spec-compliant way of achieving what we are looking to do, but in practice, relevant UAs do not implement VideoTracks and/or AudioTracks. From a chat with some UA vendors in the past, it sounded like there are no concrete plans across vendors to actually implement the Tracks APIs.
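
For illustration, options 1 and 3 could be guarded roughly like this. This is only a sketch: `MediaSourceLike`, `AudioTrackLike`, `MediaElementLike`, `tryAddAudioBuffer` and `syncAudioTracksToMuted` are hypothetical names, and the interfaces are structural stand-ins for the browser's MediaSource, AudioTrack and media element. Both helpers report failure instead of throwing, so the caller can fall back to another approach.

```typescript
// Structural stand-ins for the browser objects; only the members used here.
interface MediaSourceLike<B> {
  addSourceBuffer(mimeType: string): B;
}
interface AudioTrackLike {
  enabled: boolean;
}
interface MediaElementLike {
  muted: boolean;
  audioTracks?: ArrayLike<AudioTrackLike>; // absent where the Tracks API is not implemented
}

// Option 1 (hypothetical helper): try to add an audio SourceBuffer late.
// The UA may throw here once playback has started, so return null on failure.
function tryAddAudioBuffer<B>(mediaSource: MediaSourceLike<B>, mimeType: string): B | null {
  try {
    return mediaSource.addSourceBuffer(mimeType);
  } catch {
    return null;
  }
}

// Option 3 (hypothetical helper): mirror the muted state onto the audio tracks.
// Returns false when the UA exposes no audioTracks at all.
function syncAudioTracksToMuted(media: MediaElementLike): boolean {
  const tracks = media.audioTracks;
  if (!tracks) {
    return false;
  }
  for (let i = 0; i < tracks.length; i++) {
    const track = tracks[i];
    if (track) {
      track.enabled = !media.muted;
    }
  }
  return true;
}
```

A caller would check the return values and only rely on either option where the UA actually supports it.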

Hope people can share some thoughts and insight. Thanks!

wolenetz commented 6 years ago

  • Why does the specification not allow switching back to 'segments' mode once the SourceBuffer was in 'sequence' mode?

The spec allows this for all bytestreams except audio/mpeg and audio/aac, which auto-generate timestamps for appended coded frames and only work in 'sequence' mode (caveat: Chrome gives the same behavior as 'sequence' mode but allows setting 'segments' mode on such a SourceBuffer; this is a known bug, https://crbug.com/607372).

It could either be that you're using such bytestreams, or that you're attempting to set the 'mode' attribute on the SourceBuffer when it is in a state that disallows that operation (see the various conditions in the spec @ https://www.w3.org/TR/media-source/#dom-sourcebuffer-mode).

If the problem is that you need to align real (non-silent dummy) audio to begin at a particular time, consider this possible approach for the audio SourceBuffer:

  1. sourceBuffer.abort() (https://www.w3.org/TR/media-source/#dom-sourcebuffer-abort)
  2. sourceBuffer.mode = 'sequence'
  3. sourceBuffer.timestampOffset = where you want the next-appended real audio media to begin in the media timeline
  4. sourceBuffer.appendBuffer(non-dummy audio).
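
The four steps above can be sketched as follows. `AudioBufferLike` and `switchToRealAudio` are hypothetical names; the interface is a structural stand-in for the parts of SourceBuffer that the steps touch. Each of these calls can throw depending on the SourceBuffer's state, and this sketch does not try to handle that.

```typescript
// Structural stand-in for the SourceBuffer members used by the four steps.
interface AudioBufferLike {
  mode: string;
  timestampOffset: number;
  abort(): void;
  appendBuffer(data: ArrayBuffer): void;
}

// Hypothetical helper that runs the four steps in order.
// `startTime` is where the real audio should begin in the media timeline, in seconds.
function switchToRealAudio(
  sourceBuffer: AudioBufferLike,
  realAudio: ArrayBuffer,
  startTime: number
): void {
  sourceBuffer.abort();                     // 1. abort any pending append and reset the parser
  sourceBuffer.mode = "sequence";           // 2. make sure the buffer is in 'sequence' mode
  sourceBuffer.timestampOffset = startTime; // 3. position the next append
  sourceBuffer.appendBuffer(realAudio);     // 4. append the real (non-dummy) audio
}
```

In a page this would be called on the audio SourceBuffer once the video is unmuted and the real audio segment has been fetched.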

  • Why does the specification allow UAs to throw an exception when adding a SourceBuffer during playback? Is this something that is practically hard or maybe even impossible to implement?

At the time the spec was being driven to REC, it was practically hard, and maybe impossible, for some implementors.

  • Do browser vendors indeed not intend to implement the AudioTracks API?

Chrome is continually improving this portion of its implementation; I expect Chrome to eventually get this support (behind flag currently) shipped.

  • Are there any other ways to achieve what we would like to do?

Options to address this might become available if we incubate appropriate support for https://github.com/w3c/media-source/issues/160, e.g. letting video play even if there is no corresponding buffered audio, and vice versa, with app direction on tolerances and on the policy for jumping (or not) over a gap in the union of the selected/active tracks' buffered ranges. This is a frequently requested vNext feature.

  • Do people on this list see value in being able to achieve what we are trying to do?

I do :) Reducing network bandwidth for muted video playback should help API users and their users :)

-edited for clarity in steps 3 and 4, above.

DanielBaulig commented 6 years ago

@wolenetz Thanks a lot for your response. We had not actually tested option 2 in practice; our conclusion was based on a misunderstanding of the spec, specifically this step:

If generate timestamps flag equals true and new mode equals "segments", then throw a TypeError exception and abort these steps.

We incorrectly assumed the generate timestamps flag would be set as a result of setting mode to 'sequence'. I just successfully verified that approach 2 actually works.
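
To make the distinction concrete, here is a toy model of just the check quoted above. It is not the full mode setter algorithm. `ModeModel` and `AppendMode` are hypothetical names, and the model assumes the generate timestamps flag is fixed by the byte stream format when the SourceBuffer is created (true for audio/mpeg and audio/aac, per the earlier comment) and is not changed by assigning mode.

```typescript
type AppendMode = "segments" | "sequence";

// Toy model: the flag is an input decided by the byte stream format,
// not something that setting the mode can turn on.
class ModeModel {
  readonly generateTimestamps: boolean;
  private current: AppendMode;

  constructor(generateTimestamps: boolean) {
    this.generateTimestamps = generateTimestamps;
    // Formats that generate timestamps start out in 'sequence' mode.
    this.current = generateTimestamps ? "sequence" : "segments";
  }

  getMode(): AppendMode {
    return this.current;
  }

  setMode(next: AppendMode): void {
    // The quoted spec step: only flagged formats are barred from 'segments'.
    if (this.generateTimestamps && next === "segments") {
      throw new TypeError("'segments' is not allowed when timestamps are generated");
    }
    this.current = next;
  }
}
```

Under this model, a buffer created with the flag set to false can go to 'sequence' and back to 'segments' freely, which matches what was observed for approach 2.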

A feature similar to what #160 proposes would obviously make this significantly easier to implement.

I think we can close this issue for now. Thank you!