w3c / mediacapture-main

Media Capture and Streams specification (aka getUserMedia)
https://w3c.github.io/mediacapture-main/

How to handle changes to the set of tracks in a MediaStream assigned to an element via the srcObject attribute #453

Open · guidou opened 7 years ago

guidou commented 7 years ago

The spec says:

> The User Agent must always play the current data from the MediaStream and must not buffer.

However, the exact mechanism for this is not explained.

I tested adding tracks to and removing tracks from a stream with Edge, and it apparently runs the load algorithm every time the set of tracks changes, since it fires the emptied and loadstart events on track addition/removal.

On Firefox, adding or removing a track is reflected in what the element renders, but the full load algorithm does not run, since loadstart/emptied are not fired.

On Chromium, both adding and removing a track are ignored by the element. I was actually trying to fix this bug on Chromium and initially went the Edge way, but @foolip suggested that the Firefox approach might be more in line with the way the media element is supposed to work.

Should a specific approach be specified here, or are both approaches equally valid?
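For reference, the kind of probe described above can be sketched as follows. `logEvents` is an illustrative helper, not part of any API, and the browser wiring (shown in comments) assumes a page with camera/microphone permission:

```javascript
// Attach listeners for the named events and collect them in order, so you can
// see which ones an engine fires when the track set of a srcObject changes.
function logEvents(target, names, log = []) {
  for (const name of names) {
    target.addEventListener(name, () => log.push(name));
  }
  return log;
}

// In a page (requires getUserMedia permission):
//   const video = document.querySelector('video');
//   const log = logEvents(video, ['emptied', 'loadstart', 'loadedmetadata', 'playing']);
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
//   video.srcObject = stream;
//   const [track] = (await navigator.mediaDevices.getUserMedia({ video: true })).getVideoTracks();
//   stream.addTrack(track);    // an engine that reruns the load algorithm
//   stream.removeTrack(track); // would log 'emptied' and 'loadstart' here
```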

foolip commented 7 years ago

@ShijunS @jan-ivar @youennf, can you comment or find others for Edge/Gecko/WebKit who could?

stefhak commented 7 years ago

Thanks @guidou for starting to test this. I did write some of the spec parts, but I do think there may be gaps and unclear things there (not to mention that the media element and the track lists seemed to move a bit as well).

My intention at the time of writing was that adding and removing tracks would be equivalent to how it works when a file is assigned (using src=) to a video element and tracks come and go there. There is some text in the HTML spec about AudioTrackList/VideoTrackList objects being mutated.

Pehrsons commented 7 years ago

Good that you bring this up @guidou. When doing some work in this area for Gecko I found the most sensible approach was to:

That said, I am not sure Gecko's implementation fully follows this, though it's probably close. It does help that we don't support AudioTrack and VideoTrack (for now pref'd off), or we'd have to deal with issues such as:

The only text I can find on mutating the TrackLists concerns a track ending, and it doesn't mention whether another track should then be enabled or selected. Perhaps I missed something?

@stefhak mentions tracks coming and going when a media element plays a file, but to my understanding all tracks of a file src are known once metadata has been loaded, and this is the only case the spec supports.

I believe that MSE does support tracks that come and go, however, so it would be interesting to know how it works there.

ShijunS commented 7 years ago

The MSE spec allows some flexibility.

> For example, a user agent MAY throw a QuotaExceededError exception if the media element has reached the HAVE_METADATA readyState. This can occur if the user agent's media engine does not support adding more tracks during playback.

In case the user agent allows a new SourceBuffer or track to be added/removed, it'd make sense for the user agent to go through the load algorithm and fire the corresponding events. Apps can listen for the events and update the app UI accordingly. This seems like a good topic to bring to the MSE folks. It'd be ideal for MSE and MediaStream objects to have consistent behavior rather than making separate assumptions.
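Given the MSE language quoted above (a user agent MAY throw QuotaExceededError once HAVE_METADATA is reached), an app would need to be prepared for the exception. A minimal sketch, where `tryAddSourceBuffer` is a hypothetical helper name:

```javascript
// Attempt to add a SourceBuffer, treating QuotaExceededError as "this engine
// does not support adding more tracks during playback" rather than a bug.
function tryAddSourceBuffer(mediaSource, mimeType) {
  try {
    return mediaSource.addSourceBuffer(mimeType);
  } catch (e) {
    if (e.name === 'QuotaExceededError') {
      // The media engine refused a mid-playback track addition; the app
      // should fall back (e.g. skip the extra track) instead of failing.
      return null;
    }
    throw e; // NotSupportedError etc. are real errors
  }
}
```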

youennf commented 7 years ago

In WebKit, there should be no difference between MSE, gum, and peer connection media streams. Implementation should react to addition/removal of tracks as described by @Pehrsons.

ShijunS commented 7 years ago

That sounds like the right behavior, given the user agent will fire loadedmetadata and loadeddata events, etc. due to the track change.

stefhak commented 7 years ago

@Pehrsons : I was thinking about the text saying An AudioTrackList object represents a dynamic list of zero or more audio tracks... (and for video) in https://www.w3.org/TR/html51/semantics-embedded-content.html#audiotracklist-and-videotracklist-objects. I did not know all tracks had to be known from the start.

Anyway, that text should IMO be valid also if the source is a srcObject or a SourceBuffer.

Pehrsons commented 7 years ago

Ah, it does actually mention tracks being added or removed dynamically. That's good. However, I haven't found any language on how to handle such events.

I did not know all tracks had to be known from the start.

I don't know whether or not this is a strict requirement (it appears not, per above), but for playing a simple file this is true in practice - from parsing container metadata.

stefhak commented 7 years ago

@ShijunS good points in https://github.com/w3c/mediacapture-main/issues/453#issuecomment-302138343. We should probably have something similar (i.e. allow the html media element to throw if it can't handle one more track).

That, though, brings up the question of whether we should have similar errors on recorder, peer connection, etc.

alvestrand commented 3 years ago

At this stage, we should probably document the current state of things (if it's consistent) and leave it at that. @guidou @Pehrsons ?

guidou commented 3 years ago

I ran a test with Firefox, Safari and Chrome to see how they're handling this nowadays.

The test consists of the following steps:

  1. Assign a MediaStream with an audio track and no video track to a video element.
  2. Add a video track to the MediaStream assigned in step 1.
  3. Remove the video track from the MediaStream assigned in step 1.
  4. Add the same video track again to the MediaStream assigned in step 1.

Firefox and Chrome behave the same. They don't run the full load algorithm; they fire the "resize", "canplay" and "playing" events on step 2, and no events on steps 3 and 4. If the track added in step 4 has a different size, "resize" is fired.

Safari fires "resize", "loadedmetadata" and "canplay" on step 2; fires "resize" on step 3; and fires "resize", "loadedmetadata" and "canplay" again on step 4. I guess the reason for the resize events in steps 3 and 4 for Safari is that it internally changes the size on track removal while Chrome and Firefox do not.

Given that Firefox and Chrome agree, and that their behavior seems to match the media element spec, I think we should consider that behavior correct. Not sure if that requires changes to the latest version of the spec.
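The four-step test above might be scripted roughly like this. `makeStepLog` and `runFourStepTest` are illustrative names, and the timing is simplified (events fire asynchronously, so a real harness would wait between steps):

```javascript
// Events whose presence/absence distinguishes the engines in this thread.
const WATCHED = ['loadstart', 'emptied', 'loadedmetadata', 'resize', 'canplay', 'playing'];

// Pure helper: collect watched events, then report and clear them per step.
function makeStepLog() {
  const fired = [];
  return {
    record: (name) => fired.push(name),
    take: () => fired.splice(0, fired.length), // returns what fired, clears the log
  };
}

// Browser-only wiring (requires a page with mic/camera permission).
async function runFourStepTest(video) {
  const log = makeStepLog();
  WATCHED.forEach((n) => video.addEventListener(n, () => log.record(n)));

  // Step 1: assign an audio-only MediaStream via srcObject.
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  video.srcObject = stream;
  await video.play();
  log.take(); // discard the initial-load events

  const [track] = (await navigator.mediaDevices.getUserMedia({ video: true })).getVideoTracks();
  stream.addTrack(track);       // Step 2
  console.log('step 2:', log.take()); // NOTE: should really await a task first
  stream.removeTrack(track);    // Step 3
  console.log('step 3:', log.take());
  stream.addTrack(track);       // Step 4
  console.log('step 4:', log.take());
}
```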

eric-carlson commented 3 years ago

> I ran a test with Firefox, Safari and Chrome to see how they're handling this nowadays.
>
> The test consists of the following steps:
>
> 1. Assign a MediaStream with an audio track and no video track to a video element.
> 2. Add a video track to the MediaStream assigned in step 1.
> 3. Remove the video track from the MediaStream assigned in step 1.
> 4. Add the same video track again to the MediaStream assigned in step 1.
>
> Firefox and Chrome behave the same. They don't run the full load algorithm; they fire the "resize", "canplay" and "playing" events on step 2, and no events on steps 3 and 4. If the track added in step 4 has a different size, "resize" is fired.

This seems wrong. The media element spec says the 'resize' event should be fired:

> Whenever the intrinsic width or intrinsic height of the video changes (including, for example, because the selected video track was changed) ...

I would think that "the selected video track was changed" includes removing an active video track.

> Safari fires ... fires "resize", "loadedmetadata" and "canplay" again on step 4. I guess the reason for the resize events in steps 3 and 4 for Safari is that it internally changes the size on track removal while Chrome and Firefox do not.

Removing the video track shouldn't cause the readyState to drop below HAVE_METADATA, so firing 'loadedmetadata' and 'canplay' also seems wrong. I filed https://bugs.webkit.org/show_bug.cgi?id=225612 to investigate this.
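The 'resize' expectation being debated can be sketched as follows: track whether the intrinsic dimensions actually changed across 'resize' events. `sizeChanged` and `watchIntrinsicSize` are illustrative names, not from the spec:

```javascript
// Pure helper: did the intrinsic dimensions change between two snapshots?
function sizeChanged(before, after) {
  return before.width !== after.width || before.height !== after.height;
}

// Browser-only wiring: log intrinsic-size transitions, including the drop to
// 0x0 that removing the selected video track might (or might not) produce.
function watchIntrinsicSize(video) {
  let last = { width: video.videoWidth, height: video.videoHeight };
  video.addEventListener('resize', () => {
    const now = { width: video.videoWidth, height: video.videoHeight };
    if (sizeChanged(last, now)) {
      console.log(`intrinsic size: ${last.width}x${last.height} -> ${now.width}x${now.height}`);
    }
    last = now;
  });
}
```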