w3c / mediacapture-record

MediaStream Recording
https://w3c.github.io/mediacapture-record/
Other
104 stars 21 forks source link

Add replaceStream to MediaRecorder #186

Closed henbos closed 1 year ago

henbos commented 4 years ago

Fixes #167


Preview | Diff

jan-ivar commented 4 years ago

cc @Pehrsons

jan-ivar commented 4 years ago

Discussed this with @Pehrsons and we have some concerns with this API.

While browsers today appear limited to recording only one video track and one audio track, there's no such limitation in the spec, so we think we need to consider how to handle several tracks.

Say someone records a stream like this:

const stream = new MediaStream([videoTrack, englishAudioTrack, frenchAudioTrack]);
const recorder = new MediaRecorder(stream);
recorder.start();

How would they replace just the french audio? Tracks in a MediaStream unfortunately are unordered.

So we'd like to propose this instead:

await recorder.replaceTrack(frenchAudioTrack, directorsCommentaryTrack);

In addition, we propose to no longer stop the recorder if a stream's track-set changes, because it no longer makes sense to react to those. Instead, we'd copy the track-set in start(), and only stop on the first track ended.

If the tracks' isolation properties differ, replaceTrack would reject with SecurityError.

Ifstop() followed by start() is called again then the stream can be examined anew.

guest271314 commented 4 years ago

@jan-ivar

How would they replace just the french audio?

That is in fact a consideration that suggested be included in this PR at https://github.com/w3c/mediacapture-record/issues/167#issuecomment-531060600. In brief, by providing a means to set the order the tracks. Further, by providing a means to parse media files.

So we'd like to propose this instead:

await recorder.replaceTrack(frenchAudioTrack, directorsCommentaryTrack);

"instead" is not necessary language or intent. There is no reason that both replaceStream() and replaceTrack() cannot be specified and implemented.

In addition, we propose to no longer stop the recorder if a stream's track-set changes, because it no longer makes sense to react to those.

That essentially would resolve the issue and make the use case viable.

Instead, we'd copy the track-set in start(), and only stop on the first track ended.

That would be an issue for interoperability re Chromium, Chrome https://github.com/w3c/mediacapture-fromelement/issues/78.

If the tracks' isolation properties differ, replaceTrack would reject with SecurityError.

AFAIK isolationchange is not implement, correct? Do you mean unmute event?

guest271314 commented 4 years ago

@jan-ivar Re "instead" see https://github.com/w3c/mediacapture-record/issues/167#issuecomment-526226683 where @henbos mentioned specifying both replaceStream() and replaceTrack().

Ideally the Media Capture and Streams (main) will eventually abandon the concept of tracks not being ordered https://github.com/w3c/mediacapture-main/issues/611, https://github.com/w3c/mediacapture-record/issues/179 due to the fact of the issues arising when post-production on the WebM file(s) are performed.

For this input (some files may have audio track without video track or video track without audio track)

https://upload.wikimedia.org/wikipedia/commons/6/6e/Micronesia_National_Anthem.ogg#t=0,2 https://upload.wikimedia.org/wikipedia/commons/a/a4/Xacti-AC8EX-Sample_video-001.ogv#t=0,4 https://mirrors.creativecommons.org/movingimages/webm/ScienceCommonsJesseDylan_240p.webm#t=10,20 https://mirrors.creativecommons.org/movingimages/webm/ASharedCulture_480p.webm#t=22,26 https://nickdesaulniers.github.io/netfix/demo/frag_bunny.mp4#t=55,60 https://raw.githubusercontent.com/w3c/web-platform-tests/master/media-source/mp4/test.mp4#t=0,5 https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerBlazes.mp4#t=0,5 https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerJoyrides.mp4#t=0,5 https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/ForBiggerMeltdowns.mp4#t=0,6

the required mapping to merge the tracks, after creating the corresponding potentially omitted video or audio track

[
  {
    "audio": 1,
    "video": 0
  },
  {
    "audio": 0,
    "video": 1
  },
  {
    "audio": 1,
    "video": 0
  },
  {
    "audio": 0,
    "video": 1
  },
  {
    "audio": 0,
    "video": 1
  },
  {
    "audio": 0,
    "video": 1
  },
  {
    "audio": 0,
    "video": 1
  },
  {
    "audio": 0,
    "video": 1
  },
  {
    "audio": 0,
    "video": 1
  }
]

though is not guranteed to be in that same order for each run of the code. To merge with mkvmerge --append-to mapping is

1:0:0:1,1:1:0:0,2:1:1:0,2:0:1:1,3:0:2:1,3:1:2:0,4:0:3:0,4:1:3:1,5:0:4:0,5:1:4:1,6:0:5:0,6:1:5:1,7:0:6:0,7:1:6:1,8:0:7:0,8:1:7:1.

Result https://plnkr.co/edit/8J61Rw?p=info.

Even when WebM file produced by Chromium and Mozilla browsers have the same track order they still cannot be merged; an error is thrown, due to the vastly different parameters passed to the encoder.

Thus, there is still must ground to be covered for any sort of interoperability, if that term or concept is in fact being considered at all during this specification write-up. Setting consistent track order should eventually be addressed. There is no compelling reason to not have consistent track order, especially within the purview of MediaRecorder.

jan-ivar commented 4 years ago

https://github.com/w3c/mediacapture-main/issues/611 was closed. I have no hopes we'll define track order in MediaStream.

MediaRecorder could do what web audio does to make order deterministic, though uuid generation being random, it's not super practical.

I see no value in replaceStream if we have replaceTrack—the former can be implemented by the latter. In general we've been moving away from stream APIs, to avoid action at a distance.

AFAIK isolationchange is not implement, correct? Do you mean unmute event?

No, even though no browser currently exposes that or the isolated attribute, tracks still have "isolation properties" in all browsers, which can be observed e.g. with cross-origin content. We have to guard cross-origin content.

guest271314 commented 4 years ago

@jan-ivar

w3c/mediacapture-main#611 was closed. I have no hopes we'll define track order in MediaStream.

Do not entertain hope at all here. As a practical matter must defer to your experience and judgment re MediaStream specification.

I see no value in replaceStream if we have replaceTrack—the former can be implemented by the latter. In general we've been moving away from stream APIs, to avoid action at a distance.

Then define replaceTrack for MediaRecorder yourself. The original proposal is for replaceTrack as the mechanism that you apparently single-handedly pushed through WebRTC specification is well-suited to achieve the requirement of the use case, as demonstrated at at least four branches at code to work around current MediaRecorder implementation https://github.com/guest271314/MediaFragmentRecorder/branches/all?utf8=✓&query=webrtc-replacetrack, and several issues filed at various specification repositories at GitHub, et al.

At the front-end, since @henbos evidently has interest in extending MediaRecorder with replaceStream would rather have that specified than remaining with the same specification. Do gather your point.

Even if replaceStream is specified that should not stop you and interested parties from specifying replaceTrack. The proof of concept will be vetted by the language of the specification and borne out by the implementations therefrom, to the degree that implementers actually follow specifications until they decide to do otherwise. AFAICT no individual or entity is or can stop you from filing a PR to add replaceTrack to MediaRecorder right now, correct?

Should @henbos be asked to substitute replaceTrack for replaceStream and abandon their own proposal for the reasons you raised, for the purpose of having a uniform, cohesive concept moving forward?

How do you suggest to proceed?

Pehrsons commented 4 years ago

Discussed this with @Pehrsons and we have some concerns with this API.

While browsers today appear limited to recording only one video track and one audio track, there's no such limitation in the spec, so we think we need to consider how to handle several tracks.

Say someone records a stream like this:

const stream = new MediaStream([videoTrack, englishAudioTrack, frenchAudioTrack]);
const recorder = new MediaRecorder(stream);
recorder.start();

How would they replace just the french audio? Tracks in a MediaStream unfortunately are unordered.

So we'd like to propose this instead:

await recorder.replaceTrack(frenchAudioTrack, directorsCommentaryTrack);

I'll just comment on this that one can trivially polyfill the functionality of replaceStream with replaceTrack (modulo replacing the recorder's stream attribute). But polyfilling replaceTrack with replaceStream becomes hacky as one would have to replace one track at a time.

In addition, we propose to no longer stop the recorder if a stream's track-set changes, because it no longer makes sense to react to those. Instead, we'd copy the track-set in start(), and only stop on the first track ended.

If the tracks' isolation properties differ, replaceTrack would reject with SecurityError.

Ifstop() followed by start() is called again then the stream can be examined anew.

I thought some more about the need to for the replace method to return a promise. The only argument I've heard in favor of returning a promise is that then you know you can dispose of the old tracks without risking ending them in the recording before the new ones take effect.

I think this is unnecessary. The MediaRecorder could guarantee that recorder.replaceTrack(oldTrack, newTrack); oldTrack.stop(); replaces the track successfully as long as the track is not ended.

To stretch things a bit, the recorder could guarantee that the track is replaced successfully as long as the recorder is not in the inactive state.

This would help applications handle concatenating/swapping tracks when the application itself is not in charge of the lifetime of those tracks, like HTMLMediaElement's captureStream().

In fact, if one of two tracks the recorder is recording has ended, there's nothing to say in the spec that the recording of the ended track (let's call it the output track) has also ended. The spec says

If all recorded tracks become ended, then stop gathering data, and queue a task (...) Inactivate the recorder

(emphasis mine)

Though this highlights a race that would need to be fixed, in the case of:

const rec = new MediaRecorder(stream);
rec.start();
await new Promise(r => rec.onstart);
stream.getTracks().forEach(t => t.stop()); // stops gathering data and queues a task to inactivate
rec.replaceTrack(stream.getTracks()[0], newTrack); // rec.state is not inactive yet -- should this throw? -- or abort the task queued above?

Perhaps that'd be the reason we need to return a promise.

jan-ivar commented 4 years ago

@pehrsons If we can make those guarantees then I agree we shouldn't return a promise. Only if the asynchronous part can fail would a promise be needed. I don't think that's the case here.

replaceTrack should probably throw if either track is ended. That would avoid the race you mention.

jan-ivar commented 4 years ago

Also, with replaceTrack there's inherently no way to add additional tracks to a recording, whereas with replaceStream you'd be forgiven for not knowing that this limitation exists (it's in the prose).

Pehrsons commented 4 years ago

@Pehrsons If we can make those guarantees then I agree we shouldn't return a promise. Only if the asynchronous part can fail would a promise be needed. I don't think that's the case here.

replaceTrack should probably throw if either track is ended. That would avoid the race you mention.

@jan-ivar Personally I like that it solved the track-lifetime problem more than I'd be annoyed by having to solve the race.

Consider recording a stream from HTMLMediaElement.captureStream(). You could pipe an audio track through webaudio to be in full control of its lifetime. Then even when a video track from the media element ended before you managed to replace it with another (which if you managed to do it in time, would chop off a bit of the track's end), you can still do that after-the-fact. Or with a gUM track in case a user killed it through the browser's chrome.

guest271314 commented 4 years ago

@jan-ivar

Also, with replaceTrack there's inherently no way to add additional tracks to a recording, whereas with replaceStream you'd be forgiven for not knowing that this limitation exists (it's in the prose).

If gather the statement context correctly, yes there is. Given the case of recording multiple tracks at a single <video> element from HTMLVideElement.captureStream() execute captureStream() at a before any playback begins, when src of <video> is changed the MediaStreamTracks from the new media source are added to the original MediaStream from captureStream() and addtrack event is dispatched, where within the event handler replaceTrack() can be executed for each audio or video track, a 1:1 swap for new track:original track N times, where the MediaStream contains at most two tracks throughout the process https://github.com/guest271314/MediaFragmentRecorder/blob/webrtc-replacetrack-htmlmediaelement-capturestream-addtrack/MediaFragmentRecorder.html.

guest271314 commented 4 years ago

@Pehrsons

Consider recording a stream from HTMLMediaElement.captureStream(). You could pipe an audio track through webaudio to be in full control of its lifetime. Then even when a video track from the media element ended before you managed to replace it with another (which if you managed to do it in time, would chop off a bit of the track's end), you can still do that after-the-fact. Or with a gUM track in case a user killed it through the browser's chrome.

That approach is possible with both audio and video. The concept of this code is an infinite media stream initialized to #000000 video frames and audio silence (utilizing AudioWorklet, Worker, and canvas.captureStream()) where MediaStreamTracks can be added to the MediaStream at any time. When there is a "gap" between track additions or swaps the initial audio or video track is used for that respective track data. The entire procedure should be capable of being recorded by MediaRecorder with the separate capability to seek backwards or forwards for infinity given "an idealized computer with unlimited resources" (for brevity citing Wikipedia), though during testing the code the tab inevitably crashes (there should be some means to reduce or memory usage for the APIs used) https://plnkr.co/edit/Axkb8s?p=info.

The original issue proposing using the same or similar language to WebRTC RTCRtpSender.replaceTrack(). Ideally, neither a track not currently having data, nor the source <video> pausing, nor the track or MediaStreamTrack ending should be the cause of MediaRecorder stopping. #000000 video frames and audio silence can be recorded by default until the user explicitly executes stop().

Pehrsons commented 4 years ago

@jan-ivar

Also, with replaceTrack there's inherently no way to add additional tracks to a recording, whereas with replaceStream you'd be forgiven for not knowing that this limitation exists (it's in the prose).

If gather the statement context correctly, yes there is.

I think with "Add additional tracks" he means going from recording two tracks to three, or from 20 to 25, or the like. That would seem possible from just looking at the replaceStream API, even though it's not.

The original issue proposing using the same or similar language to WebRTC RTCRtpSender.replaceTrack(). Ideally, neither a track not currently having data, nor the source <video> pausing, nor the track or MediaStreamTrack ending should be the cause of MediaRecorder stopping. #000000 video frames and audio silence can be recorded by default until the user explicitly executes stop().

The spec says to end the recording when all tracks have ended. Changing that would break existing applications that depend on this feature so it's not really an option.

alvestrand commented 1 year ago

Closing because of lack of developer interest.

realies commented 6 days ago

I would prefer not to use a RTCPeerConnection workaround. @alvestrand, can we reopen the issue, please?