Closed eladalon1983 closed 1 month ago
To recap my view from today's editors' meeting, I see 3 things to decide on (with my preferred answers):
- opt-in (`getDisplayMedia({audio: true, appAssistedSurfaceSwitching: "include"})`)
- notification (`sourceswitch` event)
- app decision-point (late aka point-of-use through `event.preventDefault()`)
I'd still like to see an example of an application that benefits from this possibility.
> I'd still like to see an example of an application that benefits from this possibility.
In today's meeting the early decision example shown was:
```js
getDisplayMedia({appAssistedSurfaceSwitching: "include", …})

controller.onsourceswitch = event => {
  video.srcObject = event.stream;
};
```
But this will glitch in all browsers, even for same-type switching, because it reruns the media element load algorithm.
A late decision seems inherently needed to fix this glitch for the subset of same-type switching. E.g.
```js
controller.onsourceswitch = event => {
  if (!compatibleTypes(video.srcObject, event.stream)) {
    event.preventDefault(); // Use switch-track model
    video.srcObject = event.stream;
  }
};
```
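The snippet above assumes a `compatibleTypes` helper that this thread never defines. One plausible sketch, purely illustrative, treats two streams as compatible when they expose the same set of track kinds (so a same-type switch can be injected without rerunning the media element load algorithm):

```js
// Hypothetical helper (not defined anywhere in this thread): two streams
// are "compatible" here if they expose the same set of track kinds,
// e.g. both video-only, or both audio+video.
function compatibleTypes(a, b) {
  const kinds = s => s.getTracks().map(t => t.kind).sort().join();
  return kinds(a) === kinds(b);
}
```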
Glitching may similarly happen with other sinks, like MediaRecorder or WebAudio.
I don't fully understand what is being asserted here. A clarification would be welcome.
I also note this interesting bit: (Emphasis mine.)
> A late decision seems inherently needed to fix this glitch for the subset of same-type switching.
Does that mean you support dropping the late-decision requirement for non-same-type switching?
This issue was discussed in the WebRTC WG meeting of 12 December 2023 (Dynamic Switching in Captured Surfaces).
But I am willing to lean in and actually claim it. Yes, developers need the early-decision, because cross-surface-type(!) source-injection is a footgun. Consider the code in this fiddle: https://jsfiddle.net/eladalon/Ly8a3wcs/
I can now substantiate this claim in a more persuasive manner. Try out captured-surface-control.glitch.me using Chrome Beta/Canary. Observe:
Applications built before cross-surface-type source-switching was possible had no reason to expect that `getSettings().displaySurface` might be mutable, and they are not robust to these changes.
-- Note: Of course Captured Surface Control is not a standard API. Assume for the sake of argument that it never will be. The whole point here is to show that in the future, we could credibly add APIs that would work for some surface types but not for others, and that unexpected switching would break apps.
> I don't fully understand what is being asserted here. A clarification would be welcome.
It's asserting that injection and its alternative have different side-effects, and which ones an app prefers might differ based on what surface the end-user chose to switch to/from (e.g. whether both or neither have audio). E.g.
- injection behaves like `replaceTrack` ~ without waiting for a renegotiation round-trip, but it remains an app decision whether to take advantage of this and how to handle edge-cases (add audio)

> A late decision seems inherently needed to fix this glitch for the subset of same-type switching.

> Does that mean you support dropping the late-decision requirement for non-same-type switching?
I've seen no proposal for how an app might specify its preferences for the different surfaces a user might pick up-front, but am happy to compare complexity of anything presented.
> Applications built before cross-surface-type source-switching was possible had no reason to expect that `getSettings().displaySurface` might be mutable, and they are not robust to these changes.
How are they not robust to these changes? Do you have an example that is not experimental?
> It's asserting that injection and its alternative have different side-effects, and which ones an app prefers might differ based on what surface the end-user chose to switch to/from (e.g. whether both or neither have audio).
Thanks, now I understand.
Theoretically speaking - I agree completely. But do we have a concrete example of such an app? Are there any apps that decide whether to use MediaRecorder vs. RTCRtpSender based on whether the user shared a window vs. a screen? I am not aware of such apps, and I'd actually be quite surprised if you could name such an app. All apps I know make the decision - when there even is a decision to be made - before invoking getDisplayMedia(). I think it's important that we solicit actual developer feedback and only introduce complexity that serves genuine needs.
> I've seen no proposal for how an app might specify its preferences for the different surfaces a user might pick up-front, but am happy to compare complexity of anything presented.
I don't think that's relevant. I believe the previous paragraph of my present comment explains why.
> How are they not robust to these changes? Do you have an example that is not experimental?
> Are there any apps that decide whether to use MediaRecorder vs. RTCRtpSender based on whether the user shared a window vs. a screen?
I think there's a misunderstanding. I gave two examples of apps that may need late decision on injection vs new tracks:
I was NOT suggesting a single app might choose between a MediaRecorder or an RTCRtpSender sink. I would indeed struggle to find a concrete example of that. 😉
> ... only introduce complexity that serves genuine needs
By complexity do you mean functionality? The best API matches the complexity of the functionality exposed. We can observe the natural complexity here by separating concerns:
1. Apps want to learn when the user switches source → they register for the `sourceswitch` event.
2. UAs may wish to hold back UX options that might not work → they look for explicit app opt-in through `getDisplayMedia({appAssistedSurfaceSwitching: "include", ...})`.
3. Downstream symptoms might dictate when injection vs. new tracks is preferable, which can differ based on what the user chose → `event.preventDefault()` = don't inject, I'll handle it.
These are mostly orthogonal. I.e. we can imagine apps wanting 1 without 2 or 3, and the UA's concern that apps own the user problem is nicely separated from the app's downstream needs, avoiding the fallacy that injection can't or won't work in many cases still.
This offers the most functionality to webpages, including already-shipped functionality (injection).
Compare this to `DisplaySurfaceChangeCallback`, which ties 1, 2, and 3 together. Forcing apps to opt out of all injection in order to opt in to more UA switching no doubt simplifies UA code, by offering less functionality. But less functionality doesn't seem like a user win.
The web developers I have talked to have all preferred the predictability of having a new track for each captured surface over the convenience of the injection model. I don’t think this should be relegated to a secondary use case with extra hoops to jump through.
So let’s see if we can find a way to make both the switch-track model and the injection model easy and straightforward to use, and also provide some more flexibility in how they are applied.
One option could be to provide both of these track types (surface tracks and session tracks) in parallel:
The API could look something like this:
```js
controller.onnewsource = event => {
  video1.srcObject = event.stream; // surface tracks
};

const sessionStream = await getDisplayMedia({controller, /*opt-in*/ ...});
video2.srcObject = sessionStream;
```
where `video1` would be using the switch-track model and `video2` would be using the injection model. (The `onnewsource` event would be sent for all new surfaces, including the initial one.)
This API has the following benefits: both models are available in parallel without a `preventDefault` mechanism (the event needs no `preventDefault` method). What do you think? Could something like this better cover the different usages of the API that we have been considering?
> where `video1` would be using the switch-track model and `video2` would be using the injection model.
I like this idea of exposing both to the application and letting it use the one it prefers. It seems neutral and would let us measure over time whether apps find injection desirable, while remaining backwards compatible.
With `preventDefault()` I was hung up on the UA needing to stop one or the other right away, but if we don't need that then it simplifies things.
My question would be what are the semantics now of calling `video2.srcObject.getVideoTracks()[0].stop()`? Would it also stop `video1.srcObject.getVideoTracks()[0]` or not (and vice versa)?
Running with 2 for a bit, maybe we just fire `ended` on the other track and call it a special case?
Option 1 makes sense to me; the UA will likely optimize the case of no event handler for `newsource`.
The way I conceptualize these options about whether `stop` should affect just one or both tracks is as follows:

1. Treat the tracks as clones: `stop` would only affect the track on which it is called.
2. Treat them as the same track: `stop` on one of the tracks would then affect both tracks.

If we choose to treat them as clones (option 1), I think that rather than introducing a special case, it’s better to allow the application to choose which tracks to receive through the opt-in, e.g.:
- `surfaceSwitchingMethods: ["inject"]` to only receive session-tracks.
- `surfaceSwitchingMethods: ["replace"]` to only receive surface-tracks.
- `surfaceSwitchingMethods: ["inject", "replace"]` to receive both types of tracks.

That would avoid creating the extra cloned track in the first place for applications that are only interested in either session tracks or surface tracks. It also does not add any extra burden on application writers since they would need to specify an opt-in anyway.
> Option 1 makes sense to me, UA will likely optimize the case of no event handler for `newsource`
I don’t think this optimization would work in the other direction, i.e., for applications that are only interested in surface tracks.
Note I inadvertently wrote "hardware light" among my concerns above, but of course this is screen-capture not camera/mic, so the only user-observable side-effect of an unstopped track would be the prolonged appearance of whatever privacy indicators the browser shows for a couple extra seconds until GC happens (e.g. after a user clicks stop).
> Option 1 makes sense to me, UA will likely optimize the case of no event handler for `newsource`

> I don’t think this optimization would work in the other direction, i.e., for applications that are only interested in surface tracks.
That seems fine, as this optimization would be there to solve today's apps unaware of the `newsource` event.
In contrast, apps uninterested in session tracks can simply stop them once they've received new surface tracks:
```js
const sessionStream = await getDisplayMedia({controller, /*opt-in*/ ...});
video.srcObject = sessionStream;
controller.onnewsource = ({stream}) => {
  video.srcObject.getTracks().forEach(track => track.stop());
  video.srcObject = stream; // surface tracks
};
```
So there doesn't seem to be much need for new stop semantics, which seems nice.
Having to manually stop tracks is just the type of gotchas that I think we should strive hard to avoid when possible. It’s way too easy for a developer to miss, leading to lingering privacy indicators disconcerting users.
In this case the cost to fix the issue is also next to zero for applications that do not need to use both the injection and switch-track model. (I expect this to be the vast majority of applications.)
Compare:
```js
controller.onnewsource = ({stream}) => {
  video.srcObject = stream;
};
await getDisplayMedia({controller, surfaceSwitchingMethods: ["replace"], ...});
```
to
```js
controller.onnewsource = ({stream}) => {
  video.srcObject = stream;
};
const sessionStream = await getDisplayMedia({controller, someOtherOptIn: "include", ...});
sessionStream.getTracks().forEach(track => track.stop());
```
The former is both less code and less error-prone than the latter.
With the optimization @youennf proposed, forgetting `stop()` seems like an existing problem.
Having apps explicitly `stop()` tracks they're done with is the web model today, which makes its side-effects well-established, predictable, and pilot errors easy to diagnose and fix.
I'm not convinced introducing custom stopping-policies into the mix simplifies that responsibility.
> `controller.onnewsource = ({stream}) => { video.srcObject = stream; }; const sessionStream = await getDisplayMedia({controller, someOtherOptIn: "include", ...}); sessionStream.getTracks().forEach(track => track.stop());`

> The former is both less code and less error-prone than the latter.
Ah, I missed earlier you said the event would fire for all new surfaces "including the initial one"! Having apps immediately stop tracks from getDisplayMedia() does look weird indeed.
I like the session vs surface behaviors, but why do web developers need to pick between two types of tracks? This seems to artificially put injection off the table on subsequent switches once non-injection is chosen just once, for no apparent or inherent reason.
I'd like to propose a more fluid model where web developers don't need to care about this on the initial getDisplayMedia call, and every track remains a candidate for injection:
To inject everything (the UA optimizes stopping tracks surfaced in `sourceswitch`):

```js
video.srcObject = await getDisplayMedia({controller, /*opt-in*/ ...});
```
To never inject:

```js
video.srcObject = await getDisplayMedia({controller, /*opt-in*/ ...});
controller.onsourceswitch = ({stream}) => {
  video.srcObject.getTracks().forEach(track => track.stop()); // stop old
  video.srcObject = stream;
};
```
To selectively inject:

```js
video.srcObject = await getDisplayMedia({controller, /*opt-in*/ ...});
controller.onsourceswitch = ({stream}) => {
  if (tracksAreCompatible(video.srcObject, stream)) {
    stream.getTracks().forEach(track => track.stop()); // stop new
  } else {
    video.srcObject.getTracks().forEach(track => track.stop()); // stop old
    video.srcObject = stream;
  }
};
```
This issue had an associated resolution in WebRTC April 23 2024 meeting – 23 April 2024 (Captured Surface Switching):
RESOLUTION: more discussion is needed on the lifecycle of surface tracks
The `onsourceswitch` or `onnewsource` approach seems sufficient to me to support both switch and injection models.
The small feedback I would give is that having these as events might not be great. A callback might be better instead, so that there is only one receiver responsible for dealing with it, for instance closing the new stream/tracks.
Something like `captureController.processSourceSwitch(stream => { ... });` or `captureController.processSourceSwitch(null);`
@jan-ivar:

> With the optimization @youennf proposed, forgetting `stop()` seems like an existing problem.

> Having apps explicitly `stop()` tracks they're done with is the web model today, which makes its side-effects well-established, predictable, and pilot errors easy to diagnose and fix.
It makes sense for the application to be responsible for stopping a track that it has requested, but in this case the UA throws an extra track at the application that the application doesn’t want. It seems wrong to me to force application writers to stop this extra track that they never asked for.
> I'm not convinced introducing custom stopping-policies into the mix simplifies that responsibility.
There is no new custom stopping policy. The surface track is bound to a specific surface, and it ends when the user switches away from that surface, since no more media will be delivered from that surface.
It’s the same behavior as when the user stops the capture of a surface.
> Ah, I missed earlier you said the event would fire for all new surfaces "including the initial one"! Having apps immediately stop tracks from getDisplayMedia() does look weird indeed.

> I like the session vs surface behaviors, but why do web developers need to pick between two types of tracks?
If we consider two classes of applications, capture-interaction applications (which need to reason about the specific captured surface) and capture-agnostic applications (which simply consume whatever is captured):
What I tried to achieve with this proposal was to have tracks bound to individual surfaces for capture-interaction applications while retaining the ease of use of the injection model for capture-agnostic applications. I believe the pure switch-track model is the easiest and least error-prone model for capture-interacting applications.
Overall, I think I’ve seen three different solutions to the stopping problem so far:
I think option 1 and 2 are interesting to explore, while option 3 looks less attractive.
@youennf:

> The `onsourceswitch` or `onnewsource` approach seems sufficient to me to support both switch and injection models.
I think they can be, but I don’t think we have yet found an API-shape that we all agree on, so that’s why I explore other options.
> The small feedback I would give is that having these as events might not be great. A callback might be better instead so that there is only one receiver that is responsible to deal with it, for instance closing the new stream/tracks.

> Something like `captureController.processSourceSwitch(stream => { ... });` or `captureController.processSourceSwitch(null);`
I’m fine with this.
The switch and injection models are roughly equivalent to me for applications that are ok reacting synchronously to a switch change.
When the reaction is asynchronous (say applying region capture), I am not sure either of the presented models is more suited (VideoTrackGenerator to the rescue, maybe).
Wrt options 2 and 3, they are not mutually exclusive with the callback approach:

- We start simple (`captureController.processSourceSwitch(callback)`) and extend the API when we are ready.

Isn't option 2 somehow equivalent to one of these option 2 extensions?

- `captureController.processSourceSwitch(callback, { mode: 'stop-previous-tracks' })`
- `captureController.processSourceSwitch(stream => { ...; return 'stop-previous-tracks'; })` // synchronous decision and video frames flowing
- `captureController.processSourceSwitch(async stream => { await ...; return 'stop-previous-tracks'; })` // asynchronous decision and video frames flowing

@youennf:
> ...
> - We start simple (`captureController.processSourceSwitch(callback)`) and extend the API when we are ready.
>
> Isn't option 2 somehow equivalent to one of these option 2 extensions?
>
> - `captureController.processSourceSwitch(callback, { mode: 'stop-previous-tracks' })`
In the case of requesting surface tracks, it's equivalent to this one.
So, if I understand you correctly, you think we could start with something like the following?
An application can register a callback and specify the stop-previous-tracks-mode:
```js
captureController.processSourceSwitch(callback, { mode: 'stop-previous-tracks' });
getDisplayMedia({captureController, …});
```
And then, when a user selects another surface, the UA will:
This sounds good to me.
I am hoping we can quickly reach consensus on `captureController.processSourceSwitch(callback)` without any option for now. That would allow us to define the model and this method very quickly in the spec/UAs to help web developers.
I am not sure we have reached consensus yet on which options to expose and how to expose them, hence why I am proposing this two-step approach, where we know we can easily go from step 1 (no options) to step 2 (with options). The idea would be to continue step-2 discussions while doing step-1 spec/implementation work.
Sounds like a plan!
I uploaded a PR last year along these lines: https://github.com/w3c/mediacapture-screen-share/pull/289
(I called the method `setDisplaySurfaceChangeCallback`, and I do think it is more of a setter than a process-method, but I'm open to discussing other names.)
Please take a look!
> A callback might be better instead so that there is only one receiver that is responsible to ... closing the new ...tracks.

> It makes sense for the application to be responsible for stopping a track that it has requested, but in this case the UA throws an extra track at the application that the application doesn’t want. It seems wrong to me to force application writers to stop this extra track that they never asked for.
Forgetting stop seems a problem in all the proposals.
What if the UA stopped tracks synchronously after the callback/event-handler, requiring JS that wants to use a track to clone it?
@youennf is the stop-problem the only issue driving you to prefer callbacks over events?
My thinking is a `sourceswitch` event that fires whenever the user switches source (with no requirement to stop tracks) might be useful even to capture-agnostic applications. E.g. to disambiguate `configurationchange` events fired on its tracks.
> Something like `captureController.processSourceSwitch(stream => { ... });` or `captureController.processSourceSwitch(null);`
>
> I’m fine with this.
Note the session vs surface tracks distinction won't hold here. E.g.
```js
video.srcObject = (await new Promise(r => controller.processSourceSwitch(r))).stream;
controller.processSourceSwitch(null);
// the tracks in video.srcObject are now surface tracks yet subject to injection
```
How is the surface/session distinction meaningful to web developers?
While writing the above code example, I found no way to await injection, which felt frustrating. Contrast with:
```js
video.srcObject = (await new Promise(r => controller.onsourceswitch = r)).stream.clone();
// the tracks in video.srcObject are now surface tracks
await new Promise(r => controller.onsourceswitch = r);
// the tracks in video.srcObject have been injected
```
> My thinking is a `sourceswitch` event that fires whenever the user switches source (with no requirement to stop tracks)
The callback approach allows this as well. I was not clear about it previously in this thread (sorry about that): setting the callback would not be a signal for the UA to go to the switch mode and stop the previous tracks.
Instead, we stick with the injection model for old tracks. The web page can stop the old tracks anyway. I am ok adding an option so that the web page tells the UA to stop the tracks (hence the various proposals I made on top of the callback). We need though language that instructs that media is not flowing in the old tracks until the callback is executed.
Having a callback to deliver the stream is better since there is one place where you decide what to do with the new tracks (clone it, stop it...). And the spec can be made clear that MediaStreamTracks are not created if the callback is not set. This is more difficult with events. And I do not really see a case for multiple event listeners for this switch case (web devs already have configuration change anyway).
> The web page can stop the old tracks anyway
What enforces that the website can't keep both the old live injected track and the live new track? We need to specify this implicit action at a distance.
> Having a callback to deliver the stream is better since there is one place where you decide what to do with the new tracks (clone it, stop it...)
If this means there's one place where you decide what happens with the old tracks (enforced by the aforementioned action at a distance), then I agree that might be a good reason for a callback.
Can we make it a settable attribute at least?
> What enforces that the website can't keep both the old live injected track and the live new track? We need to specify this implicit action at a distance.
I do not see any implicit action at a distance; the website can keep both. Is there an issue with that?
> If this means there's one place where you decide what happens with the old tracks
In my mind, the default behavior (whether setting the callback or not) is that no track is being stopped by UA, the web page can deal with it by itself.
We can enrich the callback to make the UA stop the previous tracks, for instance:

- `captureController.setDisplaySurfaceChangeCallback(stream => { ... }, { mode: 'stop-previous-tracks' })`
- `captureController.setDisplaySurfaceChangeCallback(stream => { ...; return 'stop-previous-tracks'; })`
- `captureController.setDisplaySurfaceChangeCallback(stream => { ...; captureController.stopPreviousTracks(); ... })`
- `captureController.setDisplaySurfaceChangeCallback(async stream => { await ...; captureController.stopPreviousTracks(); ... });`
It is a bit less straightforward to extend things with an event. And again, it is not really compelling to have several event listeners sharing the responsibility to stop the old tracks (or the new tracks).
Also, I could see a UA tailoring its UI based on the callback being registered (not showing the sharing audio check box if no audio was shared before the surface switching for instance). This is sort of similar to MediaSession going with callbacks as action handlers.
> Can we make it a settable attribute at least?
Ah, good point, I guess this would disallow option 1 above.
(for the record - this was discussed in the joint SCCWG/WebRTC meeting last week)
> And I do not really see a case for multiple event listeners for this switch case
A singular callback assumes a single downstream consumer. An app may have multiple consumers of an active screen-capture, e.g. a transmitter, a preview, and a recorder, each with distinct downstream needs.
Tracks can be cloned, but a CaptureController cannot. So this becomes a choke point. We don't want different parts of an app competing to set the same callback and overwrite each other.
The web platform tries hard to avoid single-consumer APIs. See § 7.3. Don’t invent your own event listener-like infrastructure, and requestVideoFrameCallback.
I think we need a good reason to deviate from these principles.
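The overwrite hazard with a single settable callback, versus listener-style registration, can be sketched with a stand-in controller (not the real `CaptureController`, whose API shape is exactly what is under debate here):

```js
// Toy controller contrasting the two registration styles discussed:
// a single settable callback (second consumer silently overwrites the
// first) vs. addEventListener-style registration (all consumers fire).
class MockController {
  #listeners = [];
  onswitch = null;                                      // callback model
  addSwitchListener(fn) { this.#listeners.push(fn); }   // event model
  fireSwitch() {
    if (this.onswitch) this.onswitch();
    for (const fn of this.#listeners) fn();
  }
}
```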
> (web devs already have configuration change anyway)
Those are per-track and cannot tell you whether the source changed or e.g. was just resized.
> And the spec can be made clear that MediaStreamTracks are not created if the callback is not set. This is more difficult with events.
This seems like a marginal optimization compared to such a significant user action.
> This is sort of similar to MediaSession going with callbacks as action handlers.
That's a fairly recent API with its own flaws. But it has a good reason: Many of its actions rely on the website to maintain a singular state. What's our reason?
> Also, I could see a UA tailoring its UI based on the callback being registered (not showing the sharing audio check box if no audio was shared before the surface switching for instance).
We've gone around a few times on this point. Yes the absence of a callback might preclude the app handling audio, but the presence of a callback does not guarantee it.
But § 7.3. specifically mentions this point: for "an API which allows authors to start ... a process which generates notifications, use the existing event infrastructure"
> Those are per-track and cannot tell you whether the source changed or e.g. was just resized.

Just check for `deviceId` in `track.getSettings()`, no need for using source switch.
Source switch is about deciding whether to use the old tracks or the new tracks.
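As a sketch of the heuristic being suggested, assuming `deviceId` were exposed in `getSettings()` (whether it is exposed is disputed later in this thread), an app could classify a `configurationchange` like this; `classifyChange` and its inputs are hypothetical:

```js
// Hypothetical classifier: given the previous and current results of
// track.getSettings(), decide whether a configurationchange was a
// source switch (deviceId changed) or just a resize.
function classifyChange(prev, next) {
  if (prev.deviceId !== next.deviceId) return 'source-switch';
  if (prev.width !== next.width || prev.height !== next.height) return 'resize';
  return 'other';
}
```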
> That's a fairly recent API with its own flaws.
I don't see how this particular flaw applies here.
MediaStreamTrack events are where you distribute the info. CaptureController is a single place for mission critical information (running the callback may actually trigger a freeze of video frame generation).
Discussed at editor's meeting and we will try to converge via a design document.
Another proposal to consider, maybe it could help converge:

- Use the `configurationchange` event for apps to decide whether to continue processing or not. This means that when there is a source switch, video frame sending to sinks will be suspended until `configurationchange` event listeners are called (on a per-track basis). This ensures the injection model works.

> Those are per-track and cannot tell you whether the source changed or e.g. was just resized.

> Just check for `deviceId` in `track.getSettings()`
The spec says: "deviceIds are not exposed." It's not listed in § 5.4 Constrainable Properties for Captured Display Surfaces.
> CaptureController is a single place for mission critical information (running the callback may actually trigger a freeze of video frame generation).
Why is that critical? This is the kind of action at a (maybe not so much) distance we should document. This might justify a callback.
> The spec says: "deviceIds are not exposed."
Chrome and Safari are exposing deviceIds.
Wrt callback vs. event, let's rediscuss this when we know what signals we want to expose. @tovepet is planning to create a design document we can all participate in to try reaching consensus on the underlying model.
> Chrome and Safari are exposing deviceIds.
I've filed crbug 372252497 and webkit 281077.
Agree callback vs. event seems secondary.
The main question seems to be over allowing late decision vs. limiting injection to tracks returned from getDisplayMedia().
What's the benefit of exposing surface tracks rather than new session tracks in the callback/event?
> What's the benefit of exposing surface tracks rather than new session tracks in the callback/event?
I am a bit confused about the purpose of "new session tracks". Who would need them? The entire idea of a "session track" is that it follows the session wherever it goes, whatever the captured surface is, across user-initiated changes. If a developer needs multiple such session tracks, can't they just clone the original ones?
> I've filed crbug 372252497 and webkit 281077.
It seems useful information to provide; why not update the spec instead?
Thinking a bit more, I am not sure the separation between session tracks and surface tracks is helping shape this API.
Let's look at the following two scenarios:
User agent U1 is exposing a switch surface UX and user clicks on it. User is expecting that the new surface content will be rendered where the past track content was rendered. It seems reasonable that the same track exposes the media content and that no new track is exposed: session track model seems well suited.
User agent U2 is exposing an add/remove surface UX. User first adds a surface B and then removes the original surface A. I would think user is expecting the app to react to the new track somehow by using a new video element for rendering. This seems inline with the user agent exposing a new track for B and ending the original track A: surface track model seems well suited.
Given this, and given UX in that area is relatively new, I am not sure we can design an API that specifies a particular flow. Having an API that exposes new tracks and having a requirement that video frames of the switching surface do not get provided to sinks until some event/callback actually runs might be good enough for now. Plus some guidelines...
That said, AFAIK, the only thing UAs are doing right now is scenario 1 above. And this is what the initial message of this issue is describing.
Based on this, I would suggest trying to fix this and only this for now by specifying the requirement that video frames of the switching surface do not get provided to sinks until some event/callback actually runs. I would tend to use the `configurationchange`/`deviceId` combo for that as it does not require new API surface.
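As a toy model of that requirement (not platform API; the gating would live inside the UA), frames from the switched source could be held back until the app's handler has run:

```js
// Toy frame gate: delivery to the sink is suspended while the app's
// switch handler runs, and queued frames are flushed afterwards.
class FrameGate {
  #open = true;
  #queue = [];
  #sink;
  constructor(sink) { this.#sink = sink; }
  sourceSwitched(onSwitch) {
    this.#open = false;                               // suspend delivery
    onSwitch();                                       // app reacts (keep/stop tracks)
    this.#open = true;                                // resume
    for (const f of this.#queue.splice(0)) this.#sink(f);
  }
  deliver(frame) {
    if (this.#open) this.#sink(frame);
    else this.#queue.push(frame);
  }
}
```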
I have created a design document to provide for a more structured discussion of the different proposals (view-only): https://docs.google.com/document/d/16CUOJeuXimNPi4kZHOS9rF-WhMuVvOqOg9P--Dvqi_w/edit?usp=sharing
Edit permissions will be granted to members of the WebRTC working group upon request.
> What's the benefit of exposing surface tracks rather than new session tracks in the callback/event?
>
> I am a bit confused about the purpose of "new session tracks". Who would need them?
To avoid confusion, I've defined a hybrid track to clarify what I mean. But it's really the surface track I'm questioning. [Edit 2: undid my edit to capture subtle differences]
I've written up the model I have in mind as the late decision model. PTAL (edit: links fixed)
Jan-Ivar, I am unsure what your current position is, given the edits. Do you withdraw your question about "new session tracks"? My position is that there is no benefit to including new objects in a fired event, if these objects are identical to objects we had before. (That is - new session tracks are identical to the originals, and so "new" ones are useless.)
Sorry for editing multiple times. Calling mine a hybrid track now, to distinguish it. It wasn't clear from the session track definition that its feature set would be limited, which would be backwards incompatible.
Do I understand correctly a driving goal of the session/surface split is to maintain the subclassing of MST?
> It wasn't clear from the session track definition that its feature set would be limited, which would be backwards incompatible.
That's one vision of it, out of two - either (1) a normal, fully-fledged MediaStreamTrack, or (2) a reduced-feature-set MediaStreamTrack. But which is used is a secondary matter.
The core offering of a session track, as I understand @tovepet here, is that it addresses your (Jan-Ivar's) expressed desire, to be able to seamlessly transition between two models (injection, switch-track).
I believe it is objectively true that Tove's model is more flexible.
I am a bit lost in what we are trying to solve here. Can we add a scope/use case section to the design document?
> I am a bit lost in what we are trying to solve here. Can we add a scope/use case section to the design document?
We should also rely a bit more on an exploration of the use cases, which I see only includes a single use case atm. I have taken the liberty of adding 4 more.
I have added a Scope section with the following bullets that I believe we want to solve in the first step:
Is this in line with what the rest of you think?
Both Chrome and Safari now allow the user to change what surface is captured.
That's obviously great stuff. Can we make it better still?
So I propose that we add two things:

1. An event that fires when the user switches the captured surface.
2. A pausing behavior, where the UA stops delivering new frames until the app sets `enabled` back to true.

Possibly the two can be combined, by specifying that setting an event handler signals that the pausing behavior is desired (@alvestrand's idea).
Another natural extension of this idea is to also apply it when a captured tab undergoes cross-origin navigation of the top-level document. When that happens, some applications might wish to stop capture momentarily and (in-content) prompt the user - "do you still want to keep sharing?"
Relevant previous discussion here.