Garbage collection of live tracks

eladalon1983 commented 2 years ago

Consider this snippet:

async function ClumsyFunc(constraints) {
  const stream = await navigator.mediaDevices.getUserMedia(constraints);
  if (SomeCondition()) {
    HandOff(stream);
  }
  // Oops! Dropped last reference on the floor. What a mess...
}

What happens now? Should the stream and its tracks be eligible for garbage collection? If so, there are user-facing side-effects.

Camera/mic hardware indicator goes off.
User agent indicators disappear.
OS-level indicators disappear.

Note that getDisplayMedia has similar issues. In some implementations (Chromium), an application could even heuristically observed GC this way, because tab-sharing adds an infobar that affects the viewport's size, and when it disappears the viewport changes back.

So should GC of live tracks be disallowed? Allowed? Encouraged?

Arguments for allowing GC:

Reduce CPU and battery consumption.
Free up the hardware, for use by other apps with a different configuration.
Inform the user of reality - nobody is reading their mic/camera/screen anymore.

Argument for disallowing GC:

The app should have stopped the track first. Relying on GC creates an unpredictable user experience. By suppressing GC, we'll expose the bug and motivate the app devs to fix it, which is better for users - our top constituency.
Garbage-collecting live tracks allows apps to heuristically observe GC (see above).

youennf commented 2 years ago

UAs probably GC tracks, it makes sense to me to keep it that way. It would be nice to signal to web developers that they should stop tracks explicitly, particularly capture tracks. We could do two things:

Adding console log warning when GCing a capture track
Note in the spec to web developers that they should stop capture tracks explicitly instead of relying on GC. See for instance what web codes is doing: NOTE: Authors are encouraged to call close() on output VideoFrames immediately when frames are no longer needed. The underlying media resources are owned by the VideoDecoder and failing to release them (or waiting for garbage collection) can cause decoding to stall.

eladalon1983 commented 2 years ago

I am not opposed to adding a spec note, but I don't think it would be impactful enough for us to declare "mission accomplished."

Is there anyone knowledgeable about GC that we could pull in to voice expert opinion?

jan-ivar commented 1 year ago

GC of live tracks is allowed, and should not be disallowed IMHO. Blocking cleanup of resources on JS calling an API correctly is something we actively work to avoid. We've removed APIs in the past over this https://github.com/w3c/mediacapture-main/issues/404

If individual user agents are leaking GC detection, that's an implementation bug IMHO, and up to that user agent to mitigate.

eladalon1983 commented 1 year ago

If individual user agents are leaking GC detection, that's an implementation bug IMHO, and up to that user agent to mitigate.

Is there any user agent, real or theoretical, that could avoid leaking GC to the user? Per the spec, the user agent is supposed to have persistent privacy indicators that the user can easily observe.

youennf commented 1 year ago

GC of live tracks is allowed, and should not be disallowed IMHO.

Other specs tend to provide rules like:

If a track is live and there is a registered ended event listener, the track should not be GCed.
If a live track is GCed, the track source should be notified that the track has ended. The spec is probably underspecified here and it would be good to add such rules.

the user agent is supposed to have persistent privacy indicators that the user can easily observe

I think we do not want to allow the web page to observe GC. On the other hand, the user is not really a concern here so privacy indicators are probably out of scope of GC analysis.

jan-ivar commented 1 year ago

Is there any user agent, real or theoretical, that could avoid leaking GC to the user?

Exposing GC to a web page is a well-known concern, but exposing it to the end-user isn't AFAIK.

eladalon1983 commented 1 year ago

I've experienced tracks stopping on me at unexpected moments due to GC in toy apps I'd written to test new API surfaces I'd implemented. It felt disconcerting. In my dev setup, I could of course easily fix it by storing the reference to the track in a global variable I called keepAlive; that's beside the point. My point is that the momentary jolt of surprise helped me channel a hypothetical real user. I suspect that for a end user interacting with a real app, this would seem very strange and disconcerting, on two accounts.

First, that the mic/camera/screen-capture session persisted for longer than expected.
Second, that it came to an abrupt end. The user is likely to blame whatever arbitrary action they took last. Or he'd think something is buggy, but not know what; the app, the browser, or even the hardware.

jan-ivar commented 1 year ago

That the camera stops and its light goes away when it is no longer used, is a feature of the browser, and a bug in the app. it seems prudent to free up resources and to alert the user they are no longer being observed.

I'd argue that what's disconcerting to users (as opposed to web developers) is not that tracks end, but that camera lights stay on longer than expected. https://github.com/GoogleChrome/developer.chrome.com/issues/4894 seems to confirm this.

Marking this as editorial to add the note mentioned in https://github.com/w3c/mediacapture-main/issues/910#issuecomment-1285733588. If anyone disagrees, please comment.

eladalon1983 commented 1 year ago

That the camera stops and its light goes away when it is no longer used, is a feature of the browser, and a bug in the app.

This seems to agree with my earlier statement that: "By suppressing GC, we'll expose the bug and motivate the app devs to fix it, which is better for users - our top constituency."

(Emphasis mine on both quotes.)

What we seem to disagree on is the relative value of motivating an earlier fix vs. freeing up resources.

I'd argue that what's disconcerting to users (as opposed to web developers) is not that tracks end, but that camera lights stay on longer than expected.

We agree here too.

If anyone disagrees, please comment.

I disagree. This is issue is proposing a normative change - it is not editorial.

To stress again an argument I might have failed to highlight enough before - GC is unpredictable behavior, and users are (rightfully) distressed by unpredictable behavior. GC should not be exposed to the user through its side-effects.

bradisbell commented 1 year ago

A clarifying question...

In the original code snippet, as long as HandOff(stream) is doing something with that stream that created a reference to it, or that stream's child objects such as tracks, stream would not be GC'ed in any circumstance today. Is that correct? If that is accurate, then I'd agree with @jan-ivar on this. If the application has no way to actually use the stream anyway, as all references to it are gone, it makes sense to stop the stream and free up the capture resource.

However, if stream is at risk of being GC'ed simply because its first reference here is gone, then this would be a problem. We used to experience bugs like this with the Web Audio API, where the AudioContext would disappear unless we tacked it to window or some other global, which was unexpected. In those cases, we often had an AudioContext that had associated nodes. The AudioContext itself wasn't referenced, but nodes attached to it were. I worry we'd have a problem like this again, but with MediaStream if we GC it even though it has other references or usages.

eladalon1983 commented 1 year ago

I don't think that's the point, @bradisbell. We all agree that it's a bug in the application, and that the stream and track are eligible to be garbage collected. My argument here is that we have good reasons to consider specifying that GC should be suppressed in this particular case - please see my previous comments for the rationale.

youennf commented 1 year ago

This seems to agree with my earlier statement that: "By suppressing GC, we'll expose the bug and motivate the app devs to fix it, which is better for users - our top constituency."

We do care a lot about the user using the UA at the time the track may be GCed. For that specific user, releasing the camera sooner is a good thing.

We also care about hypothetical future users, or web developers testing their apps, but a little bit less that the previous user. Also, web developers had plenty of time to get used to that particular safety belt, removing it now might cause some harm for end users.

eladalon1983 commented 1 year ago

Note that the camera, although "in use", sends frames into the void, and is not hurting the user's privacy. Possibly it's detrimental to their battery. My subjective opinion is that unpredictability (drawback) offsets battery (gain).

jan-ivar commented 1 year ago

eladalon1983 commented 1 year ago

Ah, the classic meme of man concerned by unpredictable GC.jpg So clearly we're in agreement, and you intend to send a PR for me to review by EoW. Glad to hear it.

Either way - the issue is not editorial, so I am removing that label. (And it wouldn't be editorial even if we end up closing the issue with no action.)

guidou commented 1 year ago

I also agree that we shouldn't disallow GC of a live track that is not referenced anywhere. So far, the argument seems to be that buggy applications behave in an unpredictable way (the unpredictability being that an unused camera/mic closes at an arbitrary point in time when GC runs). I don't see any problem with that. I think leaving the camera/microphone ON indefinitely is neither desirable nor expected by users.

eladalon1983 commented 1 year ago

@jan-ivar has often complained about Chrome not (yet) implementing the requirement of transient activation for getDisplayMedia(). IIRC, he mentioned that some Web developers only test their applications on a few browsers, and then those applications break on other browsers; specifically, such applications broke on Firefox, which does require transient activation for gDM, as per the spec.

The same reasoning applies here. If a hypothetical browser performs GC once a second, Web developers will not realize their bug in letting go of tracks without stopping them, and the app-level bug they produce will only manifest on other browsers.

guidou commented 1 year ago

I don't think it's the same reasoning. The getDisplayMedia issue described in the example is a problem caused by a buggy browser, not a buggy application. The real solution is to fix the browser. In the GC case, the problem is caused by a buggy application. The solution is to fix the application.

eladalon1983 commented 1 year ago

There are similarities and there are differences. It is the "same reasoning" if you compare the right metric. Here I am referring to Jan-Ivar's concerns that developers only check on some browsers and their apps break on other browsers. The case where apps break on Firefox because the getDisplayMedia implementation is different from Chrome is instructive. It teaches us that if Web developers were to test their app on a hypothetical browser1 that always garbage-collects promptly, they'd not discover that they neglected to stop the tracks, and that would behave undesirably only on browser2. That would create an issue similar to that which gDM-without-user-gesture causes for Firefox.

w3c / mediacapture-main

Garbage collection of live tracks #910