w3c / mediacapture-screen-share-extensions

[Capture control] Is gesture forwarding tied to capture controller or to MediaStreamTrack #20

Open youennf opened 1 week ago

youennf commented 1 week ago

CaptureController lives where capture was initiated. MediaStreamTrack on the other hand can be transferred and lives where it is being rendered. This makes it potentially possible for CaptureController and the getDisplayMedia track to live in different contexts.

Given gesture forwarding is tied to the track's preview, it seems it is more tied to MediaStreamTrack/HTMLVideoElement than CaptureController.

The question is then whether the API should be tied to CaptureController or to HTML elements/MediaStreamTrack. I would tend to favour the latter.

eladalon1983 commented 1 week ago

I think the right mental model is that we are controlling the captured surface, and CaptureController is the proxy for that concept (in all APIs we introduce), whereas MediaStreamTrack is just a handle to get frames (similarly). Those frames might not even be coming directly from the captured surface; they might be going through some transformation first, such as getting annotated, cropped or adjusted for better contrast.

Is there genuine Web developer interest in displaying the video element somewhere other than in the document that first called getDisplayMedia()? I am not aware of such a need, so I'd rather not design for it. (Unless the current design actively prevented such later extensions, of course. I don't think this is the case, though.)

jan-ivar commented 14 hours ago

My mental model is this is about enabling user controls, not app controls. This suggests it might be logical to put the API on the DOM objects the user interacts with.

This is why I find @youennf's API in https://github.com/w3c/mediacapture-screen-share-extensions/issues/13#issuecomment-2427791931 appealing. E.g.:

videoElement.enableGestureForwarding = true;
div.enableGestureForwarding = true; // if on top
canvas.enableGestureForwarding = true; // recently drawn to with video

MediaStreamTracks can be cloned and transferred to workers, where gesture forwarding doesn't make sense, so I don't think that's the right place.

OTOH, CaptureController.forwardWheel(x) only supports one x, and x = null is how to stop forwarding (a bit surprising that).

It may be uncommon to have two preview elements, but if a website wants it, as a user I'd expect to be able to scroll both.

This to me suggests an element API.
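For readers following along, the single-target behaviour of CaptureController.forwardWheel(x) mentioned above can be sketched with a stand-in class. `MockCaptureController`, `isForwarding` and `demoSingleTarget` are hypothetical names introduced only for illustration; the sketch assumes that a later call replaces the earlier target and that `null` stops forwarding, and it uses plain objects in place of DOM elements. It is not the proposed API or any browser's implementation.

```javascript
// Hypothetical stand-in, NOT the real CaptureController. It only models
// "one forwarding target at a time, null stops forwarding".
class MockCaptureController {
  #target = null;

  // Async to mirror the promise-returning shape discussed in this thread.
  async forwardWheel(element) {
    // Assumption: a new call replaces the previous target; null clears it.
    this.#target = element;
  }

  // Would a wheel event over `element` be forwarded to the captured surface?
  isForwarding(element) {
    return this.#target !== null && this.#target === element;
  }
}

async function demoSingleTarget() {
  const controller = new MockCaptureController();
  const previewA = { id: "previewA" }; // plain objects stand in for elements
  const previewB = { id: "previewB" };

  await controller.forwardWheel(previewA);
  await controller.forwardWheel(previewB); // replaces previewA
  const afterSecondCall = [
    controller.isForwarding(previewA),
    controller.isForwarding(previewB),
  ];

  await controller.forwardWheel(null); // stops forwarding entirely
  const afterNull = [
    controller.isForwarding(previewA),
    controller.isForwarding(previewB),
  ];

  return { afterSecondCall, afterNull };
}
```

Under these assumptions only the most recently registered preview forwards wheel events, which is the limitation the two-preview scenario above runs into.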

eladalon1983 commented 2 hours ago
div.enableGestureForwarding = true; // if on top

The reasons to use an async API have been previously presented here, and brought up in multiple other threads (example).

Putting the async question aside for a moment - assume for the sake of argument that we reshape this proposal to be div.setGestureForwarding() and it returns a promise - I'd still oppose this proposal, because it makes unnecessary and unhelpful assumptions about the target element:

  1. It assumes a single video element with a single capture. (If not, which one is being forwarded?)
  2. It assumes that the element is the owner of the video element. (At first glance, appears to provide security guarantees; in practice, does not.)

I also think it's a poor choice to expose anything so capture-specific on HTMLElement, or on anything similarly high-level. I don't think this is good API design.

OTOH, CaptureController.forwardWheel(x) only supports one x,

We have discussed the possibility of CaptureController.forwardGestures(element, gesturesDict), which would have allowed forwarding from multiple elements. But thinking of this some more, IMHO, it is preferable to only allow forwarding from a single element, unless Web developers indicate a clear use that benefits users. The cost to implementers, the (slight) increase in API complexity for developers, and the (very slight) risk of abuse, all require something to counterbalance them; but so far, such a need has not been articulated.
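As a rough illustration of what the multi-element shape could look like, here is a sketch built on a stand-in class. Only the method name `forwardGestures(element, gesturesDict)` comes from the discussion; `MockMultiTargetController`, `isForwarding`, `demoMultiTarget`, the `wheel` dictionary key, and the rule that a dictionary with no enabled gesture removes the element are all assumptions made for the example, not anything that has been specified.

```javascript
// Hypothetical stand-in, NOT a specified API. It models a controller that
// can forward gestures from several elements at once.
class MockMultiTargetController {
  #targets = new Map(); // element -> gestures dictionary

  async forwardGestures(element, gesturesDict) {
    // Assumption: a dictionary with no enabled gesture removes the element.
    const enabled = Object.values(gesturesDict ?? {}).some(Boolean);
    if (enabled) {
      this.#targets.set(element, { ...gesturesDict });
    } else {
      this.#targets.delete(element);
    }
  }

  // Is `gesture` currently forwarded from `element`?
  isForwarding(element, gesture) {
    return Boolean(this.#targets.get(element)?.[gesture]);
  }
}

async function demoMultiTarget() {
  const controller = new MockMultiTargetController();
  const previewA = { id: "previewA" }; // plain objects stand in for elements
  const previewB = { id: "previewB" };

  await controller.forwardGestures(previewA, { wheel: true });
  await controller.forwardGestures(previewB, { wheel: true });
  const both = [
    controller.isForwarding(previewA, "wheel"),
    controller.isForwarding(previewB, "wheel"),
  ];

  await controller.forwardGestures(previewA, { wheel: false });
  const onlyB = [
    controller.isForwarding(previewA, "wheel"),
    controller.isForwarding(previewB, "wheel"),
  ];

  return { both, onlyB };
}
```

The extra per-element bookkeeping in the sketch is a small-scale picture of the added complexity weighed in the paragraph above.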

It may be uncommon to have two preview elements, but if a website wants it, as a user I'd expect to be able to scroll both. [Emphasis mine - Elad]

If.

and x = null is how to stop forwarding (a bit surprising that).

Why is that surprising? To name one precedent - mst.cropTo(null).