w3c / mediacapture-surface-control

Web API allowing capturing applications limited control over captured surfaces.
https://w3c.github.io/mediacapture-surface-control/

Consider dropping permission and making zoomLevel an attribute #27

Closed by jan-ivar 3 weeks ago

jan-ivar commented 4 weeks ago

Something like this:

  const zoomLevels = CaptureController.getZoomLevels();
  const controller = new CaptureController();
  const zoomIncreaseButton = document.getElementById('zoomIncreaseButton');
  zoomIncreaseButton.onclick = () => {
    // Step to the next supported level, clamping at the maximum.
    const index = zoomLevels.indexOf(controller.zoomLevel);
    const newZoomLevel = zoomLevels[Math.min(index + 1, zoomLevels.length - 1)];
    controller.zoomLevel = newZoomLevel;
  };

The webidl right now says:

  long getZoomLevel();
  Promise<undefined> setZoomLevel(long zoomLevel);

The reason seems to be the setZoomLevel algorithm includes requesting permission, which should be outdated with https://github.com/screen-share/captured-surface-control/issues/14.

eladalon1983 commented 4 weeks ago

The reason seems to be the setZoomLevel algorithm includes requesting permission, which should be outdated with https://github.com/screen-share/captured-surface-control/issues/14.

It is not clear to me that w3c/mediacapture-surface-control#48 could obviate the need for a permission prompt. For simplicity, suppose we only supported 'click' events on an HTMLButtonElement. There is still no way for the user agent to know what the label of that button was, and how the user understood that label.

jan-ivar commented 4 weeks ago

For simplicity, suppose we only supported 'click' events on an HTMLButtonElement. There is still no way for the user agent to know what the label of that button was, and how the user understood that label.

That seems fine to me. A lying button the presenter can click or wheel over to mess up their captured tab's zoom level seems more like a bug than a threat vector. As long as the user is driving, it addresses my remote control concern.

If there are other threat vectors to consider please list them under § 6. Privacy and Security Considerations. That's where I'd expect to find the rationale for requiring permission.

Screen-capture is already behind permission, and might suffice here. The risk/reward of a separate prompt seems low. @youennf WDYT?

eladalon1983 commented 4 weeks ago

The user might simply be unaware that the Web platform now has this capability, and the capturing website might do a poor job of informing the user. The permission prompt dispels those concerns for me.

However, if Firefox does not want a permission prompt in this case...

I don't see an interop risk here if we go that way. Suppose Chrome has a permission prompt that Firefox skips - no app would break. So all of these approaches seem safe to me, while allowing us the flexibility to remove the prompt at a later time without breaking interop or backwards compatibility with earlier versions of the browser.

jan-ivar commented 3 weeks ago

The user might simply be unaware that the Web platform now has this capability,

Using permission prompts to inform users of new functionality seems overkill.

Let's look to the guidelines for help.

§ 2.10. Require user activation for powerful APIs says "user activation ... is not always sufficient to protect users from invasive behaviours, and seeking meaningful consent is also important."

"not always" = sometimes. So there's a chance we're good, since we implement something even stronger than consuming activation here. The question is:

Is meaningful consent required here? § 1.4. Ask users for meaningful consent says: "If a useful feature has the potential to cause harm to users, ... make sure ... they can refuse consent effectively."

Do we feel that buttons that (when interacted with) can scroll down or zoom a captured tab all the way out reach a level of harm? Possibly, since this might reveal more webpage information than the user expected.

But it also says: "If a feature is powerful enough to require user consent, but it’s impossible to explain to a typical user what they are consenting to, that’s a signal that you may need to reconsider the design of the feature."

I think we should work on mitigating these risks directly. For example: limiting the number of zoom steps per click, or initial scroll speed. Then we won't need a separate prompt.
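To make the step-limit idea concrete, here is a minimal sketch of how a request could be clamped to a fixed number of zoom steps per interaction. The helper name `clampZoomStep`, the `maxSteps` parameter, and the sample level values are all hypothetical; nothing like this appears in the spec, and a user agent could apply such a limit in many other ways.

```javascript
// Hypothetical helper, not part of the spec's IDL.
// Moves at most `maxSteps` entries through the sorted list of supported
// zoom levels, in the direction of the requested level.
function clampZoomStep(zoomLevels, current, requested, maxSteps = 1) {
  const from = zoomLevels.indexOf(current);
  const to = zoomLevels.indexOf(requested);
  if (from === -1 || to === -1) {
    throw new RangeError('Unsupported zoom level');
  }
  // Limit the index delta to the range [-maxSteps, +maxSteps].
  const delta = Math.max(-maxSteps, Math.min(maxSteps, to - from));
  return zoomLevels[from + delta];
}

const levels = [25, 50, 75, 100, 150, 200]; // illustrative values only
console.log(clampZoomStep(levels, 100, 25)); // 75: one step down, not all the way out
```

With a limit like this, a single click asking for the minimum zoom only moves one level, so zooming all the way out takes repeated user interactions.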

I also don't think this level of permission granularity is helpful. When would this prompt appear? When I click a button to zoom out? Maybe that's the right decision for what I'm sharing today, but not tomorrow? A better design here seems a checkbox in the existing screen-sharing picker that the UA is free to remember between sessions or not, and then simply add prose to allow the UA to reject with NotAllowedError for this capture.
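From the application's side, a rejection with NotAllowedError could be handled roughly as follows. This is only a sketch: `DecliningController` is a hypothetical stand-in for a CaptureController whose user agent has declined surface control for this capture, and `tryZoom` is an illustrative wrapper, not a proposed API.

```javascript
// Hypothetical mock: a controller whose user agent rejects surface control
// for the current capture.
class DecliningController {
  async setZoomLevel(zoomLevel) {
    throw new DOMException(
      'Surface control is not allowed for this capture',
      'NotAllowedError'
    );
  }
}

// Illustrative wrapper: reports whether the zoom request was accepted.
async function tryZoom(controller, zoomLevel) {
  try {
    await controller.setZoomLevel(zoomLevel);
    return 'applied';
  } catch (e) {
    if (e.name === 'NotAllowedError') {
      return 'not-allowed'; // e.g. hide or disable the zoom buttons
    }
    throw e; // anything else is unexpected
  }
}

tryZoom(new DecliningController(), 75).then((result) => console.log(result));
```

The application needs no permission query up front; it finds out that control is unavailable when the first request rejects.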

I see no need to pull in the full permission machinery here with delegated permissions, query and the like.

eladalon1983 commented 3 weeks ago

But it also says: "If a feature is powerful enough to require user consent, but it’s impossible to explain to a typical user what they are consenting to, that’s a signal that you may need to reconsider the design of the feature."

The problem I see is that the typical user already understands screen-sharing, and they understand it to mean something quite specific. We would be retroactively expanding this consent from "share visible pixels" to "share potentially any pixels." We believe that this is fine, but it doesn't hurt to be conservative.

An initial implementation with a prompt, and then an inspection of the prompt's grant-rates and other adjacent metrics, would help us make an informed decision here.

(Maybe Chrome will even be able to share findings here with other browsers? I don't have the authority to approve it, but I will be requesting to share this info when the time comes.)

I think we should work on mitigating these risks directly. For example: limiting the number of zoom steps per click, or initial scroll speed. Then we won't need a separate prompt.

These would-be mitigations limit legitimate use without providing any security benefits, afaict. I am perfectly fine calling out that user agents MAY employ them, but I would not mandate them, and I would not rely on them.

A better design here seems a checkbox in the existing screen-sharing picker

We do not plan any more checkboxes in the screen-sharing picker. They have negative value, and the more we have - the worse it gets. When they are default-checked, they don't indicate any real user consent; when default-unchecked, users simply don't find them. This has been our position on such proposals for quite some time, and I don't foresee it changing, as we have not received new information.

I see no need to pull in the full permission machinery here with delegated permissions, query and the like.

User agents can infer permission from any heuristic they choose, including auto-allowing it for all captures. And if you want this explicitly called out in the spec, I'm happy to do so.


Long-term, I would love to drop the prompt. But it's premature. It risks discovering empirical data that would require a breaking change to the API to reintroduce permission prompts. Let's start out with an API shape that allows a prompt; it would be trivial to remove this later.


Another argument, by the way... It's not just about malicious applications. It's also about a user feeling confidence, privacy and control. When I use the Web platform, I want to know that I am interacting with limited applications that can only do what I approve them to do. Personally, if I were to discover the Web app could do more than I knew, I'd feel "yuck" - even if it were not a malicious application, and even if it did not hurt me in that particular instance.

eladalon1983 commented 3 weeks ago

Having thought about a zoomLevel attribute some more, I think it would be a bad idea even if we had no permission policy, because of asynchronicity.

Consider the following:

console.log(`${controller.getZoomLevel()}`);  // Assume 50 is printed.
/*await*/ controller.setZoomLevel(75);
console.log(`${controller.getZoomLevel()}`);  // Still 50; makes sense.

Contrast this with:

console.log(`${controller.getZoomLevel()}`);  // Assume 50 is printed.
await controller.setZoomLevel(75);
console.log(`${controller.getZoomLevel()}`);  // Likely[*] 75; makes sense.

It is perfectly clear to the Web developer that the async operation does not take effect immediately.

Because of run-to-completion, we can even expect the following to work correctly:

/*await*/ controller.setZoomLevel(75);
console.log(`Changed from ${controller.getZoomLevel()} to 75.`);

But with a zoomLevel attribute, what would happen here?

console.log(`Before: ${controller.zoomLevel}`);  // Assume 50.
controller.zoomLevel = 75;
console.log(`After: ${controller.zoomLevel}`);  // ???

Much more idiomatic to just have a getter and a setter. The value returned by getZoomLevel() only changes after some behind-the-scenes IPC informs the capturing application's process that the zoom-level changed. (Likely the same IPC driving the zoom-level event handler.)


As to the [*] part - to be 100% accurate, the promise resolves when the action is permitted, and listening to oncapturedzoomlevelchange is the right way to know when the change took effect. This is relegated to a footnote because it would distract from the main points, which are - the operation is asynchronous by nature, and using an async API indicates that, whereas using a single value for both "current value" and "desired value" would mislead developers.
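The distinction between "the promise resolved" and "the change took effect" can be illustrated with a small mock. Everything here is an assumption made for illustration: `MockCaptureController` and its `deliverPendingUpdate()` hook are hypothetical, the initial value of 50 is arbitrary, and the event type `capturedzoomlevelchange` is inferred from the `oncapturedzoomlevelchange` handler name mentioned above.

```javascript
// Hypothetical mock; the real CaptureController lives in the browser.
// The promise from setZoomLevel() resolves once the action is "permitted";
// the value reported by getZoomLevel() changes only when the simulated
// IPC update is delivered.
class MockCaptureController extends EventTarget {
  #observed = 50;  // value last reported to this process
  #pending = null; // value requested but not yet reported back

  getZoomLevel() {
    return this.#observed;
  }

  setZoomLevel(zoomLevel) {
    this.#pending = zoomLevel;
    return Promise.resolve(); // permitted; change not yet observed
  }

  // Stand-in for the behind-the-scenes IPC that reports the new zoom level.
  deliverPendingUpdate() {
    if (this.#pending === null) return;
    this.#observed = this.#pending;
    this.#pending = null;
    this.dispatchEvent(new Event('capturedzoomlevelchange'));
  }
}

const controller = new MockCaptureController();
const seen = [];
controller.addEventListener('capturedzoomlevelchange', () => {
  seen.push(controller.getZoomLevel());
});

controller.setZoomLevel(75);
console.log(controller.getZoomLevel()); // still the old value
controller.deliverPendingUpdate();
console.log(controller.getZoomLevel(), seen); // new value; listener ran once
```

The mock keeps "desired" and "observed" values separate, which is the split that a single read/write attribute would have to hide.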

jan-ivar commented 3 weeks ago

Moved to https://github.com/w3c/mediacapture-surface-control/issues/48.

eladalon1983 commented 3 weeks ago

https://github.com/w3c/mediacapture-surface-control/issues/48 is not exactly the same topic. So I take the closure of this issue to mean that you found my arguments convincing about the non-viability of a single attribute that covers both current-value and desired-value. It would be nice to hear as much explicitly.