w3c / machine-learning-workshop

Site of W3C Workshop on Web & Machine Learning
https://www.w3.org/2020/06/machine-learning-workshop/

Permission model for Machine Learning APIs #72

Open dontcallmedom opened 3 years ago

dontcallmedom commented 3 years ago

In his talk, @cynthia raises the possible desirability of requiring a permission before exposing ML APIs to developers:

Additionally, while it is possible to do a lot of this today without a permission, through capabilities already available in the platform such as WebGL and WebAssembly, at the point of standardization as a platform API we would like to see this behind a permission if possible. The reason for this is not only the potential privacy implications, but also the power requirements that these APIs may bring to the table. This can have a negative effect on battery life for users trying to get the most out of a single charge. So we believe that users should have the choice to refuse if they are in a situation where they want to be conservative about power usage.

I don't see any existing discussion in WebNN on permissions.

There are a number of usual questions when it comes to managing permissions in Web APIs:

As @cynthia points out, given that a number of the features exposed by WebNN, or indirectly by the Model Loader API, can be at least partially polyfilled with WebAssembly, WebGL, or WebGPU, there would be a need to at least align with how these other APIs handle that question.

Given that WebAssembly, for instance, seems to have broad use for user-hostile purposes, there may indeed be value in bringing that community into this conversation. I haven't found direct discussion of this issue in the WebAssembly repos, although there is a somewhat old discussion of the intersection with Content Security Policy. @lukewagner @ericprud, has this topic been discussed in WebAssembly land?

With all that said, I believe there may be other approaches worth exploring to address (at least partially) the risks that @cynthia raises:

anssiko commented 3 years ago

The permission model was a key design consideration and discussion point for the Generic Sensor API, the next-generation framework for exposing sensors to the web. Let me share my learnings from that journey, since I think we can draw some parallels to ML APIs.

I'll cc @rwaldron, since this API has its roots in Johnny-Five, the Node.js platform for robotics and IoT. It should be noted that Node.js has a different permission model from browsers, so the permission model part of the Generic Sensor API is a clean-slate design.

The Generic Sensor API effort reached consensus on an API design where each low-level sensor has its own permission token (e.g. "accelerometer", "gyroscope", "magnetometer"), as opposed to a high-level "blanket" permission (e.g. "motion-sensor"). High-level sensors in fact decompose into multiple low-level sensors. This approach exposes the required low-level permissions to web developers through the programmatic Permissions API, but does not prescribe the prose that browsers surface to the user in user-facing permission affordances ("Do you want to allow X to access Y?"). Browsers should talk to end users in a language they understand. On this point, the W3C TAG outlines how to ask users for meaningful consent when appropriate in its Web Platform Design Principles; @cynthia, as the editor, is well aware of this work.
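As a minimal sketch of the programmatic side of this design: the Permissions API lets a page query the state of a low-level token such as "accelerometer" before using the sensor. The `checkSensorPermission` helper below is illustrative (not from any spec) and guards for non-browser environments:

```javascript
// Illustrative helper: query the state of a low-level sensor
// permission token (e.g. "accelerometer") via the Permissions API.
async function checkSensorPermission(name) {
  // Outside a browser (or where the Permissions API is unavailable),
  // report "unsupported" rather than throwing.
  if (typeof navigator === "undefined" || !navigator.permissions) {
    return "unsupported";
  }
  const status = await navigator.permissions.query({ name });
  return status.state; // "granted" | "denied" | "prompt"
}

checkSensorPermission("accelerometer").then((state) => {
  console.log(`accelerometer permission: ${state}`);
});
```

A browser could grant or deny each low-level token independently, which is what makes the per-sensor (rather than blanket) design observable to script.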

To draw parallels to the ML APIs, maybe the various ML ops could be split into permission buckets, each attached to its own permission token (or descriptor, in spec parlance)? This would allow browsers to innovate in terms of permission model and the associated user interface and protections. Browsers could then, in their implementations, translate these op permission groups into the use cases they enable (e.g. "image classification", "object detection", "noise suppression", etc.). The challenge is that ops overlap: foundational ops are used across a wide variety of models enabling different use cases. We've produced a mapping from models (use cases) to ops for a set of foundational ops to illustrate this challenge.

HelloFillip commented 3 years ago

I'd like to note that although the discussion around this has focused on GPU (or general hardware usage) gating, these permissions are also vital to fingerprinting protection (and undoubtedly other features).

Fingerprinting can be covered in #90, but I would like to ensure that gating is not treated as a complete solution to it.

cynthia commented 3 years ago

Currently WebGL gives away quite a bit of information already.