Self-Review Questionnaire: Security and Privacy

anssiko commented 3 years ago

In preparation for the TAG review #89, it is recommended we complete the Self-Review Questionnaire: Security and Privacy for the WebNN API.

Let's use this issue to document the responses to the following questions 2.1-2.17. More context on these questions can be found in the self-review document itself.

2.1 What information might this feature expose to Web sites or other parties, and for what purposes is that exposure necessary?

This feature exposes the navigator.ml.getNeuralNetworkContext() factory that encapsulates the rest of the API surface used to create, compile, and run machine learning networks. The API allows web apps to make use of hardware acceleration for neural network inference.
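To illustrate how small the exposed entry point is, a page could feature-detect the factory before using it. This is only a sketch: `hasWebNN` and `NavigatorWithML` are hypothetical names, and nothing is assumed about what the factory returns.

```typescript
// Minimal structural type for the entry point named in this issue.
// The factory's return type is left opaque on purpose (assumption-free).
interface NavigatorWithML {
  ml?: { getNeuralNetworkContext?: () => unknown };
}

// Hypothetical helper: returns true when the factory function is present
// on the given navigator-like object, false otherwise.
function hasWebNN(nav: unknown): boolean {
  if (typeof nav !== "object" || nav === null) return false;
  const ml = (nav as NavigatorWithML).ml;
  return typeof ml?.getNeuralNetworkContext === "function";
}
```

In a browser this would be called as `hasWebNN(navigator)`; passing the object in explicitly keeps the check testable outside a browser.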

2.2 Is this specification exposing the minimum amount of information necessary to power the feature?

The API exposes the minimum amount of information necessary to address the identified use cases for the best performance and reliability of results.

2.3 How does this specification deal with personal information or personally-identifiable information or information derived thereof?

No personal information is exposed.

2.4 How does this specification deal with sensitive information?

No sensitive information is exposed.

2.5 Does this specification introduce new state for an origin that persists across browsing sessions?

No.

2.6 What information from the underlying platform, e.g. configuration data, is exposed by this specification to an origin?

No information from the underlying platform is exposed directly. An execution time analysis may indirectly reveal the performance of the underlying platform's neural network hardware acceleration capabilities relative to another platform.
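To make the indirect exposure concrete, here is a rough sketch of the kind of execution time analysis meant above. `median` and `medianRunTimeMs` are hypothetical helpers, and `workload` stands in for any inference call; no WebNN-specific API is assumed.

```typescript
// Hypothetical helper: median of a non-empty list of numbers.
function median(values: number[]): number {
  if (values.length === 0) throw new RangeError("values must not be empty");
  const sorted = values.slice().sort((a, b) => a - b); // copy, do not mutate input
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical helper: runs `workload` `runs` times and returns the median
// wall-clock duration in milliseconds. The median damps one-off outliers
// such as warm-up or scheduling noise.
async function medianRunTimeMs(
  workload: () => unknown | Promise<unknown>,
  runs = 5
): Promise<number> {
  if (runs < 1) throw new RangeError("runs must be at least 1");
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await workload();
    samples.push(performance.now() - start);
  }
  return median(samples);
}
```

Comparing the result for a fixed workload across devices is what could hint at relative acceleration capabilities, which is why timer resolution and similar mitigations are relevant here.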

2.7 Does this specification allow an origin access to sensors on a user’s device?

No.

2.8 What data does this specification expose to an origin? Please also document what data is identical to data exposed by other features, in the same or different contexts.

The API adheres to the same-origin policy.

2.9 Does this specification enable new script execution/loading mechanisms?

No.

2.10 Does this specification allow an origin to access other devices?

This specification enables access to the underlying hardware used to accelerate neural network inference.

2.11 Does this specification allow an origin some measure of control over a user agent’s native UI?

No.

2.12 What temporary identifiers might this specification create or expose to the web?

No temporary identifiers are exposed.

2.13 How does this specification distinguish between behavior in first-party and third-party contexts?

At the moment, the feature does not distinguish between first-party and third-party contexts. Since the feature gives developers access to hardware-accelerated features of the device, we could make it a policy-controlled feature similar to WebXR and its xr-spatial-tracking feature identifier.
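For illustration, a policy-controlled feature is typically enabled or restricted through a Permissions-Policy header value (or an iframe allow attribute). The sketch below builds such a header value. `permissionsPolicyValue` is a hypothetical helper, and the "webnn" identifier mentioned in the comment is an assumption, since this issue does not name one.

```typescript
// Hypothetical helper: builds one Permissions-Policy directive.
// Allowlist entries are "self", "*", or a concrete origin such as
// "https://example.com". Origins are assumed to contain no double quotes.
function permissionsPolicyValue(feature: string, allowlist: string[]): string {
  // "*" allows the feature in all contexts.
  if (allowlist.some((entry) => entry === "*")) return `${feature}=*`;
  // "self" is a bare token; origins are quoted strings.
  const items = allowlist.map((entry) => (entry === "self" ? "self" : `"${entry}"`));
  // An empty allowlist yields `feature=()`, which disables the feature everywhere.
  return `${feature}=(${items.join(" ")})`;
}

// Example: restrict the existing WebXR feature to the top-level origin.
const xrPolicy = permissionsPolicyValue("xr-spatial-tracking", ["self"]);
// An analogous call for this API would need an identifier such as "webnn",
// which is an assumption here and not defined in this issue.
```

Restricting to `self` by default would keep third-party frames out unless the embedding page opts them in explicitly.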

2.14 How does this specification work in the context of a user agent’s Private Browsing or "incognito" mode?

The feature works the same regardless of whether in-private browsing or incognito mode is active.

2.15 Does this specification have a "Security Considerations" and "Privacy Considerations" section?

Work-in-progress at https://github.com/webmachinelearning/webnn/issues/122

2.16 Does this specification allow downgrading default security characteristics?

No.

2.17 What should this questionnaire have asked?

It asked good questions. In particular, 2.15 was helpful for outlining the section in question.

anssiko commented 3 years ago

@wchao1115 @huningxin @pyu10055 I took the first stab at the responses to this questionnaire that is part of the expected TAG review material, PTAL: https://github.com/webmachinelearning/webnn/issues/119#issue-736973482

anssiko commented 3 years ago

Sent an Early Privacy Review Request to the W3C's Privacy Interest Group welcoming feedback in this issue.

RafaelCintron commented 3 years ago

2.13 How does this specification distinguish between behavior in first-party and third-party contexts?

At the moment, the feature does not distinguish between first-party and third-party contexts. Since the feature gives developers access to hardware-accelerated features of the device, we could make it a policy-controlled feature similar to WebXR and its xr-spatial-tracking feature identifier.

2.14 How does this specification work in the context of a user agent’s Private Browsing or "incognito" mode?

The feature works the same regardless of whether in-private browsing or incognito mode is active.

anssiko commented 3 years ago

Thank you @RafaelCintron! With your contributions we now have completed this questionnaire. Let's have a last call for comments before I take a snapshot of the responses for future use (I plan to turn this into an .md file in our repo, so we can revise it if new information becomes available).

All - please review the suggested responses by our next call 21 Jan, 2021.

sandandsnow commented 3 years ago

Some comments and questions coming out of an early privacy review:

What do you think would be appropriate to prevent, deter or minimise sites from misusing or abusing the capabilities of this API? (Note: This is not a problem unique to this API, and perhaps solutions discovered here could help fix problems in JavaScript, WebGL, etc.)
A related question - suppose the API was enabling machine learning use cases involving mouse movements (an example of behavioural biometrics), what are your thoughts about user awareness/consent and mitigations for abuse?
Is the API restricted to first-party contexts? Or do third-party frames have access? (The answer to 2.13 of the Self-Review: Security and Privacy Questionnaire (above) suggests they do, and that you are exploring the potential of a policy-controlled feature approach.) Is there any reason not to simply restrict to first party context? (i.e. what are the likely use cases you envision that would require third-party frames to have access to the API?)
As I understand it, the API operates on the client-side only. This feature might be worth mentioning in the privacy considerations.
It’s great to see a privacy use case (2.1.2 semantic segmentation). It would be nice to see more.

Shrishak commented 3 years ago

Some more questions and comments from the early PING review:

2.2.3 device identification: Is the device information kept locally and not passed on to JS frameworks? I assume it is kept locally. This can be made explicit in the spec.
There is a possibility of timing attacks which may indicate the underlying hardware being used. There is a related open issue: https://github.com/webmachinelearning/webnn/issues/85
Some of the use cases, such as face recognition and skeleton recognition, have potential privacy implications. The user needs to be notified that their microphone may be switched on as a result of skeleton recognition. There should be some notification to the user that the mic has been unmuted. Otherwise, if there is a false positive (accidental unmute), the user may be caught unaware.

anssiko commented 3 years ago

Thanks @sandandsnow and @Shrishak for the early Privacy Interest Group review.

We'll discuss your feedback on today's WebML CG Teleconference – 4 February 2021 - 15:00-16:00 UTC+0. You are welcome to join if you are available on such short notice (we're starting in ~35 mins from now).

anssiko commented 3 years ago

For the curious, we ran out of time on our 4 Feb call, so will put this topic again on our 18 Feb call agenda. We welcome any further feedback in this issue.

anssiko commented 3 years ago

Fixed a pointer to the related issue https://github.com/webmachinelearning/webnn/issues/85 in @Shrishak's comment.

anssiko commented 3 years ago

@sandandsnow and @Shrishak thank you for your initial PING feedback. Below is our initial response:

What do you think would be appropriate to prevent, deter or minimise sites from misusing or abusing the capabilities of this API? (Note: This is not a problem unique to this API, and perhaps solutions discovered here could help fix problems in JavaScript, WebGL, etc.)

Other widely available, powerful APIs on the web platform have similar computational capabilities, for example the universally supported WebGL API, which is in fact used by the polyfill implementation of the WebNN API. Our group believes this topic would benefit from a web architecture-level discussion and is likely suited for the TAG dissemination.

This broader topic has been discussed in the context of the Web and Machine Learning workshop earlier: https://github.com/w3c/machine-learning-workshop/issues/72

A related question - suppose the API was enabling machine learning use cases involving mouse movements (an example of behavioural biometrics), what are your thoughts about user awareness/consent and mitigations for abuse?

Even without the WebNN API, web sites can already listen to mousemove events and send that data to a server for processing using machine learning or other techniques. We believe this API does not enable new abuse vectors for network-connected devices and could in fact allow for better detection of abuse: browsers could either offer affordances that notify the user when this API is being actively used, or implement mechanisms to flag usage of this API in unconventional contexts.

Is the API restricted to first-party contexts? Or do third-party frames have access? (The answer to 2.13 of the Self-Review: Security and Privacy Questionnaire (above) suggests they do, and that you are exploring the potential of a policy-controlled feature approach.) Is there any reason not to simply restrict to first party context? (i.e. what are the likely use cases you envision that would require third-party frames to have access to the API?)

We opened https://github.com/webmachinelearning/webnn/issues/145 that proposes normative changes. @sandandsnow please take a look.

As I understand it, the API operates on the client-side only. This feature might be worth mentioning in the privacy considerations.

Noted in https://github.com/webmachinelearning/webnn/issues/122.

It’s great to see a privacy use case (2.1.2 semantic segmentation). It would be nice to see more.

We will constantly evaluate the highest value use cases for the web and will keep this section up to date with the state of the art considering the constraints of the web platform.

2.2.3 device identification: Is the device information kept locally and not passed on to JS frameworks? I assume it is kept locally. This can be made explicit in the spec.

@Shrishak, we have a clarifying question: what device information is of concern specifically? The WebNN API does not expose specifics of the hardware.

There is a possibility of timing attacks which may indicate the underlying hardware being used. There is a related open issue: #85

This has been labelled with a "ping-tracker" label.

Some of the use cases, such as face recognition and skeleton recognition, have potential privacy implications. The user needs to be notified that their microphone may be switched on as a result of skeleton recognition. There should be some notification to the user that the mic has been unmuted. Otherwise, if there is a false positive (accidental unmute), the user may be caught unaware.

Media Capture and Streams specification defines two policy-controlled features identified by the strings "camera" and "microphone". This seems like a Media Capture and Streams specification issue.

Shrishak commented 3 years ago

@anssiko, thanks for confirming that the specifics of the hardware are not exposed. My question was with regard to whether the availability of GPU acceleration was exposed or not.

anssiko commented 3 years ago

Thanks @Shrishak for the clarification. The group will keep the device identification guideline in mind when evolving the API.

I added a "ping-tracker" label to this issue to keep track of the entirety of the PING feedback. This issue that documents early PING feedback will be helpful at the formal wide review step and demonstrates PING has been engaged early on.

sandandsnow commented 3 years ago

Re: "Our group believes this topic would benefit from a web architecture-level discussion and is likely suited for the TAG dissemination", perhaps the time to raise this with the TAG is during their review.

anssiko commented 3 years ago

@sandandsnow, thanks for highlighting this web architecture-level consideration. Please note:

This broader topic has been discussed in the context of the Web and Machine Learning workshop earlier: w3c/machine-learning-workshop#72

@cynthia (aka Sangwhan) from the TAG had a talk about this topic at the workshop, was part of the related workshop discussions and IIRC wanted to bring w3c/machine-learning-workshop#72 up with the broader TAG. I suggest you revisit these earlier discussions and fill in any gaps in our collective understanding of this problem space.

anssiko commented 3 years ago

The group added the initial Security and Privacy Considerations (https://github.com/webmachinelearning/webnn/pull/170) to the Web Neural Network API spec, reflecting the review comments in this issue.

This issue will remain open to solicit further feedback from Privacy and Security reviewers.

@sandandsnow please ping (no pun intended) this issue whenever you have additional comments or questions. This issue is tracked in w3cping/tracking-issues#190

We expect to explicitly reach out to horizontal reviewers again when the spec has advanced further. Thank you for your contributions, your early privacy review comments were much appreciated by our group.

anssiko commented 2 years ago

@sandandsnow, heads-up that the WG agreed to request a delta privacy review from PING for the WebNN API (see https://www.w3.org/2022/02/10-webmachinelearning-minutes.html#t02). Delta here refers to privacy-impacting changes since the previous review took place (documented earlier in this issue).

To ensure we’re using your review time efficiently and minimize back and forth, we’ll send the request when we’ve converged on certain design issues around device types first. We’re targeting CR this year, and want to make sure you have adequate time for final review prior to that transition.

sandandsnow commented 2 years ago

Thanks for the heads-up. We understand things need to move quickly sometimes, but if you are able to give us 4 weeks (time to review and discuss among PING) that would be most appreciated.

anssiko commented 2 years ago

Clearing the security-tracker label given https://github.com/w3c/security-request/issues/22 is now closed as completed. This issue remains open until the privacy review completes.

anssiko commented 1 year ago

The privacy review was confirmed as completed on Sep 1, 2022 (see https://github.com/w3cping/privacy-request/issues/96#issuecomment-1234419529). Closing this issue as part of gardening activities.
