privacycg / storage-access

The Storage Access API
https://privacycg.github.io/storage-access/
Storage access & first party set / CNAME feedback #15

Closed thezedwards closed 2 years ago

thezedwards commented 4 years ago

In my opinion, regarding the Storage Access API: all modern user data privacy frameworks require organizations to identify categories of data sharing, and most of these legal frameworks require that orgs let users segment their access/sharing. The Storage Access API should tie directly into user data consent-sharing business categories and only let a 1st party request scope/storage into a 3rd party iframe when the consent string is also sent along with the storage access into the 3rd party iframe.

Data revocation & deletion requests into the Storage Access API must have consent IDs corresponding back to the user's 1st party context, and the 1st party data controller is responsible for submitting the 3rd party storage access deletion requests to data processors using the Storage Access API.

Users should be able to pick/choose which 3rd party companies are allowed to receive access from their 1st party visit and pass it through the Storage Access API and also be able to submit deletion/access requests through the 1st party in a way that makes it possible for the 1st party to submit the deletion request to the 3rd party via a user’s consent strings/IDs.

///

Furthermore, on First Party Data Sets (Associated / Affiliated Domains): any change towards data sharing across first party data sets (corporations that own multiple domains), even one that doesn't include "Dynamic first party data sets", creates massive user data vulnerabilities if it is not also paired with "First Party Data Subdomain Authenticity Checks with CNAME to A Record Reverse Lookup Blocking" -- or something that FINALLY starts to address the massive problem of organizations stitching their domains together via subdomains & CNAME DNS mapping to get around technical browser restrictions between 1st/3rd party domains.
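To make the subdomain stitching concrete, here is a minimal TypeScript sketch of one way a CNAME-cloaked subdomain could be flagged. It is not the "reverse lookup blocking" proposed above and not any browser's actual defense. Everything in it is an assumption for illustration: the pre-resolved `CnameRecords` map, the function names, the example hostnames, and especially `naiveSite`, which approximates a site as the last two labels of a hostname where a real check would consult the Public Suffix List.

```typescript
// Hypothetical, pre-resolved DNS data: lowercase hostname -> its CNAME target.
type CnameRecords = Map<string, string>;

// ASSUMPTION: approximate the "site" as the last two labels of the hostname.
// This is wrong for suffixes like "co.uk"; a real check needs the Public Suffix List.
function naiveSite(hostname: string): string {
  const labels = hostname.toLowerCase().replace(/\.$/, "").split(".");
  return labels.slice(-2).join(".");
}

// Follow CNAME records from the hostname, with a hop limit to guard against loops.
function resolveCnameChain(
  hostname: string,
  records: CnameRecords,
  maxHops = 8,
): string[] {
  const chain: string[] = [hostname.toLowerCase()];
  let current = chain[0];
  for (let hop = 0; hop < maxHops; hop++) {
    const target = records.get(current);
    if (target === undefined) break; // no further CNAME: end of chain
    current = target.toLowerCase();
    chain.push(current);
  }
  return chain;
}

// A hostname counts as "cloaked" here if any hop in its CNAME chain
// lands on a different (naively computed) site than the hostname itself.
function isCnameCloaked(hostname: string, records: CnameRecords): boolean {
  const chain = resolveCnameChain(hostname, records);
  const firstPartySite = naiveSite(chain[0]);
  return chain.slice(1).some((hop) => naiveSite(hop) !== firstPartySite);
}

// Example with made-up hostnames:
const exampleRecords: CnameRecords = new Map([
  ["metrics.shop.example", "shop.tracker.example"], // points at another site
  ["cdn.shop.example", "assets.shop.example"],      // stays on the same site
]);
console.log(isCnameCloaked("metrics.shop.example", exampleRecords));
console.log(isCnameCloaked("cdn.shop.example", exampleRecords));
```

The sketch only classifies hostnames; what a browser would then do with a flagged subdomain (block it, partition its storage, cap cookie lifetimes) is a separate policy question.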

It's essential that, as First Party Data Sets get debated and as changes evolve here, the CNAME subdomain hacks remain top of mind -- this is one of the WORST and MOST PERVASIVE ad technology hacks in existence and it gets worse every day, every week, every month. Every month that goes by with more restrictions on 3rd party domains without corresponding restrictions on CNAME hacks pushes ad tech further into a blurred 1st/2nd-party position using subdomain CNAME hacks that not only put users in jeopardy, but also break most of the collaborative tracking infrastructure being discussed.

Also, ignoring CNAME hacks creates an "internet business bubble" that will burst upon the global economy if organizations like PrivacyCG don't start to clearly say "This is bad, this is wrong, and it's going to stop in XX months via ZZY changes."

It's totally fine for this organization and the people here to advocate strongly for First Party Data Sets so that their organizations can have SSO/Federated logins and keep the modern web -- but it's NOT okay for those same organizations and individuals to ignore the very real problems being created by CNAME subdomain hacks, or the fact that huge numbers of those very same companies that "need SSO/Federated logins" also have subsidiaries or products, or benefit from data supply chains, that use subdomain hacks to inject/exfiltrate user data in ways users don't suspect.

Thanks for everyone's time on these issues and the discussions.

Sincerely, Zach

johnwilander commented 4 years ago

Hi Zach! Thanks for filing.

> In my opinion, regarding the Storage Access API: all modern user data privacy frameworks require organizations to identify categories of data sharing, and most of these legal frameworks require that orgs let users segment their access/sharing. The Storage Access API should tie directly into user data consent-sharing business categories and only let a 1st party request scope/storage into a 3rd party iframe when the consent string is also sent along with the storage access into the 3rd party iframe.

The Storage Access API is a web API intended to be standardized. For it to handle something like a "consent string," such a string would have to have a technical specification and be standardized too. If you mean consent strings in some broader sense, I could see the Privacy CG taking on such work if deemed appropriate, but I'm reluctant to incorporate it here. If you mean the wording of the Storage Access API prompt, it's been discussed to some extent in https://github.com/privacycg/storage-access/issues/6.

It feels like a "consent string" would have legal meaning beyond what we typically do in the web standards world. The closest thing I can think of is "purpose string" or "prompt string" which we have discussed regarding the Storage Access API in https://github.com/privacycg/storage-access/issues/6. Experience has shown that giving websites control over anything that is browser UI is an anti-pattern.

What do you envision? Would this "consent string" be completely under the browser's control like today's Storage Access API prompts? If supplied by the website, would it be restricted in some way (length, picked from a menu of what to consent to, localization …)?

> Data revocation & deletion requests into the Storage Access API must have consent IDs corresponding back to the user's 1st party context, and the 1st party data controller is responsible for submitting the 3rd party storage access deletion requests to data processors using the Storage Access API.

Are you envisioning a consent ID tied to the user granting storage access? Who would create and store the consent ID? Who would have access to it and under which circumstances?

> Users should be able to pick/choose which 3rd party companies are allowed to receive access from their 1st party visit and pass it through the Storage Access API and also be able to submit deletion/access requests through the 1st party in a way that makes it possible for the 1st party to submit the deletion request to the 3rd party via a user’s consent strings/IDs.

Today, this is limited to sites that the user uses as a first-party website (at least in WebKit's implementation). Are you envisioning a double permission thing where the user first says they want to allow e.g. social.example to request storage access when it is a third party, and then grants or denies storage access as implemented today? Such ideas have been floated for single sign-on purposes.
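For readers unfamiliar with the flow "as implemented today", here is a minimal TypeScript sketch of what an embedded third-party frame does. `hasStorageAccess()` and `requestStorageAccess()` are the real `Document` methods; the `StorageAccessDocument` interface, the `AccessOutcome` type, and the `ensureStorageAccess` helper are names made up for this sketch so it can run outside a browser. In a real page, the request has to be made from a user gesture such as a click, and whether a prompt appears is up to the browser.

```typescript
// Minimal shape of the two Storage Access API methods on Document
// (declared locally so the sketch also type-checks outside a browser).
interface StorageAccessDocument {
  hasStorageAccess(): Promise<boolean>;
  requestStorageAccess(): Promise<void>;
}

// Hypothetical result type for this sketch.
type AccessOutcome = "already-granted" | "granted" | "denied";

// Check for existing access first; only request it if needed.
// requestStorageAccess() rejects when the user or the browser denies access.
async function ensureStorageAccess(
  doc: StorageAccessDocument,
): Promise<AccessOutcome> {
  if (await doc.hasStorageAccess()) {
    return "already-granted";
  }
  try {
    await doc.requestStorageAccess();
    return "granted";
  } catch {
    return "denied";
  }
}

// In a third-party iframe this might be wired to a button, for example:
//   button.addEventListener("click", async () => {
//     const outcome = await ensureStorageAccess(document);
//     // proceed with cookie-backed requests only if outcome !== "denied"
//   });
```

A "double permission" design would add an earlier step, under the user's control in a first-party context, that decides whether the embedded site may call this at all; that step is not shown here because no such API is specified in this thread.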

> ///
>
> Furthermore, on First Party Data Sets (Associated / Affiliated Domains): any change towards data sharing across first party data sets (corporations that own multiple domains), even one that doesn't include "Dynamic first party data sets", creates massive user data vulnerabilities if it is not also paired with "First Party Data Subdomain Authenticity Checks with CNAME to A Record Reverse Lookup Blocking" -- or something that FINALLY starts to address the massive problem of organizations stitching their domains together via subdomains & CNAME DNS mapping to get around technical browser restrictions between 1st/3rd party domains.

First Party Sets were discussed at the last CG phone call so I assume that's why you bring them up. We (WebKit) have filed issues on the First Party Sets proposal and I believe other browser vendors have too.

In terms of the Storage Access API, are there specific things you want to raise in connection with First Party Sets that aren't general concerns about First Party Sets? Are you saying the Storage Access API should or should not consider First Party Sets (if the browser supports First Party Sets)?

> It's essential that, as First Party Data Sets get debated and as changes evolve here, the CNAME subdomain hacks remain top of mind -- this is one of the WORST and MOST PERVASIVE ad technology hacks in existence and it gets worse every day, every week, every month. Every month that goes by with more restrictions on 3rd party domains without corresponding restrictions on CNAME hacks pushes ad tech further into a blurred 1st/2nd-party position using subdomain CNAME hacks that not only put users in jeopardy, but also break most of the collaborative tracking infrastructure being discussed.
>
> Also, ignoring CNAME hacks creates an "internet business bubble" that will burst upon the global economy if organizations like PrivacyCG don't start to clearly say "This is bad, this is wrong, and it's going to stop in XX months via ZZY changes."
>
> It's totally fine for this organization and the people here to advocate strongly for First Party Data Sets so that their organizations can have SSO/Federated logins and keep the modern web -- but it's NOT okay for those same organizations and individuals to ignore the very real problems being created by CNAME subdomain hacks, or the fact that huge numbers of those very same companies that "need SSO/Federated logins" also have subsidiaries or products, or benefit from data supply chains, that use subdomain hacks to inject/exfiltrate user data in ways users don't suspect.
>
> Thanks for everyone's time on these issues and the discussions.
>
> Sincerely, Zach

hober commented 2 years ago

Looks like this doesn't need to be open anymore.