whatwg / html

HTML Standard

https://html.spec.whatwg.org/multipage/

Proposal: Storage Access API #3338

Closed johnwilander closed 1 year ago

johnwilander commented 6 years ago

Details to be discussed in the W3C Privacy CG

We've moved this to the W3C Privacy CG where you can file individual issues on the things you want to discuss: https://github.com/privacycg/storage-access

Original issue

Hi! John Wilander from WebKit here. We hope this can extend existing specifications rather than create some whole new spec. It was originally filed under whatwg/dom but I was advised to move it here.

Storage Access API

Problem

Tl;dr: Browsers that block access to third-party cookies break authenticated embeds such as commenting widgets and subscribed video services.

In the context of cross-origin resource loads, cookies are popularly referred to as third-party cookies. In reality, these cookies are often the same as the first-party cookies, so what we really mean by third-party cookies is access to first-party cookies in a third-party context.

A browser may have rules for third-party cookies that go beyond cookie policy rules such as scheme, host, path, secure attribute etc. These additional rules may be:

Third-parties aren't allowed to set cookies,
Third-parties have their cookies partitioned or double-keyed, or
Third-parties have no cookie access.

However, certain services are intended to be embedded as third-party content and need access to first-party cookies for authentication. Examples are commenting widgets, video embeds, payment provider integration, document embeds, and social media action widgets. These break if the third-party content has no access to its first-party cookies.

The same problem exists for other kinds of storage such as IndexedDB and LocalStorage, except they are not tied to the HTTP protocol and are typically not used for authentication purposes. From here on we will refer to cookies except when the distinction between cookies and other storage makes sense.

Proposed Solution

Tl;dr: A new API with which cross-origin iframes can request access to their first-party cookies when processing a user gesture such as a tap or a click. This allows third-party embeds to authenticate on user interaction.

We propose two new functions on the document:

partial interface Document {
    Promise<bool> hasStorageAccess();
    Promise<void> requestStorageAccess();
};

These functions are on the document for two reasons: 1) storage access is granted to the particular document (see Access Removal), and 2) it changes document.cookie.

hasStorageAccess() can be called at any time to check whether access is already granted and it doesn't require user interaction.

requestStorageAccess() should only be called on user interaction such as a tap or a click. It will check a set of rules and grant access if the rules are fulfilled. Access to first-party cookies in the given iframe can be assumed if the returned promise resolves. From that point, any sub resource load in the iframe will have first-party cookies sent and incoming cookies will be set in the first-party cookie jar.

Note that no other third-party resources on that webpage are affected by the storage access status of an individual iframe.
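As a rough illustration of how an embed might combine the two proposed functions, here is a minimal sketch. The helper name `ensureStorageAccess` and its `doc` parameter are assumptions made for this example; inside a real iframe `doc` would be `document`, and the helper would be called from a tap or click handler.

```javascript
// Sketch only: `ensureStorageAccess` is a hypothetical helper, not part of the proposal.
// `doc` must expose the two proposed functions; in an iframe it would be `document`.
async function ensureStorageAccess(doc) {
  // hasStorageAccess() needs no user interaction, so check it first.
  if (await doc.hasStorageAccess()) {
    return true;
  }
  try {
    // Has to run while a user gesture is being processed, otherwise the promise rejects.
    await doc.requestStorageAccess();
    return true;
  } catch (err) {
    // Rejected: the rules were not fulfilled, so first-party cookies stay unavailable.
    return false;
  }
}

// Assumed usage inside the embedded iframe (the button variable is hypothetical):
// loginButton.addEventListener('click', async () => {
//   if (await ensureStorageAccess(document)) { /* load authenticated content */ }
// });
```

Passing the document in as a parameter is only a convenience so the helper can be exercised outside a browser.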

Algorithm for requestStorageAccess()

1. If the document already has been granted access, resolve.
2. If the document has a null origin, reject.
3. If the document's frame is the main frame, resolve.
4. If the sub frame's origin is equal to the main frame's, resolve.
5. If the sub frame is not sandboxed, skip to step 7.
6. If the sub frame doesn't have the token "allow-storage-access-by-user-activation", reject.
7. If the sub frame's parent frame is not the top frame, reject.
8. If the browser is not processing a user gesture, reject.
9. Check any additional rules that the browser has. Examples: Whitelists, blacklists, on-device classification, user settings, anti-clickjacking heuristics, or prompting the user for explicit permission. Reject if some rule is not fulfilled.
10. Grant the document access to cookies and store that fact for the purposes of future calls to hasStorageAccess() and requestStorageAccess().
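The steps above can be sketched as a plain decision function. This is not how any browser implements it; the `frame` and `browser` objects and all of their field names are assumptions chosen to mirror the wording of the algorithm.

```javascript
// Sketch of the algorithm over plain objects; every field name here is hypothetical.
function decideStorageAccess(frame, browser) {
  if (frame.hasAccess) return 'resolve';                         // step 1
  if (frame.origin === null) return 'reject';                    // step 2
  if (frame.isMainFrame) return 'resolve';                       // step 3
  if (frame.origin === frame.mainFrameOrigin) return 'resolve';  // step 4
  if (frame.sandboxed) {                                         // step 5: unsandboxed frames skip to step 7
    const token = 'allow-storage-access-by-user-activation';
    if (!frame.sandboxTokens.includes(token)) return 'reject';   // step 6
  }
  if (!frame.parentIsTopFrame) return 'reject';                  // step 7
  if (!browser.processingUserGesture) return 'reject';           // step 8
  if (!browser.additionalRulesPass(frame)) return 'reject';      // step 9
  frame.hasAccess = true;                                        // step 10: remember the grant
  return 'resolve';
}
```

Step 9 is left as a callback because the proposal deliberately leaves those rules up to each browser.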

Access Removal

Storage access is granted for the life of the document and as long as the document's frame is attached to the DOM. This means:

Access is removed when the sub frame navigates.
Access is removed when the sub frame is detached from the DOM.
Access is removed when the top frame navigates.
Access is removed when the webpage goes away, such as a tab close.

In addition, the browser may decide to remove access on a timeout basis or on some dedicated user action such as switching cookie policy.
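To make the lifetime rules concrete, here is a small model of a grant that ends on any of the events listed above. The class name and the event strings are invented for this sketch; browser-initiated removal (timeouts, cookie policy changes) is represented by one extra assumed event.

```javascript
// Hypothetical model of a per-document grant; event names are made up for illustration.
class StorageAccessGrant {
  constructor() {
    this.active = true; // set when requestStorageAccess() resolved for this document
  }

  // Returns whether the grant is still active after the event.
  handle(event) {
    const removing = [
      'subframe-navigated',
      'subframe-detached',
      'top-frame-navigated',
      'page-closed',
      'browser-revoked', // timeout or a dedicated user action, at the browser's discretion
    ];
    if (removing.includes(event)) {
      this.active = false;
    }
    return this.active;
  }
}
```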

WebKit Specifics

WebKit's implementation of the Storage Access API will be available in Safari Technology Preview soon and will be on by default. It only covers cookie access; at this point, a call to requestStorageAccess() does not affect any other storage mechanisms or their partitioning.

sechel commented 6 years ago

This is a great suggestion. Currently we are locked out of our own data since we use different subdomains for an iframe and the top level page. We do this for security reasons as we do not want the top level page with all its front-end code to have access to the data that is contained in the iframe's security origin. At the same time we use the same wildcard certificate for both domains.

johnwilander commented 6 years ago

I would like to propose that implicit first-party storage access be granted for a site loaded in an iframe of a parent site when both sites share the same (SAN) SSL certificate. So when either the certificates' ‘Subject’ or ‘Subject Alternative Name’ match, grant first-party access.

This would not work today since many cloud providers use one certificate for tens if not hundreds of customers, with a bunch of disparate domains in the Alt Name section. And it certainly wouldn't work tomorrow when web trackers make sure that websites deploy certificates with the tracker domain names as Alt Names.

Two years ago, we suggested that EV certs with matching OIDs could work the way you suggest, but there was no interest in the W3C WebAppSec working group for such a mechanism.

Currently we are locked out of our own data since we use different subdomains for an iframe and the top level page. We do this for security reasons as we do not want the top level page with all its front-end code to have access to the data that is contained in the iframe's security origin.

This must be because of some other mechanism. ITP does not consider subdomains. Instead it groups everything into eTLD+1 which means a sub.domain.example iframe always has first-party cookie access under www.domain.example.

sechel commented 6 years ago

This must be because of some other mechanism. ITP does not consider subdomains. Instead it groups everything into eTLD+1 which means a sub.domain.example iframe always has first-party cookie access under www.domain.example.

Sorry for the confusion, I was referring to IndexedDB access which is partitioned (in Safari) and the iframe does not have access to the same data as it would if it was included under a different (sub)-domain.

bvlgn commented 6 years ago

If there is no alternative way for legitimate use cases, like e.g. a cross site shopping cart or silent/auto network login and other use cases too, I think the Storage Request API and any ITP like solutions should not be even considered.

I don’t think we should offer the good for the bad behavior of others.

If we can’t come up with a technical solution that allows legitimate first-party cross-site tracking while stopping undesired third-party tracking, then we should only consider other ways to fight the latter. Maybe it wouldn’t be such a bad idea after all if all browser makers came together and started sharing and using actual third-party tracker detectors/signatures. Then, if such a third-party tracker was detected while the user had set the ‘third-party-tracking-disallowed’ flag, storage access would be denied and a red warning mark would be shown in front of the address bar (maybe the complete address would even turn red), or there would be some other way to deter site owners from ignoring the user’s wish not to be tracked by third parties. Very few site owners would risk that.

If you could get Google, Bing and other search engines on board, they could do the same detection and add a pre-warning to sites that don’t honor the user’s wish not to be tracked by third parties, and maybe even rank them lower. Then even fewer site owners would be willing to take that risk.

So now the third party trackers become tracked themselves, i.e. if they misbehave ;->

And while we are at it, let’s publish the list of trackers and the number of times they were caught misbehaving so that privacy enforcement agencies (EU – GDPR!) can use that data when they go after them ;-)

If the user didn’t set the ‘third-party-tracking-disallowed’ setting, the user should only be informed with a popup and asked if they want to allow the tracker or not.

If the tracker detection idea also isn’t something you would consider as an alternative solution, then maybe you should leave it to privacy laws or third-party browser extensions to fight undesired third-party tracking practices?

But whatever you decide, please don’t take away the only way for developers and their users to do cross site first party tracking for legitimate use cases.

johnwilander commented 6 years ago

Benny, we’re not discussing whether browsers should restrict website data or not in this issue. Regardless of what the defaults are and how the restrictions work, we believe there should be a way for third-parties to request access to their first-party data when they are restricted. Our proposal is the Storage Access API.

rstoneIDBS commented 6 years ago

Seconding Benny's final sentence, our Enterprise use case for 3rd party cookies is entirely legitimate and yet is currently broken on Safari unless the users disable the third party cookie prevention option. Our release cycles are such that we can't simply re-engineer several different parts of our application to avoid this problem (aside from having to introduce a fairly ugly workaround from a UX perspective). I really do hope that the various browser vendors can offer users a 'white list' option to allow this blocking behaviour to be disabled for legitimate use cases. I'm all for blocking abuse of cookies, but please consider all users (and use cases) of the web 'platform' when making changes such as this.

bvlgn commented 6 years ago

John,

we believe there should be a way for third-parties to request access to their first-party data when they are restricted

We believe that too, but it should be in such a way that it handles all legitimate use cases, e.g. the ones I mentioned. Those don’t and shouldn’t/cannot have iframe user interaction. If that were possible, then ITP + the proposed API would be an improvement.

svieira commented 6 years ago

Echoing @rstoneIDBS we also have an enterprise situation where we're embedded into other sites. This new set of restrictions will require a far worse end-user experience which the new API does not mitigate.

Background

The "what" is forms, reports, and applications (think Wufoo + Tableau). The user is authenticated via the enterprise's IDP. They may already have a session with our application or they may need to establish one. They may not have visited our application directly.

We've been getting around the third-party cookie setting requirement by asking enterprises with high Safari usage to redirect to our application before the user enters into one of the flows where we are embedded. (Outer app -> us (seamless IDP login, we set our auth cookie) -> Outer app with us embedded).

Problem

We do not render in a cross-domain iframe. Our JS code is loaded into an iframe for isolation, but the iframe's host is the parent domain and the iframe therefore is "first party". The DOM we render is rendered into the parent page (developers consume our apps as web-components). There is no way for us, with this set of APIs, to request access to "our" cookies from this context.

The workaround is:

Continue to require the redirect to allow authentication at the top level (otherwise IDP setups that rely on redirection in their auth workflow will not work at all with these new restrictions)
Create a cross-domain iframe to our application and load an HTTP interpreter application
Make the user click on some element inside the iframe and in response we use the new API to unlock the existing authentication cookie
Hide but do not destroy the iframe
Route all of our requests and responses across parent DOM <-> parent-scoped-iframe <-> cross-domain iframe via postMessage.
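A minimal sketch of what the cross-domain iframe side of such a relay could look like, assuming a message shape of `{ id, url, options }` and a reply shape of `{ id, ok, status, body }`. The function name, the message format, and the origins are all assumptions made for illustration, not the commenter's actual code.

```javascript
// Sketch of the cross-domain iframe side of the relay; names and message shape are hypothetical.
function createRelayHandler(allowedParentOrigin, fetchFn) {
  return async function onMessage(event) {
    // Ignore messages from anyone other than the expected parent page.
    if (event.origin !== allowedParentOrigin) return;
    const { id, url, options } = event.data;
    try {
      // credentials: 'include' asks the browser to attach the cookies this iframe can access.
      const response = await fetchFn(url, { ...options, credentials: 'include' });
      const body = await response.text();
      event.source.postMessage({ id, ok: response.ok, status: response.status, body }, event.origin);
    } catch (err) {
      // Report network failures back to the parent under the same request id.
      event.source.postMessage({ id, ok: false, status: 0, body: String(err) }, event.origin);
    }
  };
}

// Assumed wiring inside the iframe:
// window.addEventListener('message', createRelayHandler('https://parent.example', fetch));
```

The `id` field lets the parent match replies to outstanding requests, since postMessage itself has no request/response pairing.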

From the user's perspective, as they move across screens in the parent application they have to randomly click a button to show the screen they just navigated to. How often the button shows up depends entirely on how much of the parent application is an SPA vs. how often it transitions pages via full-page navigation and where in those flows our app is injected.

Just adding another special case for your consideration. I do not have an API proposal (at this time) that cannot be abused by third parties.

rstoneIDBS commented 6 years ago

Our use case is similar to @svieira, our hosted application can be extended by our customers (via IFRAMEs) so the customer code is hosted on separate servers. We support multiple extension points and it is entirely feasible that these could be on different domains as some of our customers farm the extension code out to 3rd parties - thus we cannot sensibly (on start up) visit each domain to implement the 3rd party cookie 'hack' (i.e. the approach Auth0 use). In addition to this our applications are hosted on separate domains and our integration approach for these various applications is again via IFRAMEs, so our own applications also suffer from the 3rd party cookie problem.

jackfrankland commented 6 years ago

@svieira @rstoneIDBS It would be great if WebKit provided more comprehensive documentation surrounding this. However, https://webkit.org/blog/8124/introducing-storage-access-api/ and the newer https://webkit.org/blog/8311/intelligent-tracking-prevention-2-0/ seem to provide more information on how Safari will use this API. Reading these, it seems that a successful request will last for 30 days (I haven't tested this yet; as I said, it would be great for WebKit to provide a single source of truth). You may still need to have a user-clickable 3rd party iframe for the initial request, but not have to do this on every visit. Maybe the 3rd party iframe could be hidden at first to check whether it already has storage access?

The ITP 2.0 blog post also mentions that third-party cookies can be set and read, but partitioned to the first party. If it's possible to authenticate within the embeds, then there won't be a requirement to visit your app in the first-party context anymore, or perhaps no requirement to use the Storage Access API.

What I would like to now know is if a user permits storage access to a domain once, will this apply to all sites that use that domain in a third party context? I'm guessing so, because how else would OAuth in a popup window work?

Has anyone done further testing of this API with ITP 1.1 and ITP 2.0? Am I reading things completely wrong?

wilfrem commented 6 years ago

Our use case is a sandbox for user-uploaded HTML5 games. This proposal either forces a bad experience on end users or breaks user-uploaded HTML5 games.

Background

We run an HTML5 game posting service, similar to YouTube. Any user can upload an HTML5 game (without any modification), and any user can play a game just by opening a link (no dialog or user gesture needed).

One of the user-uploaded games: https://game.nicovideo.jp/atsumaru/games/gm3584

Here are the things related to this proposal:

For security reasons (same-origin policy), we serve user-uploaded game resources from https://html5.nicogame.jp/games/[gameId]/[version]/index.html
For user experience and site functions, we use nested iframes. top: https://game.nicovideo.jp, child: https://html5.nicogame.jp/core/player/index.html, grandchild: https://html5.nicogame.jp/games/[gameid]/[version]/index.html
For access control to game resources, we use third-party cookies to send a token to the user-uploaded game resource server (the token is passed via postMessage).
Many user-uploaded games use localStorage or IndexedDB to save game data.
Our site modifies localStorage for its cloud save system.
Our site will have an embed player like the YouTube embed player.

Problem

Games cannot boot until user interaction because this proposal blocks access to storage until user interaction (a user experience problem).
Games cannot call requestStorageAccess because the game's parent frame is not the top frame.
There is no way to call requestStorageAccess in the embed player situation (top: some other site, child: https://game.nicovideo.jp, grandchild and great-grandchild: https://html5.nicogame.jp).
It is necessary to update all games because of this proposal.

We do not use third-party cookies for tracking; we use them for a sandbox environment. To solve this problem, should we have a method to let the browser know that these two are different domains but not Third-party?(e.g. DNS information, manifest.json, or third-level domain is same?)

jokeyrhyme commented 6 years ago

@wilfrem

It is necessary to update all games because of this proposal.

This "proposal" is already implemented in Safari for macOS and iOS

Unless you can ignore iOS Safari from your user base (and I assume almost none of us can), then you probably will need to update all the games to align with how ITP 2.0 in Safari works, anyhow

To solve this problem, should we have a method to let the browser know that these two are different domains but not Third-party?(e.g. DNS information, manifest.json, or third-level domain is same?)

I really like this idea, and I think some proprietary systems (Google maybe?) expect this via DNS records currently, but a manifest.json could be a useful alternative, too

johnwilander commented 6 years ago

If it’s just your domain A embedding your domain B in iframes to achieve SOP isolation, ITP will not classify domain B as having tracking abilities and cookies will work as in pre-ITP Safari, i.e. you don’t need to request storage access.

johnwilander commented 6 years ago

Mozilla announced two days ago that they’re implementing the Storage Access API, with some interesting differences from WebKit’s implementation: https://groups.google.com/forum/m/#!msg/mozilla.dev.platform/l8bV4RFgAc4/MKl9jbJpBQAJ

wilfrem commented 6 years ago

@johnwilander

Thank you for your quick response. I'm worried about the ITP method of determining whether a domain has tracking ability. Are there any guidelines or advice to prevent false-positive detection? That would be helpful for any developer using an iframe as a sandbox.

annevk commented 6 years ago

Given that two browsers now want to ship this API in some form it prolly warrants being standardized. Unfortunately, since it's all still rather experimental there needs to be quite some "implementation-defined"-leeway.

rstoneIDBS commented 6 years ago

If this starts to become a standardized behaviour can I please ask the browser vendors to consider different user profiles. This proposal is aimed at preventing 'ad tracking' for internet consumers but unfortunately breaks the only reliable (IFRAME) mechanism used by enterprise cloud aware applications that need to support 3rd party integrations. Allowing enterprise users to disable this feature on a per domain basis (as Chrome currently does) is a much better solution than forcing us to tell our users to disable 3rd party cookie blocking entirely.

kushal commented 6 years ago

@johnwilander

If this is becoming a standard, I'd love to see some thinking about how a slightly larger scope could cover more cases.

As is, StorageAccess faces two hurdles in replacing many current uses of third-party cookies

Permission prompts would be too untargeted and interruptive
Requiring user interaction makes it hard to deliver value on page load

I think there's a small addition that would mitigate both problems while preserving privacy goals: a concept of "storage is present".

Let's start with the reason StorageAccess feels insufficient as is.

Embedded video or music players that want to save plays to a user's history or take advantage of any subscriptions are encouraged to prompt for StorageAccess when the user hits play. However, this is pretty interruptive (prompting them when they hit play is likely to be perceived as annoying), especially since it is untargeted (prompting all users to find only the subset with subscriptions).
Sites that offer personalization and subscriptions across multiple domains (Gizmodo, Medium, Scroll), or embedded comment services that might rerank or maintain state based on a user account really want to deliver that value on page load. The reactive nature of StorageAccess (only on user action) is not a great fit.

What if we expose a single bit of information: whether storage exists at all?

With this change, all requests to a given third-party domain (iframe or XHR) where cookies are present but elided would include a Cookies-present HTTP header, and user agents would also provide a document.isStoragePresent() function inside of iframes.

An embed could check if storage data is present, and show a call-to-action requesting access only if there is. These calls-to-action could be significantly more prominent because they are targeted. They would provide a path for requesting access in cases where there is no clear user action to tie into (like pageload), and they would provide a path to request access without interrupting some critical flow (like play).

For example, an embed could check !document.hasStorageAccess() && document.isStoragePresent() or make an XHR call to a domain it uses for tracking subscription state, and if true show a button reading "Let your FooPlayer subscription work on this site", and then when the user clicks, it could request StorageAccess.
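Spelled out as code, that check might look like the sketch below. Note that `isStoragePresent()` is only the function proposed in this comment and does not exist in any browser; the helper name `shouldShowAccessPrompt` is likewise an assumption. Because hasStorageAccess() returns a promise, its result has to be awaited before it can be negated.

```javascript
// Sketch only: isStoragePresent() is the hypothetical API proposed above.
async function shouldShowAccessPrompt(doc) {
  // A promise is always truthy, so await the result before testing it.
  const hasAccess = await doc.hasStorageAccess();
  if (hasAccess) {
    return false; // nothing to ask for
  }
  // `await` works whether the proposed function returns a boolean or a promise.
  return Boolean(await doc.isStoragePresent());
}

// Assumed usage in the embed:
// if (await shouldShowAccessPrompt(document)) { /* show the subscription button */ }
```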

Fingerprinting is a concern, but easily mitigated

Although by itself this proposal exposes only a single binary bit that is not vulnerable to tracking, a challenge is avoiding HSTS supercookie-style attacks by using several of these bits in concert. Implementations could choose to police this through heuristics, or they could require permission, or use a mix of the two.

For example, when a user logs in at foosubscription.com, the site might call document.permitStoragePresent() which prompts the user "Allow sites to know that you're logged in to foosubscription.com as you browse the web?" The permission could even require periodic reconfirmation from the user. This permission system would require the user to trust and grant this permission across many domains to allow for fingerprinting, which is quite unlikely.

ehsan commented 6 years ago

Hi @johnwilander,

I've implemented this API in Gecko. I have three questions for you for now:

a) Were you planning on writing a description of the algorithm for Document.hasStorageAccess() as well? For the Gecko implementation I had to reverse engineer what WebKit does, which wasn't too much work, but it seems like we'd need to specify that algorithm too.

b) What plans, if any, do you have with regards to web-platform-tests for this feature? I see that WebKit has some WebKit-specific layout tests, and we also have our own tests for the feature. On our side, in order for us to be able to run tests for this feature in web-platform-tests, we'd need to be able to simulate user gestures, as well as call some Gecko-specific test infrastructure setup/teardown functions for each test to prepare the storage access API to be called from origin https://foo.example. I recently learned that web-platform-tests now are able to do all of this (please see the discussion in https://groups.google.com/d/msg/mozilla.dev.platform/l8bV4RFgAc4/lrLvVK1xBQAJ), so I would be pretty interested in collaborating on an effort to get some tests upstreamed to wpt. (Not sure if WebKit runs wpt tests in CI these days or not, but we do.) One thing to figure out, of course, is the behavior differences across Gecko and WebKit as far as the tests go (e.g. Gecko provides access to the full cookie jar rather than just to cookies). Our process for importing wpt tests into our test infrastructure is capable of handling tests that do not pass in Gecko for one reason or another, so that shouldn't be a big issue for us, hopefully.

c) Just a tiny nit on the IDL in the first comment of the issue, per https://heycam.github.io/webidl/#idl-boolean, hasStorageAccess() should be declared to return Promise<boolean>. :-)

Thanks!

johnwilander commented 6 years ago

Splendid news, Ehsan!

a) Let’s fix that.

b) Let’s cooperate on platform tests. I’ll have to look into user gestures. I recall at least one other “by-user-activation” sandbox attribute. Maybe it has platform tests? Are your tests similar to ours except for your broader scope? (When you say “full cookie jar,” I assume you mean other kinds of storage than cookies, such as LocalStorage.)

c) Sure, let’s fix that as part of a).

We’ve also made two recent updates based on developer feedback:

1) We now persist storage access through same-site navigations of the iframe.

2) We now maintain the user gesture status if the document.requestStorageAccess() promise resolves. This allows the iframe to check for cookies and do a popup if the user turns out to not be logged in.
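A sketch of the flow that update 2) enables: request access on click, then check for a login cookie and open a popup if it is missing. The function name, the `session` cookie name, and the `doc`/`win` parameters (standing in for `document` and `window`) are assumptions for this example.

```javascript
// Sketch only: `onEmbedClick` and the `session` cookie name are hypothetical.
async function onEmbedClick(doc, win, loginUrl) {
  try {
    await doc.requestStorageAccess();
  } catch (err) {
    return 'denied'; // access was not granted
  }
  // Per update 2), the user gesture is maintained after the promise resolves,
  // so opening a popup here should not be blocked.
  const loggedIn = /(^|;\s*)session=/.test(doc.cookie);
  if (!loggedIn) {
    win.open(loginUrl, '_blank');
    return 'login-popup';
  }
  return 'authenticated';
}

// Assumed usage: button.addEventListener('click', () => onEmbedClick(document, window, loginUrl));
```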

michael-oneill commented 6 years ago

What would be good is a way to pass information on to the user prompt, as well as a revokeStorageAccess() that returns the access state to what it was before requestStorageAccess() resolved. This is important in Europe, where the GDPR calls for it to be as easy to withdraw consent as to give it. Information could be provided through a dictionary parameter, e.g.

requestStorageAccess({
    name: 'BigCo Inc',
    purpose: {
        category: 'audience measurement',
        description: 'A longer description of how personal data collected is used and for what purpose'
    },
    maxAge: 1000 * 60 * 60 * 24 * 30
});

The maxAge sets a duration for the permission; UAs could allow users to set overall limits.

revokeStorageAccess();

GDPR ref:

The data subject shall have the right to withdraw his or her consent at any time. The withdrawal of consent shall not affect the lawfulness of processing based on consent before its withdrawal. Prior to giving consent, the data subject shall be informed thereof. It shall be as easy to withdraw as to give consent. Article 7.3

michael-oneill commented 6 years ago

Also, following from my comment of 28/3/2018, it would be useful to have a storage-access feature: Feature-Policy: storage-access example.com. A site could request that all embeddees use partitioned storage except for an allow-list. The allow-list would still cause a user request prompt if not granted, but at page load. The information provided would come from a /.well-known JSON resource, e.g. /.well-known/dnt, or an Origin Manifest when we have one.

ojame commented 6 years ago

I know this has already been rolled out in Safari, and it's in progress in Gecko (so I'm late to the party), however we have a use-case that this fundamentally breaks:

We have an application[0] that sets a third-party cookie with your authenticated details
Our customers embed our widget on their website with a snippet of html, which calls a JavaScript file
If the JS file was requested with the authentication cookie, it sends them back the widget code, if not, it sends them back nothing
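The server-side half of this flow can be sketched as below. This is an illustration only, not BugHerd's code: the function name, the `auth` cookie name, and the returned strings are all assumptions.

```javascript
// Sketch of the conditional script response; the cookie name `auth` is hypothetical.
function widgetResponseFor(cookieHeader) {
  // Split the Cookie request header into individual "name=value" pairs.
  const cookies = (cookieHeader || '').split(';').map((c) => c.trim());
  const authenticated = cookies.some(
    (c) => c.startsWith('auth=') && c.length > 'auth='.length
  );
  // Authenticated requests get the widget code; everyone else gets an empty script.
  return authenticated ? '/* widget code */' : '';
}
```

When the browser withholds third-party cookies, the Cookie header never arrives with the script request, so this function takes the empty branch for every user.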

So basically we conditionally render our widget based on if the user was logged into our main application or not. An important note: Our widget is embedded on our customers production websites, however it doesn't render for the vast majority of users, because they're not authenticated with our main application. A large majority of our customers also embed our widget on multiple websites with different domains.

From my understanding (reading this thread and documentation) this flow is no longer possible with this API because of the need for interaction. The only flow that would work, as far as I can tell:

render an iframe to everyone with a call to action to render our widget
when/if a user clicks the CTA, request permission via this API and then conditionally render the widget

This has its obvious drawbacks, the biggest one being our tool is (generally) intended for internal use (yes, even on production!) but yet to get it displaying at all, we need to render at least something to everyone (whether it's a CTA that requests permissions, or a basic authentication dialogue).

I understand because of the way our application interacts with our widget (third-party cookies), ITP picks this up as essentially being a tracker, but at our core we are not, and we are a genuinely trusted product for our consumers. There is also no way with this API for users of Safari to tell their browser "I understand the side-effects but I trust [domain] indefinitely until I don't, across all websites".

I would love, if someone has faced similar issues or understands where I'm coming from, to open a discussion around this.

[0]: BugHerd https://bugherd.com

rstoneIDBS commented 6 years ago

@ojame - you aren't the only one, see my comments earlier in the thread. Unfortunately I don't think our opinions carry any weight - there doesn't seem to be a will to consider enterprise applications/users at all. Like you, I can't see why all the browser vendors can't just support a white list (à la Chrome) for those use cases where a workaround (with a frankly horrible UX) won't be sufficient.

michael-oneill commented 6 years ago

Hi James,

Is the authentication cookie created by a top-level (first-party) site, i.e. when the user logs in?

Could you ask for informed consent (for the widget showing up on other sites) there when they do that?

There would probably have to be a record of the information provided so it can be checked by the UA if the user requests it, or to remind them later they had given consent for it.

This could be an example when a first-party invocation of requestStorageAccess makes sense, with some record of purpose declared in a /.well-known/ location or similar.

A revokeStorageAccess call available to the domain would also make sense, i.e. if the authentication cookie is not present then call revokeStorageAccess, though in your case it would just be belt and braces.

Mike

I know this has already been rolled out in Safari, and it's in progress in Gecko (so I'm late to the party), however we have a use-case that this fundamentally breaks:

We have an application[0] that sets a third-party cookie with your authenticated details
Our customers embed our widget on their website with a snippet of html, which calls a JavaScript file
If the JS file was requested with the authentication cookie, it sends them back the widget code, if not, it sends them back nothing

From my understanding (reading this thread and documentation) this flow is no longer possible with this API because of the need for interaction. The only flow that would work, as far as I can tell:

render an iframe to everyone with a call to action to render our widget
when/if a user clicks the CTA, request permission via this API and then conditionally render the widget

This has its obvious drawbacks, the biggest one being our tool is (generally) intended for internal use (yes, even on production!) but yet to get it displaying at all, we need to render at least something to everyone (whether it's a CTA that requests permissions, or a basic authentication dialogue).

I understand because of the way our application interacts with our widget (third-party cookies), ITP picks this up as essentially being a tracker, but at our core we are not, and we are a genuinely trusted product for our consumers. There is also no way with this API for users of Safari to tell their browser "I understand the side-effects but I trust [domain] indefinitely until I don't, across all websites".

I would love, if someone has faced similar issues or understands where I'm coming from, to open a discussion around this.

[0]: BugHerd https://bugherd.com


johnwilander commented 6 years ago

I know this has already been rolled out in Safari, and it's in progress in Gecko (so I'm late to the party), however we have a use-case that this fundamentally breaks:

We have an application[0] that sets a third-party cookie with your authenticated details

That has never been possible with Safari's default cookie policy, starting 15 years ago. Third-parties without pre-existing cookies cannot set cookies in Safari. I assume you mean your site sets a cookie as first-party when the user logs in and then uses that cookie to authenticate the user when your content is third-party.

Our customers embed our widget on their website with a snippet of html, which calls a JavaScript file

I'm a little confused here. Are your logged in users your customers or other websites your customers?

If the JS file was requested with the authentication cookie, it sends them back the widget code, if not, it sends them back nothing

So basically we conditionally render our widget based on if the user was logged into our main application or not. An important note: Our widget is embedded on our customers production websites, however it doesn't render for the vast majority of users, because they're not authenticated with our main application. A large majority of our customers also embed our widget on multiple websites with different domains.

From my understanding (reading this thread and documentation) this flow is no longer possible with this API because of the need for interaction. The only flow that would work, as far as I can tell:

render an iframe to everyone with a call to action to render our widget

You don't have to do that. You can render the iframe conditionally. You only need the iframe if and when you've decided you need to authenticate the user.

when/if a user clicks the CTA, request permission via this API and then conditionally render the widget

This has its obvious drawbacks, the biggest one being our tool is (generally) intended for internal use (yes, even on production!) but yet to get it displaying at all, we need to render at least something to everyone (whether it's a CTA that requests permissions, or a basic authentication dialogue).

I understand because of the way our application interacts with our widget (third-party cookies), ITP picks this up as essentially being a tracker, but at our core we are not, and we are a genuinely trusted product for our consumers. There is also no way with this API for users of Safari to tell their browser "I understand the side-effects but I trust [domain] indefinitely until I don't, across all websites".

We do not believe users in general understand the consequences of web scale tracking capabilities.

I would love, if someone has faced similar issues or understands where I'm coming from, to open a discussion around this.

Whenever we discuss security or privacy issues, we have to take adversaries into account. In the case of cross-site tracking, we have to assume that trackers will immediately abuse any capability to convince the user to opt out globally for their particular domain or any capability to claim that their particular domain is not a tracker to be able to skip privacy protections. Thus, any suggestions for changing how the Storage Access API works has to cover how we avoid abuse.

Further, this thread is about the Storage Access API as proposed standard web functionality. Moving into the space of managed devices is not handled by web standards. Additionally, ITP is not a standard and not proposed as a standard. So we have to limit what we discuss in this thread to ways to get first-party cookie and website data access under a setting, opt-in or default, in any browser, that restricts a third-party from using its first-party cookies and website data.

johnwilander commented 6 years ago

@ojame - you aren't the only one, see my comments earlier in the thread. Unfortunately I don't think our opinions carry any weight - there doesn't seem to be a will to consider enterprise applications/users at all. Like you, I can't see why all the browser vendors can't just support a white list (ala Chrome) for those use cases where a (frankly horrible UX experience) workaround won't be sufficient.

Is Chrome's whitelist for cookie use by third-parties when the user has opted into only allowing first-party cookies?

As for opinions carrying weight, we take all your comments into consideration. Our primary goal is to protect users' privacy on the web. We're open to any suggestion that makes more legitimate things work while avoiding abuse.

kushal commented 6 years ago

@johnwilander It sounds like @ojame 's needs are similar to the ones I describe above? We would like to show UI only to users who have authentication data rather than to everybody who visits the page (even if the actual data is then only released through a permission dialogue). I'm curious whether you think asking for user permission for this binary logged-in/not-logged-in state could mitigate fingerprinting risk as I describe in my earlier comment?

johnwilander commented 6 years ago

@johnwilander It sounds like @ojame 's needs are similar to the ones I describe above? We would like to show UI only to users who have authentication data rather than to everybody who visits the page (even if the actual data is then only released through a permission dialogue). I'm curious whether you think asking for user permission for this binary logged-in/not-logged-in state could mitigate fingerprinting risk as I describe in my earlier comment?

A way to check state is a proposal we're taking into consideration.

However, "logged-in/not-logged-in" is not something the browser knows. Plenty of sites have cookies set for logged out users and some of those cookies may even be secure and HttpOnly. Allowing third-parties to ask for the existence of specific cookies is an obvious cross-site tracking vector so the only thing on the table is a binary "has cookies/has no cookies." That would mean that sites who want to leverage such an API would have to expire all their cookies at once when the user actively logs out or is timed out. That's how I interpret your request.
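To illustrate why a binary "has cookies/has no cookies" signal only matches login state when a site expires all of its cookies at logout, here is a minimal TypeScript sketch. The `CookieJar` class, its method names, and the cookie names are hypothetical and invented for this illustration; they are not part of any browser API.

```typescript
// Hypothetical in-memory model of one site's cookies; not a browser API.
class CookieJar {
  private cookies = new Map<string, string>();

  set(name: string, value: string): void {
    this.cookies.set(name, value);
  }

  delete(name: string): void {
    this.cookies.delete(name);
  }

  clear(): void {
    this.cookies.clear();
  }

  // The only question the discussed signal would answer: any cookies at all?
  hasCookies(): boolean {
    return this.cookies.size > 0;
  }
}

const jar = new CookieJar();
jar.set("session", "opaque-session-id"); // set at login
jar.set("theme", "dark"); // unrelated preference cookie

// Logging out by removing only the session cookie leaves the signal "true".
jar.delete("session");
const signalAfterPartialLogout = jar.hasCookies();

// Expiring everything at logout makes the signal match the logged-out state.
jar.clear();
const signalAfterFullLogout = jar.hasCookies();
```

In this model a leftover preference cookie is enough to make a logged-out user look like a user with state, which is why the binary signal would require sites to clear everything.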

And to be clear, we worry about trackers setting up multiple domains and loading one iframe each from those domains to read out one bit at a time with the proposed "has cookies/has no cookies" API. With those bits they will be able to read a cross-site tracking ID and that we will not allow.

ojame commented 6 years ago

@michael-oneill

Is the authenticated cookie created by a top level (first-party) site, i.e. when the user logs in? Could you ask for informed consent (for the widget showing up on other sites) there when they do that?

Yes, it is, and yes, that sounds like something reasonable. It provides better context, too.

@johnwilander

I assume you mean your site sets a cookie as first-party when the user logs in and then uses that cookie to authenticate the user when your content is third-party.

Yes, sorry for my terminology confusion.

I'm a little confused here. Are your logged in users your customers or other websites your customers?

The authenticated users are our customers (in the context of they're authenticated with our application).

You don't have to do that. You can render the iframe conditionally. You only need the iframe if and when you've decided you need to authenticate the user.

I don't understand how we can render the iframe conditionally. It appears via this api, in order to gain access to our authentication cookie (which we can base our conditional render on), we first need to prompt the user for permission, which means we need to display an initial iframe providing some call to action the user can interact with. That initial iframe needs to be shown to everyone then, because at that point, there's no way to tell if the user has an authenticated cookie or not?
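The flow described here can be sketched roughly as follows in TypeScript. `hasStorageAccess()` and `requestStorageAccess()` are the two methods the Storage Access API adds to `document`; the `StorageAccessDocument` interface, the `WidgetState` type, and the function names are assumptions made so the sketch can run outside a browser. Having storage access only means the embed's requests can carry its cookies; whether the user is actually authenticated is still decided by the server.

```typescript
// Minimal subset of the Storage Access API surface used below. In a browser,
// `document` provides these two methods; the interface exists so the sketch
// can also run against a test double.
interface StorageAccessDocument {
  hasStorageAccess(): Promise<boolean>;
  requestStorageAccess(): Promise<void>;
}

// Hypothetical UI states for the embedded iframe.
type WidgetState = "widget" | "call-to-action";

// Called when the third-party iframe loads.
async function initialState(doc: StorageAccessDocument): Promise<WidgetState> {
  return (await doc.hasStorageAccess()) ? "widget" : "call-to-action";
}

// Called from the click handler of the call to action (a user gesture).
async function onCallToActionClick(doc: StorageAccessDocument): Promise<WidgetState> {
  try {
    await doc.requestStorageAccess(); // rejects if access is not granted
    return "widget";
  } catch {
    return "call-to-action";
  }
}
```

In a browser this would be invoked as `initialState(document)` on load and `onCallToActionClick(document)` inside the button's click listener.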

Thus, any suggestions for changing how the Storage Access API works has to cover how we avoid abuse.

I'm not in any disagreement here. I love the philosophy of this API and I understand that what we do as a product is quite similar to how trackers work, which makes this discussion challenging!

A way to check state is a proposal we're taking into consideration.

Is there any documentation or link I can check out that goes through this proposal? This might meet our needs.

Thanks for being open to discussion 👍

kushal commented 6 years ago

@johnwilander

A way to check state is a proposal we're taking into consideration.

Awesome, really appreciate that.

However, "logged-in/not-logged-in" is not something the browser knows. Plenty of sites have cookies set for logged out users and some of those cookies may even be secure and HttpOnly. Allowing third-parties to ask for the existence of specific cookies is an obvious cross-site tracking vector so the only thing on the table is a binary "has cookies/has no cookies." That would mean that sites who want to leverage such an API would have to expire all their cookies at once when the user actively logs out or is timed out. That's how I interpret your request.

Definitely. I was imagining many sites, including ours, have some subdomain where cookies should only be present if and only if the user is logged in, i.e. connect.foo.com vs www.foo.com. The browser could still limit the ability to peek at this state to one subdomain per TLD. i.e.

user logs in at www.foo.com
site redirects to page on connect.foo.com that sets a cookie and requests permission from the user to access login state across the web using document.permitStoragePresent()
if donuts.foo.com later tries to request the same permission, it is automatically rejected
if a user logs out of www.foo.com, their connect.foo.com cookie is deleted

And to be clear, we worry about trackers setting up multiple domains and loading one iframe each from those domains to read out one bit at a time with the proposed "has cookies/has no cookies" API. With those bits they will be able to read a cross-site tracking ID and that we will not allow.

I appreciate this. Our hope is that if a domain has to request permission through document.permitStoragePresent() and has to renew that permission every 30 days, that very substantially limits the ability for one malicious actor to get a single user to authorize many bits of information.

In addition, it's possible to imagine various browser heuristics about how many times this is used per page, or how many iframes are opened per script, or patterns of collaboration between domains, but the core permission seems like it would go a long way.

johnwilander commented 6 years ago

@johnwilander

A way to check state is a proposal we're taking into consideration.

Awesome, really appreciate that.

However, "logged-in/not-logged-in" is not something the browser knows. Plenty of sites have cookies set for logged out users and some of those cookies may even be secure and HttpOnly. Allowing third-parties to ask for the existence of specific cookies is an obvious cross-site tracking vector so the only thing on the table is a binary "has cookies/has no cookies." That would mean that sites who want to leverage such an API would have to expire all their cookies at once when the user actively logs out or is timed out. That's how I interpret your request.

Definitely. I was imagining many sites, including ours, have some subdomain where cookies should only be present if and only if the user is logged in, i.e. connect.foo.com vs www.foo.com. The browser could still limit the ability to peek at this state to one subdomain per TLD. i.e.

The majority of authentication cookies I see are set for the site's eTLD+1. They may very well be set on sub.domain.example, but be set for domain.example. For instance, I just tried logging into google.com which happens on accounts.google.com but sets login cookies for .google.com.

Are you saying you often see sites setting login cookies for a specific subdomain so as to not have them be sent in requests to their eTLD+1?

And if we allow subdomain-specific API calls, wouldn't trackers be able to:

Set up bit1.domain.example, bit2.domain.example, bit3.domain.example, … bit32.domain.example.
Create 32-bit tracking ID for a user and create a cookie for the subdomains that correspond to the '1's in the bit string of the tracking ID.
Open 32 invisible third-party iframes and call the proposed API to read out the '1's and '0's.
PostMessage the bits to a master iframe and resurrect the tracking ID.

Is such abuse what you're gating with the additional document.permitStoragePresent() you mention below?
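The bit-by-bit readout described in the steps above can be simulated in a few lines of TypeScript. This is a model only: the `Set` of hostnames stands in for "which subdomains currently hold a cookie", and `hasCookies` stands in for the proposed per-host check. All function names are hypothetical.

```typescript
// Simulation only: models which hypothetical subdomains hold a cookie.
const BITS = 32;

// Tracker side: set a cookie on bitN.domain.example for every '1' bit.
function encodeId(id: number): Set<string> {
  const hostsWithCookies = new Set<string>();
  for (let bit = 0; bit < BITS; bit++) {
    if (((id >>> bit) & 1) === 1) {
      hostsWithCookies.add(`bit${bit + 1}.domain.example`);
    }
  }
  return hostsWithCookies;
}

// Stand-in for the proposed per-host "has cookies / has no cookies" check.
function hasCookies(hostsWithCookies: Set<string>, host: string): boolean {
  return hostsWithCookies.has(host);
}

// Embedded side: one iframe per subdomain reports one bit; combine them.
function decodeId(hostsWithCookies: Set<string>): number {
  let id = 0;
  for (let bit = 0; bit < BITS; bit++) {
    if (hasCookies(hostsWithCookies, `bit${bit + 1}.domain.example`)) {
      id = (id | (1 << bit)) >>> 0; // keep the value as an unsigned 32-bit int
    }
  }
  return id;
}

const trackingId = 0xdeadbeef;
const recovered = decodeId(encodeId(trackingId));
```

Here `recovered` equals `trackingId`, which shows that 32 one-bit answers are enough to rebuild a 32-bit identifier.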

user logs in at www.foo.com
site redirects to page on connect.foo.com that sets a cookie and requests permission from the user to access login state across the web using document.permitStoragePresent()
if donuts.foo.com later tries to request the same permission, it is automatically rejected
if a user logs out of www.foo.com, their connect.foo.com cookie is deleted

And to be clear, we worry about trackers setting up multiple domains and loading one iframe each from those domains to read out one bit at a time with the proposed "has cookies/has no cookies" API. With those bits they will be able to read a cross-site tracking ID and that we will not allow.

I appreciate this. Our hope is that if a domain has to request permission through document.permitStoragePresent() and has to renew that permission every 30 days, that very substantially limits the ability for one malicious actor to get a single user to authorize many bits of information.

This is starting to sound like a complicated API. Let's see if I follow what you're saying:

The site that wants to enable authenticated embeds logs the user in on connect.domain.example.
As part of the login, a cookie is set specifically for connect.domain.example. No other cookies are ever set for this domain and it is deleted as soon as the user logs out.
After successful login, the page on connect.domain.example calls document.permitStoragePresent() which prompts the user with some UI that explains what "Allow" and "Don't allow" implies for them.
If the user taps/clicks "Allow," now third-party iframes from connect.domain.example can call document.isStoragePresent() and get a resolved promise if there exist cookies for connect.domain.example.

Is this correct?

In addition, it's possible to imagine various browser heuristics about how many times this is used per page, or how many iframes are opened per script, or patterns of collaboration between domains, but the core permission seems like it would go a long way.

We try to avoid heuristics if we can since they are particularly hard for developers to test their sites under.

kushal commented 6 years ago

@johnwilander

Are you saying you often see sites setting login cookies for a specific subdomain so as to not have them be sent in requests to their eTLD+1?

Yeah, I've seen this occasionally, but I guess it's definitely not universal. I think I was also imagining that some sites could add a subdomain specifically for use with this API. However, this conversation is making me realize there's probably a simpler option, see below. :)

Set up bit1.domain.example, bit2.domain.example, bit3.domain.example, … bit32.domain.example.
Create 32-bit tracking ID for a user and create a cookie for the subdomains that correspond to the '1's in the bit string of the tracking ID.
Open 32 invisible third-party iframes and call the proposed API to read out the '1's and '0's.
PostMessage the bits to a master iframe and resurrect the tracking ID.

Is such abuse what you're gating with the additional document.permitStoragePresent() you mention below?

This is where I imagined that you would limit use of permitStoragePresent to once per eTLD+1, regardless of which subdomain it's used on, so that once bit1.domain.example is enabled, bit2.domain.example would just be rejected automatically since the slot for domain.example is already taken.

But an equal part of the value of permitStoragePresent would be to prevent many collaborating eTLD+1s, i.e. bit1domain.example, bit2domain.example, etc., because a user would need to give that permission N times and also keep refreshing those permissions every 30 days, and especially if permitStoragePresent only could be called on user interaction.
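A minimal TypeScript sketch of the "one slot per eTLD+1" rule follows. The `PresencePermissions` class and `naiveRegistrableDomain` helper are hypothetical. The helper simply takes the last two labels of a hostname, which is an assumption that only holds for simple names like the ones in this thread; a real implementation would need the Public Suffix List.

```typescript
// Naive registrable-domain helper: takes the last two labels. A real
// implementation would consult the Public Suffix List; this is an assumption
// that only holds for simple names such as "bit1.domain.example".
function naiveRegistrableDomain(host: string): string {
  return host.split(".").slice(-2).join(".");
}

// Hypothetical permission store: at most one host per registrable domain.
class PresencePermissions {
  private slotOwner = new Map<string, string>();

  // Returns true if `host` holds the slot after the call.
  permit(host: string): boolean {
    const site = naiveRegistrableDomain(host);
    const owner = this.slotOwner.get(site);
    if (owner === undefined) {
      this.slotOwner.set(site, host);
      return true;
    }
    return owner === host;
  }
}

const permissions = new PresencePermissions();
const first = permissions.permit("bit1.domain.example"); // slot is free
const second = permissions.permit("bit2.domain.example"); // slot already taken
const other = permissions.permit("bit1domain.example"); // a different eTLD+1
```

The last call succeeds because `bit1domain.example` is its own eTLD+1, so in this model collaborating eTLD+1s are limited only by the separate user permission each one has to obtain.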

This is starting to sound like a complicated API. Let's see if I follow what you're saying:

The site that wants to enable authenticated embeds logs the user in on connect.domain.example.
As part of the login, a cookie is set specifically for connect.domain.example. No other cookies are ever set for this domain and it is deleted as soon as the user logs out.
After successful login, the page on connect.domain.example calls document.permitStoragePresent() which prompts the user with some UI that explains what "Allow" and "Don't allow" implies for them.
If the user taps/clicks "Allow," now third-party iframes from connect.domain.example can call document.isStoragePresent() and get a resolved promise if there exist cookies for connect.domain.example.

Yup, that's the proposal. With one variant: maybe domains with storage present would also append an HTTP header to CORS requests so that some logic could happen even before building an iframe.

But your point is well taken that this is pretty complicated. This made me wonder if you could simplify and detach permitStoragePresent() entirely from cookies themselves (although it might need a different name?). In this model, there's no need to fuss around with a special authentication subdomain. Instead, each eTLD+1 can call document.permitStoragePresent from any subdomain or from the eTLD+1 itself, and that just marks a bit for that eTLD+1 in the browser. Then, any iframe from the same eTLD+1 can call document.isStoragePresent to find out if that bit is present, which implies true. Then if a user logs out, the browser calls document.removeStoragePresent(), or the bit is automatically cleared after 30 days or if the user clears cookies. This is effectively a single-bit cross-domain cookie, one per eTLD+1, that can be set and unset using a simple API that captures explicit user permission. I think this is much easier to wrap your head around. It would effectively be document.advertiseLoggedIn() / document.isLoggedIn().
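The simplified model (one bit per eTLD+1, cleared on logout or after 30 days) could be sketched like this in TypeScript. `PresenceBitStore` is a hypothetical in-memory stand-in for browser state, and its methods only mirror the method names floated in this thread; none of them exist in any browser. Timestamps are passed in explicitly so the expiry logic is easy to follow.

```typescript
const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// Hypothetical browser-side store: one "storage present" bit per eTLD+1,
// keyed here by a site string the caller has already derived.
class PresenceBitStore {
  private expiresAt = new Map<string, number>();

  // Models document.permitStoragePresent() after the user chose "Allow".
  permit(site: string, nowMs: number): void {
    this.expiresAt.set(site, nowMs + THIRTY_DAYS_MS);
  }

  // Models document.removeStoragePresent(), for example on logout.
  remove(site: string): void {
    this.expiresAt.delete(site);
  }

  // Models document.isStoragePresent() called from a third-party iframe.
  isPresent(site: string, nowMs: number): boolean {
    const expiry = this.expiresAt.get(site);
    return expiry !== undefined && nowMs < expiry;
  }
}

const DAY_MS = 24 * 60 * 60 * 1000;
const store = new PresenceBitStore();
store.permit("foo.com", 0);
const onDay29 = store.isPresent("foo.com", 29 * DAY_MS); // within 30 days
const onDay31 = store.isPresent("foo.com", 31 * DAY_MS); // bit has expired
store.permit("foo.com", 31 * DAY_MS); // user renews the permission
store.remove("foo.com"); // user logs out
const afterLogout = store.isPresent("foo.com", 32 * DAY_MS);
```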

johnwilander commented 6 years ago

But your point is well taken that this is pretty complicated. This made me wonder if you could simplify and detach permitStoragePresent() entirely from cookies themselves (although it might need a different name?). In this model, there's no need to fuss around with a special authentication subdomain. Instead, each eTLD+1 can call document.permitStoragePresent from any subdomain or from the eTLD+1 itself, and that just marks a bit for that eTLD+1 in the browser. Then, any iframe from the same eTLD+1 can call document.isStoragePresent to find out if that bit is present, which implies true. Then if a user logs out, the browser calls document.removeStoragePresent(), or the bit is automatically cleared after 30 days or if the user clears cookies. This is effectively a single-bit cross-domain cookie, one per eTLD+1, that can be set and unset using a simple API that captures explicit user permission. I think this is much easier to wrap your head around. It would effectively be document.advertiseLoggedIn() / document.isLoggedIn().

At this point I think we should back out a little and look at what we're trying to solve.

A better solution here should simply be something like navigator.setLoggedIn(boolean, [optional sameOrigin], [optional timeToLive]) which by default sets logged in state for the eTLD+1 (sameSite) and allows for optional restriction to only cover the specific domain and an optional timeout.

This navigator.setLoggedIn() API would only be callable when the eTLD+1 is first-party site and the document from where the call is made has received a user gesture. That way, we'd have an accurate state instead of building something off of cookie state.

Then we could match that with the ability to ask navigator.isLoggedIn() in third-party iframes.

We would have to think through possible abuse scenarios but at least the above is simple enough for developers to use.

[Edit: Corrected a .loggedIn() to .setLoggedIn() for consistency.]
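As a rough illustration of the semantics sketched above, here is an in-memory TypeScript model. Everything in it is hypothetical: the `LoggedInState` class, the `CallContext` shape, the option names, and the explicit `nowMs` parameter are assumptions for the sake of a runnable example, and no browser implements this API. The `siteOf` helper takes the last two labels of a hostname as a stand-in for eTLD+1, which would need the Public Suffix List in practice.

```typescript
// Hypothetical shape of the API floated above; illustrative names only.
interface SetLoggedInOptions {
  sameOrigin?: boolean; // restrict the state to the calling host only
  timeToLiveMs?: number; // optional expiry
}

interface CallContext {
  host: string; // host of the calling document
  isFirstParty: boolean; // is the document's site the top-level site?
  hasUserGesture: boolean; // has the document received a user gesture?
}

// Naive eTLD+1: last two labels (assumption; real code needs the Public Suffix List).
const siteOf = (host: string): string => host.split(".").slice(-2).join(".");

class LoggedInState {
  private expiry = new Map<string, number>(); // key -> expiry timestamp

  // Returns false (state unchanged) unless called first-party with a gesture.
  setLoggedIn(ctx: CallContext, loggedIn: boolean, nowMs: number, opts: SetLoggedInOptions = {}): boolean {
    if (!ctx.isFirstParty || !ctx.hasUserGesture) return false;
    const hostKey = `host:${ctx.host}`;
    const siteKey = `site:${siteOf(ctx.host)}`;
    if (!loggedIn) {
      this.expiry.delete(hostKey);
      this.expiry.delete(siteKey);
      return true;
    }
    const until = opts.timeToLiveMs === undefined ? Infinity : nowMs + opts.timeToLiveMs;
    this.expiry.set(opts.sameOrigin ? hostKey : siteKey, until);
    return true;
  }

  // Callable from third-party iframes: checks the host, then its eTLD+1.
  isLoggedIn(host: string, nowMs: number): boolean {
    return [`host:${host}`, `site:${siteOf(host)}`].some((key) => {
      const until = this.expiry.get(key);
      return until !== undefined && nowMs < until;
    });
  }
}

const state = new LoggedInState();
const loginPage: CallContext = { host: "www.foo.com", isFirstParty: true, hasUserGesture: true };
state.setLoggedIn(loginPage, true, 0, { timeToLiveMs: 1000 });
const embedSeesLogin = state.isLoggedIn("connect.foo.com", 500);
```

By default the state covers the whole eTLD+1, so an iframe from `connect.foo.com` sees a login performed on `www.foo.com`, while `sameOrigin: true` narrows it to the calling host.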

kushal commented 6 years ago

At this point I think we should back out a little and look at what we're trying to solve.

A better solution here should simply be something like navigator.setLoggedIn(boolean, [optional sameOrigin], [optional timeToLive]) which by default sets logged in state for the eTLD+1 (sameSite) and allows for optional restriction to only cover the specific domain and an optional timeout.

This navigator.setLoggedIn() API would only be callable when the eTLD+1 is first-party site and the document from where the call is made has received a user gesture. That way, we'd have an accurate state instead of building something off of cookie state.

Then we could match that with the ability to ask navigator.isLoggedIn() in third-party iframes.

We would have to think through possible abuse scenarios but at least the above is simple enough for developers to use.

This makes a ton of sense, much clearer! :) And it would definitely address our needs and what I understand of the needs of others. I'm not even sure supporting the ability to toggle sameOrigin is needed, although the flexibility is nice. If there are still abuse concerns, would love to help work through possible mitigations.

michael-oneill commented 6 years ago

I like this also. So I understand, does setLoggedIn replace requestStorageAccess i.e. an embed would no longer be able to get cross-domain cookies other than by calling setLoggedIn when the user actually logs in on a first-party site? Could we also have an optional parameter for a purpose declaration text string (no markup) to be displayed to the user in the UA prompt?

johnwilander commented 6 years ago

I like this also. So I understand, does setLoggedIn replace requestStorageAccess i.e. an embed would no longer be able to get cross-domain cookies other than by calling setLoggedIn when the user actually logs in on a first-party site? Could we also have an optional parameter for a purpose declaration text string (no markup) to be displayed to the user in the UA prompt?

No. Sorry for being vague.

I meant that navigator.setLoggedIn() and navigator.isLoggedIn() would be independent web APIs, not tied to the Storage Access API. Third-party iframes who want to know whether or not it's useful to request storage access, can first call navigator.isLoggedIn().

Down the road, the Storage Access API could be restricted to require the logged in state to be set for it to provide storage access but that's not necessary.
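One way an embed might combine the two, sketched in TypeScript: check the logged-in state when the iframe loads and only render a call to action (which would later call `document.requestStorageAccess()` on click) when that state is set. `isLoggedIn()` is the hypothetical API from this thread, and modelling it as promise-returning is an assumption; the interface and function names are invented for the example.

```typescript
// Hypothetical: the navigator.isLoggedIn() idea from this thread, assumed
// here to return a promise. No browser implements it.
interface LoggedInNavigator {
  isLoggedIn(): Promise<boolean>;
}

type InitialUi = "nothing" | "call-to-action";

// Run when the third-party iframe loads. Users without logged-in state see
// nothing; others get a call to action whose click handler would request
// storage access through the Storage Access API.
async function decideInitialUi(nav: LoggedInNavigator): Promise<InitialUi> {
  return (await nav.isLoggedIn()) ? "call-to-action" : "nothing";
}
```

This would address the case where a widget must not render anything for the vast majority of visitors who have never logged in.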

johnwilander commented 6 years ago

Mozilla has published documentation on this API: https://developer.mozilla.org/en-US/docs/Web/API/Storage_Access_API

woloski commented 6 years ago

Hi @johnwilander,

This is Matias, CTO at Auth0. We provide a service for companies (developers) to implement SSO across their applications. As you know, we have been leading a discussion between Apple and a group of identity companies about the impact of ITP2 for the last few months. We have an email thread with you guys (and in particular, Jon Davis) where we submitted an analysis of the problem and some mitigation proposals, as summarized here. In the last exchange on that thread, Jon mentioned that our feedback had been accepted and suggested following up in this thread. After a careful read of the thread, we can't find the changes you made that address our feedback; below are extra details that we hope will clarify the situation.

Context / Use Case

We described in the doc the situations in which ITP2 interferes with token renewal operations, but it’s probably worth adding more details here.

Here is a typical use case for us:

User browses to app1.com and clicks Login button
User is redirected to accounts.brand.com (this is powered by Auth0 and runs on customer domain)
User authenticates and a cookie is set on accounts.brand.com (session established)
User gets redirected back to app1.com with a short-lived (2hr) access_token via hash location app1.com/#access_token=the_token (OAuth2 implicit grant type https://tools.ietf.org/html/rfc6749#section-1.3.2). This token is used in JavaScript HTTP calls from the browser to call some API. Note that app1.com is a different TLD than accounts.brand.com
At some point the access_token expires and app1 needs to get a new one (all through client side). The way this is done is through an invisible IFRAME and postMessage. The IFRAME is pointed at accounts.brand.com with a special parameter. At accounts.brand.com there is a cookie (from the previously established session) and a new access_token is generated and sent back to app1.com through the IFRAME and postMessage.
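Two client-side pieces of this flow can be sketched in TypeScript: reading the token out of the URL fragment, and deciding when a renewal is due. The function names and the one-minute safety margin are assumptions for illustration; the sketch deliberately leaves out the invisible iframe and postMessage exchange.

```typescript
// Extracts the access_token parameter from a URL fragment such as
// "#access_token=the_token". Returns undefined if it is absent.
function accessTokenFromHash(hash: string): string | undefined {
  const fragment = hash.startsWith("#") ? hash.slice(1) : hash;
  for (const pair of fragment.split("&")) {
    const eq = pair.indexOf("=");
    if (eq > 0 && pair.slice(0, eq) === "access_token") {
      return decodeURIComponent(pair.slice(eq + 1));
    }
  }
  return undefined;
}

// The 2-hour token lifetime mentioned in the use case above.
const TOKEN_LIFETIME_MS = 2 * 60 * 60 * 1000;

// True once the token is within `marginMs` of expiring. The default margin
// of one minute is an arbitrary choice for this sketch.
function needsRenewal(issuedAtMs: number, nowMs: number, marginMs: number = 60000): boolean {
  return nowMs >= issuedAtMs + TOKEN_LIFETIME_MS - marginMs;
}

const token = accessTokenFromHash("#access_token=the_token");
```

In a browser the first function would be called with `window.location.hash` after the redirect back to app1.com.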

Effect of ITP2

With ITP2 what happens is that in the last step when browsing to accounts.brand.com inside the invisible IFRAME the cookie is not sent (because ITP2 flags it as a third party cookie), hence we can't renew the access_token for that user (session is not found). On the other hand, using the Storage Access API provides a less than ideal experience for the end user. Essentially we would need to make a visible IFRAME with a dialog asking "Do you want to renew your session?".

Suggestion

We have several customers who are approaching us asking what options they have. Many of them are big brands like DowJones, which apparently got whitelisted at some point in the WebKit codebase (https://github.com/WebKit/webkit/commit/d91eb617805ac70fc01f5aa3201086c3b2c15d56).

One of the proposals we devised with the other identity vendors in our original doc submission was doing some sort of "pre-booking" when doing the first login on the accounts.brand.com domain. An option (without much thinking) would be to allow multiple domains on a cookie or something similar (Set-Cookie: auth0=....; Secure; Domain=accounts.brand.com; SafeDomains=app1.com,app2.com). This would signal to the browser that these are safe domains to send the cookie to. There might be other options, but what I see in the thread (isLoggedIn or similar APIs) would not help with this use case. Could you please comment on how the changes you introduced address the issues we described, and/or clarify Jon’s suggestion that the feedback in the doc has been addressed?
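To make the suggested attribute concrete, here is a small TypeScript sketch of how a `SafeDomains` list could be read from a `Set-Cookie` value and consulted for a given embedding site. `SafeDomains` is only the idea proposed in this comment and is not a real cookie attribute; the function names are hypothetical, and `opaque-value` is a placeholder for the elided cookie value. The sketch does not attempt to say how a browser would decide whether the listed domains deserve trust.

```typescript
interface ParsedCookie {
  name: string;
  value: string;
  safeDomains: string[]; // from the proposed, non-standard SafeDomains attribute
}

// Parses the name/value pair and the hypothetical SafeDomains attribute.
// Other attributes (Secure, Domain, ...) are ignored in this sketch.
function parseSetCookie(header: string): ParsedCookie {
  const parts = header.split(";").map((part) => part.trim());
  const nameValue = parts[0] ?? "";
  const eq = nameValue.indexOf("=");
  const cookie: ParsedCookie = {
    name: eq === -1 ? nameValue : nameValue.slice(0, eq),
    value: eq === -1 ? "" : nameValue.slice(eq + 1),
    safeDomains: [],
  };
  for (const attribute of parts.slice(1)) {
    const sep = attribute.indexOf("=");
    if (sep !== -1 && attribute.slice(0, sep).toLowerCase() === "safedomains") {
      cookie.safeDomains = attribute
        .slice(sep + 1)
        .split(",")
        .map((domain) => domain.trim())
        .filter((domain) => domain.length > 0);
    }
  }
  return cookie;
}

// Would the cookie be sent when its origin is embedded under `embeddingSite`?
function wouldSendTo(cookie: ParsedCookie, embeddingSite: string): boolean {
  return cookie.safeDomains.indexOf(embeddingSite) !== -1;
}

const parsed = parseSetCookie(
  "auth0=opaque-value; Secure; Domain=accounts.brand.com; SafeDomains=app1.com,app2.com"
);
const sentUnderApp1 = wouldSendTo(parsed, "app1.com");
const sentUnderOther = wouldSendTo(parsed, "news.example");
```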

Thanks in advance.

johnwilander commented 6 years ago

Hi @johnwilander,

Hi Matias! And thanks for your feedback.

This is Matias, CTO at Auth0. We provide a service for companies (developers) to implement SSO across their applications. We have an email thread with Jon Davis who suggested following up in this thread.

Context / Use Case

Here is a typical use case for us:

* User browses to `app1.com` and clicks Login button

* User is redirected to `accounts.brand.com` (this is powered by Auth0 and runs on customer domain)

* User authenticates and a cookie is set on `accounts.brand.com` (session established)

* User gets redirected back to `app1.com` with a short-lived (2hr) access_token via hash location app1.com/#access_token=the_token (OAuth2 implicit grant type https://tools.ietf.org/html/rfc6749#section-1.3.2). This token is used in JavaScript HTTP calls from the browser to call some API.
  Note that `app1.com` is a different TLD than `accounts.brand.com`

(I believe you mean eTLD+1 since the TLD is .com.)

* At some point the access_token expires and app1 needs to get a new one (all through client side). The way this is done is through an invisible IFRAME and postMessage. The IFRAME is pointed at `accounts.brand.com` with a special parameter. At accounts.brand.com there is a cookie (from the previously established session) and a new access_token is generated and sent back to app1.com through the IFRAME and postMessage.

Effect of ITP

With ITP what happens is that in the last step when browsing to accounts.brand.com inside the invisible IFRAME the cookie is not sent (because ITP flags it as a third party cookie), hence we can't renew the access_token for that user (session is not found).

All the steps up until this point are exactly how cookie-based cross-site tracking works. For instance:

The user browses to news.example and clicks a "Read more" button.
The user is redirected to tracker.example.
A cookie is set on tracker.example.
The user gets redirected back to news.example where an invisible iframe from tracker.example now has access to its tracker cookie.
In addition, tracker.example now has access to its tracker cookie at web scale. Any site that includes a subresource from tracker.example allows tracker.example to track the user on that site.

Any changes or additions we make to the Storage Access API need to take abuse cases into account.

On the other hand, using the Storage Access API provides a less than ideal experience for the end user. Essentially we would need to make a visible IFRAME with a dialog asking "Do you want to renew your session?".

Making cross-site tracking capabilities visible, transparent, and under the user's control is a goal of the Storage Access API.

Suggestion

We have several customers who are approaching us asking what options they have. Many of them are big brands like DowJones, which apparently got whitelisted at some point in the WebKit codebase (WebKit/webkit@d91eb61).

We suggested to Jon doing some sort of "pre-booking" when doing the first login on the accounts.brand.com domain. An option (without much thinking) would be to allow multiple Domains on cookie or something similar (Set-Cookie: auth0=....; Secure; Domain=accounts.brand.com; SafeDomains=app1.com,app2.com). This would signal the browser that these are safe domains to send the cookie to. There might be other options but from what I see in the thread (isLoggedIn or similar APIs) would not help with this use case.

Are you suggesting cross-site access to cookies? That sounds dangerous from both a security and privacy perspective. What would stop tracker.example from "pre-booking" safe domains news.example, store.example, blogs.example, and restaurant.example? And how should a user understand and be in control of such an invisible cross-site data system?
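To make the concern concrete, here is a small sketch of how the suggested attribute could be read. SafeDomains is only the idea floated in this thread, not a real cookie attribute, and parseSetCookie, wouldSendCrossSite, and the cookie value are all invented for illustration.

```typescript
// Hypothetical: "SafeDomains" is not part of any cookie specification.
interface ParsedCookie {
  name: string;
  value: string;
  domain: string | undefined;
  safeDomains: string[];
}

function parseSetCookie(header: string): ParsedCookie {
  const parts = header.split(";").map((p) => p.trim());
  const nameValue = parts.shift() ?? "";
  const eq = nameValue.indexOf("=");
  const cookie: ParsedCookie = {
    name: eq === -1 ? nameValue : nameValue.slice(0, eq),
    value: eq === -1 ? "" : nameValue.slice(eq + 1),
    domain: undefined,
    safeDomains: [],
  };
  for (const attr of parts) {
    const i = attr.indexOf("=");
    const key = (i === -1 ? attr : attr.slice(0, i)).toLowerCase();
    const val = i === -1 ? "" : attr.slice(i + 1);
    if (key === "domain") {
      cookie.domain = val;
    } else if (key === "safedomains") {
      cookie.safeDomains = val
        .split(",")
        .map((d) => d.trim())
        .filter((d) => d.length > 0);
    }
  }
  return cookie;
}

// The cookie is attached under a top-level site purely because the cookie's
// own setter listed that site. Nothing here involves the listed site or the user.
function wouldSendCrossSite(cookie: ParsedCookie, topLevelSite: string): boolean {
  return cookie.safeDomains.indexOf(topLevelSite) !== -1;
}

// The abuse case from the reply above: the tracker names its own "safe" domains.
const trackerCookie = parseSetCookie(
  "id=placeholder; Secure; Domain=tracker.example; " +
    "SafeDomains=news.example,store.example,blogs.example,restaurant.example"
);
```

With this model, wouldSendCrossSite(trackerCookie, "news.example") is true without news.example or the user ever being asked, which is the gap the reply points at.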

woloski commented 6 years ago

Are you suggesting cross-site access to cookies? That sounds dangerous from both a security and privacy perspective. What would stop tracker.example from "pre-booking" safe domains news.example, store.example, blogs.example, and restaurant.example? And how should a user understand and be in control of such an invisible cross-site data system?

The user should be in control by consenting to storage access, similar to what the SFAuthenticationSession API does on iOS. This alternative is listed in the doc we shared with Jon. He said that he took our feedback. Do you know what he was referring to?

johnwilander commented 6 years ago

Are you suggesting cross-site access to cookies? That sounds dangerous from both a security and privacy perspective. What would stop tracker.example from "pre-booking" safe domains news.example, store.example, blogs.example, and restaurant.example? And how should a user understand and be in control of such an invisible cross-site data system?

The user should be in control by consenting to storage access, similar to what the SFAuthenticationSession API does on iOS. This alternative is listed in the doc we shared with Jon. He said that he took our feedback. Do you know what he was referring to?

I would like to avoid pulling in other communications into this WHATWG HTML issue.

Please explain here what changes to the Storage Access API you are suggesting, and we can discuss them. Two browsers implement the API as of today, and this thread is about what should go into the standard behavior.

chrisdavidmills commented 6 years ago

We've documented this at https://developer.mozilla.org/en-US/docs/Web/API/Storage_Access_API

domenic commented 6 years ago

I'm starting to feel increasingly guilty that we haven't added this to the spec yet. Let's get started on this.

Do Mozilla and Apple implementers feel that the algorithm in the OP is complete and accurate, or have things perhaps changed in recent discussions? I'm especially interested in hearing from Mozilla, as they had to implement from scratch; was the OP enough information?

In terms of writing a spec PR, is anyone up for volunteering?

luisrudge commented 6 years ago

@johnwilander Hi 👋 I work at Auth0 as well. I'll quote the doc:

Mitigation 2: introduce Storage Access API to request access from a different domain

Another possibility would be to allow identity providers to pre-book storage access for a given domain during the interactive authentication phase and have the browser record the user’s preference in advance.

Imagine that the application at video.example is outsourcing authentication to identityprovider.example. As a user navigates to video.example for the first time, she is redirected (either via full page redirect or via popup) to identityprovider.example, where she is prompted for authentication and/or consent. After the user successfully authenticates, identityprovider.example calls a new API, which might look like document.requestStorageAccess('video.example'), to prompt the user to consent to identityprovider.example accessing its own cookies in a 3rd party context from video.example. Upon receiving consent from the user, the browser would record that preference and use it later on during background token renewals via iframe, without the need to pop a prompt out of context. This approach would preserve the spirit of the ITP2 feature, presenting the user with a prompt that doesn't exist today to inform her of the ability of the IdP to track her, while at the same time ensuring that the prompt is shown at a time when the user expects interactivity.

What would this look like?

Similar to the iOS SFAuthenticationSession API, the call to action to authenticate could be the end user's interaction with a prompt that is shown before the IdP's authorization endpoint is opened for the first time. Clicking Continue would pre-approve identityprovider.example storage to be accessed in a 3rd party context from video.example.

Example

“video.example” Wants to use “identityprovider.example” to Sign In.

This allows the exchange of information about your
secure authentication session while you are browsing
video.example without further interruptions.

[Cancel] [Continue]
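As a rough illustration of the bookkeeping this pre-booking idea implies on the browser side, consider the sketch below. It is a toy model: PreBookedGrants, preBook, hasGrant, and the askUser callback (standing in for the browser-rendered prompt) are hypothetical names, and no browser exposes requestStorageAccess with a domain argument in this form.

```typescript
// Toy model of the proposed "pre-booking"; all names are hypothetical.
type PromptAnswer = "continue" | "cancel";

class PreBookedGrants {
  // Each entry records "this IdP may use its cookies when embedded under this site".
  private grants = new Set<string>();

  private key(idp: string, embedder: string): string {
    return `${embedder}|${idp}`;
  }

  // Would run while the user is on the IdP as a first party, right after
  // interactive authentication. `askUser` stands in for the browser prompt.
  preBook(
    idp: string,
    embedder: string,
    askUser: (message: string) => PromptAnswer
  ): boolean {
    const message = `"${embedder}" wants to use "${idp}" to sign in.`;
    if (askUser(message) !== "continue") {
      return false;
    }
    this.grants.add(this.key(idp, embedder));
    return true;
  }

  // Would be consulted later, when the IdP's invisible iframe loads under
  // the embedder for a background token renewal.
  hasGrant(idp: string, embedder: string): boolean {
    return this.grants.has(this.key(idp, embedder));
  }
}
```

Note that the model records whatever pair the caller supplies. It contains no check that the calling site has any relationship with the embedder it names.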

@chrisdavidmills if you can validate this from Mozilla's side as well, this would be great 😬

@domenic Maybe it's best to wait a little before we move forward with this. In the current state, every 'silent authentication token renewal' operation would not work when ~~the Identity Provider is flagged as a tracker~~ 3rd party cookies are disabled.

johnwilander commented 6 years ago

@johnwilander Hi 👋 I work at Auth0 as well. I'll quote the doc:

Mitigation 2: introduce Storage Access API to request access from a different domain

Another possibility would be to allow identity providers to pre-book storage access for a given domain during the interactive authentication phase and have the browser record the user’s preference in advance.

Imagine that the application at video.example is outsourcing authentication to identityprovider.example. As a user navigates to video.example for the first time, she is redirected (either via full page redirect or via popup) to identityprovider.example, where she is prompted for authentication and/or consent. After the user successfully authenticates, identityprovider.example calls a new API, which might look like document.requestStorageAccess('video.example'), to prompt the user to consent to identityprovider.example accessing its own cookies in a 3rd party context from video.example. Upon receiving consent from the user, the browser would record that preference and use it later on during background token renewals via iframe, without the need to pop a prompt out of context. This approach would preserve the spirit of the ITP2 feature, presenting the user with a prompt that doesn't exist today to inform her of the ability of the IdP to track her, while at the same time ensuring that the prompt is shown at a time when the user expects interactivity.

What would this look like?

Similar to the iOS SFAuthenticationSession API, the call to action to authenticate could be the end user's interaction with a prompt that is shown before the IdP's authorization endpoint is opened for the first time. Clicking Continue would pre-approve identityprovider.example storage to be accessed in a 3rd party context from video.example.

Example
“video.example” Wants to use “identityprovider.example” to Sign In.

This allows the exchange of information about your
secure authentication session while you are browsing
video.example without further interruptions.

[Cancel] [Continue]

First, there is currently no way to know or enforce that a user actually signs in and that such a prompt would be used exclusively for login purposes. The web is an open platform where any site can call any API at any time. So sites that don't even have a login mechanism may leverage this proposed API to facilitate cross-site tracking.

Second, we would have to allow multiple domains to silently get access to their cookies since some sites have integrations with multiple identity providers. So it can't be "Now the user has stated their login provider and no other domain can call this API for the purposes of video.example logins for this user."

Third, such an API is outside video.example's control. Some arbitrary site may claim it's the login provider for video.example and get access to its cookies under video.example. The current API is called when on video.example and video.example can sandbox its iframes if it wants to restrict calls to the API.

Finally, I don't see why identityprovider.example needs permanent access to its cookies under video.example. Why can't it hand over a unique token to video.example that video.example uses to refresh its login status?

@chrisdavidmills if you can validate this from Mozilla's side as well, this would be great 😬

@domenic Maybe it's best to wait a little before we move forward with this. In the current state, every 'silent authentication token renewal' operation would not work when the Identity Provider is flagged as a tracker.

The Storage Access API does not involve any notion of whether a domain is a tracker. It only deals with the case where the user agent, for some reason, has decided to restrict cookie access for a third party. This may be because the browser by default only allows first-party cookies or because the user has opted in to such a setting.

luisrudge commented 6 years ago

First, there is currently no way to know or enforce that a user actually signs in and that such a prompt would be used exclusively for login purposes. The web is an open platform where any site can call any API at any time. So sites that don't even have a login mechanism may leverage this proposed API to facilitate cross-site tracking.

Yes, the message is not important. The gist of it should be: 'Hey, this website is asking to access your information in a non-interactive way. are you ok with this?'

Second, we would have to allow multiple domains to silently get access to their cookies since some sites have integrations with multiple identity providers. So it can't be "Now the user has stated their login provider and no other domain can call this API for the purposes of video.example logins for this user."

Absolutely. It's up to each identity provider to ask permission to use the cookies at their own time. If the user is trying to authenticate with identityprovider.example, then this IdP would call document.requestStorageAccess('video.example') when the user successfully authenticates with them. If the user is trying to authenticate with another-idp.example, then this other IdP would call document.requestStorageAccess('video.example') in the same way.

Third, such an API is outside video.example's control. Some arbitrary site may claim it's the login provider for video.example and get access to its cookies under video.example. The current API is called when on video.example and video.example can sandbox its iframes if it wants to restrict calls to the API.

Yes, and that's why browsers should control what kind of message is sent to the user. If the browser shows that identityprovider.example is asking for info, then the user will trust that it's the correct site. If it's some kind of attack, the user would see: my-malicious-idp.example.

The Storage Access API does not involve any notion of whether a domain is a tracker. It only deals with the case where the user agent, for some reason, has decided to restrict cookie access for a third party. This may be because the browser by default only allows first-party cookies or because the user has opted in to such a setting.

You're right. I'll update the comment.

luisrudge commented 6 years ago

Finally, I don't see why identityprovider.example needs permanent access to its cookies under video.example. Why can't it hand over a unique token to video.example that video.example uses to refresh its login status?

What you are describing is tantamount to using a Refresh Token in a user agent. The entire OIDC session castle is based on the session being represented by a cookie, managed by the Identity Provider. Single Sign Out can be achieved by deleting that cookie, with little or no work from web applications. If we now introduce Refresh Tokens, all of that castle will crumble: every application will have its own session artifact instead of sharing one, and we'll have the problem of explicitly handling storage, as opposed to simply relying on browser features as we do for cookies. Also, this will introduce asymmetries in how the session is handled depending on whether the app is implemented with redirects or as an SPA (as the former will keep using cookies). It's not a scenario recommended or even threat modeled by the standards bodies, and it would be a huge change for every Identity Provider.
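The difference between the two session models can be sketched with a toy example. The classes below are illustrative only; they are not OIDC or Auth0 code, and every name and token string is made up.

```typescript
// Model A: one IdP session cookie backs silent renewal for every application.
class SharedSessionIdp {
  private sessionCookie: string | undefined = undefined;

  signIn(): void {
    this.sessionCookie = "idp-session";
  }

  // Single sign-out: deleting the one cookie ends the session for all apps.
  signOut(): void {
    this.sessionCookie = undefined;
  }

  // Silent renewal succeeds only while the shared session cookie exists.
  renew(app: string): string | undefined {
    return this.sessionCookie === undefined
      ? undefined
      : `access-token-for-${app}`;
  }
}

// Model B: each application holds its own refresh token (its own session artifact).
class RefreshTokenIdp {
  private active = new Set<string>();

  issue(app: string): string {
    const token = `refresh-${app}`;
    this.active.add(token);
    return token;
  }

  // Signing out everywhere requires revoking every issued token individually.
  revoke(token: string): void {
    this.active.delete(token);
  }

  renew(token: string): string | undefined {
    return this.active.has(token) ? `access-token-via-${token}` : undefined;
  }
}
```

In model A a single signOut() stops renewal for every application at once. In model B, revoking one application's token leaves the others able to renew until each is revoked in turn.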

chrisdavidmills commented 6 years ago

@chrisdavidmills if you can validate this from Mozilla's side as well, this would be great 😬

I'm just the documentation guy; this would be better commented by @ehsan, if comment is needed from our side ;-)
