WebKit / standards-positions

WebKit's positions on emerging web specifications
https://webkit.org/standards-positions/

CHIPS (Cookies Having Independent Partitioned State) #50

Open johannhof opened 1 year ago

johannhof commented 1 year ago

Request for position on an emerging web specification

Information about the spec

Design reviews and vendor positions

Anything else we need to know

CHIPS is a proposal for a new cookie attribute, Partitioned. This attribute will indicate to user agents that these cross-site cookies should only be available in the same top-level context that the cookie was first set in.
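As a rough illustration of the opt-in described above, the sketch below builds a Set-Cookie header value carrying the Partitioned attribute. The helper name, the cookie name `__Host-session`, and the value are hypothetical and chosen only for the example; the attribute combination (Secure, SameSite=None, Path=/) reflects how partitioned cookies are commonly set for cross-site use, not a definitive recipe.

```python
from typing import Optional


def partitioned_set_cookie(name: str, value: str, max_age: Optional[int] = None) -> str:
    """Build a Set-Cookie header value that opts the cookie into partitioning.

    Hypothetical helper for illustration only.
    """
    parts = [
        f"{name}={value}",
        "Path=/",
        "Secure",         # partitioned cookies are expected to be Secure
        "HttpOnly",
        "SameSite=None",  # lets the cookie be sent on cross-site requests
        "Partitioned",    # the new attribute proposed by CHIPS
    ]
    if max_age is not None:
        parts.append(f"Max-Age={max_age}")
    return "; ".join(parts)


# Hypothetical cookie name and value.
header = partitioned_set_cookie("__Host-session", "abc123", max_age=3600)
print("Set-Cookie:", header)
```

A user agent that supports the attribute would then store this cookie under the top-level site the embed was loaded in, rather than in one jar shared across all top-level sites.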

CHIPS has been discussed in PrivacyCG and beyond for a while, and participants from other browsers including Safari (@johnwilander) have generally signaled support so far (with a few details to figure out), see https://github.com/privacycg/proposals/issues/30.

johnwilander commented 1 year ago

Thanks for filing, Johann. The team is discussing, including CFNetwork.

johannhof commented 1 year ago

An additional note that we've updated the proposal to improve the worst-case memory overhead from partitioning, which was a major concern from WebKit as we understood it.

annevk commented 1 year ago

Alas, it's still a major concern. What's important context here is that WebKit currently does not include any cookies on typical cross-site requests. As such, those requests typically fit in a single packet and no memory has to be used for cookies. From that perspective 10 KiB per partitioned site is still substantial.

If the limit could be brought down even further such that the impact is negligible, ideally demonstrated with metrics, we'd definitely consider supporting it.
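To put rough numbers on the packet and memory concern, here is a back-of-the-envelope sketch. Everything in it other than the 10 KiB figure is an assumption made for illustration: a 1460-byte TCP payload per packet, no header compression, and made-up counts of top-level sites and embeds. Real traffic, especially over HTTP/2 or HTTP/3, will differ.

```python
import math

# Assumed TCP payload per packet: 1500-byte MTU minus 40 bytes of IP/TCP headers.
MSS_BYTES = 1460


def packets_for_cookie_bytes(cookie_bytes: int, mss: int = MSS_BYTES) -> int:
    """Packets' worth of data contributed by uncompressed cookie bytes alone."""
    return math.ceil(cookie_bytes / mss)


def worst_case_memory_kib(top_level_sites: int, embeds_per_site: int, limit_kib: int) -> int:
    """Memory used if every embedded site fills its limit in every partition."""
    return top_level_sites * embeds_per_site * limit_kib


for limit_kib in (0, 1, 10):
    packets = packets_for_cookie_bytes(limit_kib * 1024)
    # 100 top-level sites with 20 embedded sites each are hypothetical numbers.
    memory = worst_case_memory_kib(100, 20, limit_kib)
    print(f"limit {limit_kib:>2} KiB: {packets} packet(s) of cookie data, {memory} KiB stored")
```

Under these assumptions a full 10 KiB of cookies no longer fits in one packet, while 0 KiB (the current state for typical cross-site requests in WebKit) adds nothing.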

LGraber commented 1 year ago

Can you articulate the concern a bit more? Currently, a large number of apps simply tell people that they cannot be used on Safari because Safari has chosen to turn off cookies. On iPhones, for mobile apps, the answer is either to do a lot of hacking, which likely ends up using real headers instead of cookies (and hence there is no real cost change for these apps if they chose to use cookies), or to simply tell the consumer to turn on 3P cookies for it to work (or something with shared domains to avoid being 3P). Given that this is an opt-in feature, for existing apps, there is no change. Is the concern that new apps will start running on iPhone and harm other existing apps in some way? For native apps using embedded WebKit, I would think the process space would be isolated from harming anything else. For the browser, I guess the question is about this performance tradeoff, which I don't totally understand yet, vs actually being able to support a large set of non-tracking, valid-cookie-usage apps? I would really like to have apps (including mine) work a lot better / work on iPhone without asking customers to do workarounds and without having to hack up our code to try and work around valid usage of cookies. Thanks!

annevk commented 1 year ago

Unfortunately I don't really understand your comment beyond the first question. You make a number of unsourced assertions and certain statements are simply false, e.g., there's no way to set request headers for the majority of requests under discussion here.

Our assumption is that essentially all cross-site responses will adopt partitioned cookies to near the maximum allowed. As the current maximum is 0, the difference between those is our worry.

LGraber commented 1 year ago

okay ... let me try again. I am trying to understand whether your concern is that performance will decay for today's unchanged sites or if allowing this creates a potential for performance issues in the future if people use this feature. From your comment, it sounds like you agree that adding support for it, given that it is opt-in, will have no immediate observable change in serving today's web-requests (that do not use this feature since it does not exist yet). Your concern appears to be that people will use the feature and that assuming "worst" case, they use all allowed resources (which is reasonable), that it will have a negative performance impact.

Can you clarify the performance impact? I assume you are not talking about it being slower to send requests to those sites since they chose to have cookies. Is your concern that some site will be 3P on many tabs in different 1P contexts in the same browser session and will then eat up a lot of memory and degrade performance on the users machine overall or perhaps just on the 1P site which embeds it?

Would it work to say that browsers could still impose some cumulative max cookie resource usage that is the same as what is imposed on 1P sites (i.e., add up the cookie resource usage for embeddedSiteA across all of its 1P sites and its own 1P usage)? I don't know how easy that would be to manage, as the cookie jars themselves are separate but they would now have some shared resource constraint. That would basically make them match existing cookie limits. I am not sure I am understanding your concern correctly, but if I did ... is that what you are looking for to prevent resource explosion?

Note that Firefox is already doing this and not seeing anything blow up but, again assuming I am understanding you correctly, you just want to make sure it won't.

annevk commented 1 year ago

I actually am talking about it being slower to make requests and requests taking up more packets. Also about it taking up more memory overall. And yes, we're definitely assuming sites will set as many of these as allowed, as that would follow the precedent they've set with non-partitioned cookies.

And no, you can't combine the partitioned limit with the non-partitioned limit. That would enable tracking.
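One way to see why a combined limit is a problem is that a shared budget lets one partition observe what another partition stored. The thread does not spell out the mechanism, so the toy model below is an assumption made for illustration: the class, the site names, and the 100-byte budget are all hypothetical.

```python
class SharedBudgetJar:
    """Toy jar: storage is partitioned, but ONE byte budget is shared by all partitions."""

    def __init__(self, budget_bytes: int) -> None:
        self.budget = budget_bytes
        self.used = 0
        self.partitions = {}  # top-level site -> {cookie name: value}

    def set(self, top_level_site: str, name: str, value: str) -> bool:
        jar = self.partitions.setdefault(top_level_site, {})
        old_size = len(name) + len(jar[name]) if name in jar else 0
        new_size = len(name) + len(value)
        if self.used - old_size + new_size > self.budget:
            return False  # rejected: the shared budget is exhausted
        jar[name] = value
        self.used += new_size - old_size
        return True


def leak_bit(bit: bool) -> bool:
    """An embed writes a bit under one top-level site and reads it back under another."""
    jar = SharedBudgetJar(budget_bytes=100)
    if bit:
        # Embedded under news.example: fill the whole shared budget (3 + 97 bytes).
        jar.set("news.example", "pad", "x" * 97)
    # Embedded under shop.example: a different partition, yet the probe outcome
    # depends on what happened under news.example.
    could_store = jar.set("shop.example", "probe", "y")
    return not could_store


print(leak_bit(True), leak_bit(False))
```

The embed never reads the other partition's cookies, but whether its probe write succeeds already carries information across top-level sites, which is the kind of channel partitioning is meant to close.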

martinthomson commented 1 year ago

we're definitely assuming sites will set as many of these as allowed

There are certainly sites out there that load up cookies, but that is often a symptom of institutional dysfunction when it happens. That it is possible to tank your own performance metrics is not something that I think we need to save site developers from by imposing tighter restrictions.

I appreciate that you can operate a perfectly good site with in the order of 30 bits of cookie, but there are good reasons not to strive for perfectly minimal designs. Obviously a smaller number is better, but limits are not ideals. 10k is ridiculous, but the point in setting a ridiculous limit is that reasonable people will never need to think about it.

johannhof commented 1 year ago

Hey Anne, thanks for the additional feedback! I discussed this with others on the Chrome side.

We generally feel comfortable with the current size, because:

But we are not too attached to the exact number and are very open to revisiting it, as long as we can ensure the second point holds.

However, we’re hesitant to commit additional resources into producing a new proposal based on your request to “bring this number down”. We shipped this feature with a number that was based on an extensive discussion on the limit last year, which was largely to address WebKit concerns. We would instead appreciate it if WebKit folks could consult their own metrics and define what you think is a reasonable upper bound and why.

annevk commented 1 year ago

Couple thoughts:

johnwilander commented 1 year ago

In conversations with web developers, for instance payment providers, we've heard a desire for partitioned storage. When we tell them that WebKit has had partitioned LocalStorage available to them for ten years, they typically say "Oh, OK. Let's use that instead of cookies then."

Moving away from cookies as a storage mechanism is desirable for network performance in general. There is one thing that HTML storage lacks and that is a developer-controlled expiry mechanism. That lack drives some developers to use cookies instead, for instance to be able to have a privacy policy that guarantees a time limit on client-side storage. I think it would be more fruitful to add an expiry mechanism to HTML storage than to keep using cookies for storage.
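For context, the sketch below shows how a developer might emulate expiry on top of a plain key-value store today. It is a Python stand-in for the idea, not a browser API: `ExpiringStorage`, its method names, and the dict used as backing store are all hypothetical.

```python
import json
import time


class ExpiringStorage:
    """Userland expiry on top of a plain key-value store (hypothetical helper)."""

    def __init__(self, backing=None, clock=time.time):
        self._store = backing if backing is not None else {}
        self._clock = clock

    def set_item(self, key: str, value: str, ttl_seconds: float) -> None:
        record = {"value": value, "expires_at": self._clock() + ttl_seconds}
        self._store[key] = json.dumps(record)

    def get_item(self, key: str):
        raw = self._store.get(key)
        if raw is None:
            return None
        record = json.loads(raw)
        if self._clock() >= record["expires_at"]:
            del self._store[key]  # expired entries are only removed lazily, on read
            return None
        return record["value"]


# Demonstration with a fake clock so the behaviour is deterministic.
now = [0.0]
backing = {}
storage = ExpiringStorage(backing, clock=lambda: now[0])
storage.set_item("session", "abc123", ttl_seconds=60)
before_expiry = storage.get_item("session")
now[0] = 61.0
still_on_disk = "session" in backing  # expired, but not yet deleted
after_expiry = storage.get_item("session")
print(before_expiry, still_on_disk, after_expiry)
```

The emulation only deletes data when it is next read, so it cannot guarantee a time limit on what stays on the device. That gap is one reason a built-in expiry mechanism would differ from what developers can do on their own.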

Sora2455 commented 1 year ago

Anecdotally, I work on a SaaS provider that embeds forms in third-party websites, usually used as "contact us" forms. Every cookie we ever set (including those set by our CDNs and hosting providers) adds up to just under 1,100 bytes - but the actual cookies we rely on to provide our service (session cookie, anti-forgery cookie, etc.) are less than 300 bytes (most of that the anti-forgery token). So 1,000 bytes would be more than enough for us.

Whenever we detect an inability to set cookies, we show a page asking users (nearly always Safari users) to open the form in a new tab. While theoretically we could develop our own session identifier manually attached to all requests client-side and specifically parsed server-side to basically re-invent server-side session, there is no enthusiasm for that from management. Partitioned cookies would remove the need for us to do this.

ddworken commented 11 months ago

Adding another piece of feedback, I work at Google on our security engineering team. We've been consulting with a number of products to help them ensure they can work securely without relying on third-party cookies.

One of the common issues we've run into is with the usage of sandbox domains (like googleusercontent.com) which are purposefully cross-site from our main applications to ensure that it is safe to host user-controlled content there without an XSS impacting our main applications. There are a number of different Google services that rely on loading sandbox domains as subresources (often as iframes) where a lack of third-party cookies leads to breakages. Partitioned cookies are often sufficient to fix these breakages without any additional work from product teams. This makes CHIPS quite appealing to us, since it is both easy to deploy (just a new cookie attribute) and it doesn't require reworking existing flows that rely on cookies. While it often is theoretically possible to refactor applications to use partitioned LocalStorage, this can require very significant refactorings for applications that expect to receive cookies in the initial page request itself (e.g. server-side rendered applications as opposed to single page apps).

johnwilander commented 11 months ago

@Sora2455 and @ddworken, thanks for your input! Could you share whether your use cases for partitioned cookies are for client-side storage or whether the cookies need to be sent in every network request to the server? Thanks!

ddworken commented 11 months ago

For us, being sent in every network request to the server is a core part of the use case. The reason is that the initial server response itself benefits from having cookies so that it can make authentication decisions about the user and return personalized content. Without this, applications will have to fundamentally refactor how they work to make it possible to use client-side storage.

For example, imagine a cross-site subresource request for an image. With partitioned cookies, this will automatically get access to the partitioned data in the initial network request. With partitioned client-side storage, it would require refactoring the application to actually load it as an iframe, to have a shim HTML page that uses LocalStorage to access partitioned data, and then use that to fetch the actual image and render it.
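The server side of the image example could look roughly like the sketch below, which reads the session from the Cookie header of the very first request. The handler, the cookie name `__Host-session`, and the in-memory session table are hypothetical; a real service would use its own framework and session store.

```python
from http.cookies import SimpleCookie
from typing import Dict, Tuple

# Hypothetical session store: session id -> user.
SESSIONS = {"abc123": "alice"}


def handle_image_request(headers: Dict[str, str]) -> Tuple[int, str]:
    """Decide what to serve based on the cookie sent with the image request itself."""
    jar = SimpleCookie()
    jar.load(headers.get("Cookie", ""))
    morsel = jar.get("__Host-session")
    user = SESSIONS.get(morsel.value) if morsel is not None else None
    if user is None:
        return 403, "no session"
    return 200, f"image for {user}"


with_cookie = handle_image_request({"Cookie": "__Host-session=abc123"})
without_cookie = handle_image_request({})
print(with_cookie, without_cookie)
```

With partitioned client-side storage instead, the first request arrives without any session information, so the decision has to move into script running in a shim page, as described above.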

johnwilander commented 11 months ago

The reason is that the initial server response itself benefits from having cookies so that it can make authentication decisions about the user and return personalized content.

We have always been wary of users authenticating in cross-site iframes. That's an anti pattern for phishing reasons. Users have no way of telling the origin of the iframe and thus whether or not they should authenticate. That's in part why we proposed and implemented the Storage Access API. It is deliberately designed to support "authenticated embeds."

Could you share a little bit on how you do authentication for cross site iframes and how you view the phishing risk? Thanks!

Sora2455 commented 11 months ago

In my case, I don't need users to log in - we just use session cookies to keep track of their previous actions.

E.g. in a "contact us" form, we'd ask them in one request for their name and email, then in a follow-up page ask them details like e.g. interest groups. We have to correlate the two requests on the server side to make sure they're stored together.
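A minimal sketch of that two-step correlation follows, assuming a session cookie set with the Partitioned attribute. The function names, the cookie name `__Host-form-session`, and the in-memory store are hypothetical and stand in for whatever the real service uses.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical server-side store: session id -> partial form submission.
SUBMISSIONS = {}


def step_one(name: str, email: str) -> str:
    """First request: store the details and return a Set-Cookie header value."""
    session_id = secrets.token_urlsafe(16)
    SUBMISSIONS[session_id] = {"name": name, "email": email}
    return (
        f"__Host-form-session={session_id}; Path=/; Secure; HttpOnly; "
        "SameSite=None; Partitioned"
    )


def step_two(cookie_header: str, interests: list):
    """Follow-up request: use the cookie to find the earlier submission."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get("__Host-form-session")
    if morsel is None or morsel.value not in SUBMISSIONS:
        return None  # no cookie arrived, so the two requests cannot be correlated
    record = SUBMISSIONS[morsel.value]
    record["interests"] = interests
    return record


set_cookie = step_one("Ada", "ada@example.com")
# A browser that stored the cookie sends back only the name=value pair.
cookie_header = set_cookie.split(";", 1)[0]
combined = step_two(cookie_header, ["newsletter"])
uncorrelated = step_two("", ["newsletter"])
print(combined, uncorrelated)
```

The second call shows the failure mode described earlier in the thread: when the cookie cannot be set or sent, the server has nothing to tie the follow-up request to.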

As I said before, this was originally built using session cookies. While there's no technical reason it couldn't be rebuilt with client-side identifiers manually attached to each request, there is no appetite to do that from management.

johnwilander commented 11 months ago

Maintaining a session is reasonable and doesn't require authentication. That's the original intent of cookies. Thanks for sharing.

ddworken commented 11 months ago

Could you share a little bit on how you do authentication for cross site iframes and how you view the phishing risk?

The details often depend on the product (since there are a number of products that fall into these patterns). In some products, the use case is closer to @Sora2455's use case of maintaining sessions (that aren't necessarily tied to a specific authenticated user).

In other cases, the flow is generally something akin to:

In this case, there isn't necessarily a phishing concern since the top-level URL is foo.google.com and we maintain control of the content rendered inside the iframe.

If you're interested in a concrete example of this pattern, Google Cloud Shell is one example. In this case, the top-level window is shell.cloud.google.com (a trusted domain) and the iframe is to *.cloudshell.dev that is used to render authenticated but untrusted content (in this case, the IDE itself which is served from a VM under user control and thus isn't trusted to execute JS on google.com). See here.

claudevervoort commented 11 months ago

Hello,

We're Claude and Peter, both members of the LTI (Learning Tools Interoperability) Working Group and Learning Tool developers. We'd like to explain how we see Partitioned Cookies/CHIPS as being a particularly important specification for the future of Learning Tools.

Understanding Learning Tools and their Interoperability

To lay the groundwork, let's first define what we mean by 'Learning Tools'. These are web applications integrated into online learning platforms to enhance their offerings, featuring everything from rich third-party content to interactive simulations and assessments. They allow instructors to embed in their online courses content and activities offered by third-parties. For example, an instructor may embed in her course a simulation, or content from an external publisher. This would appear alongside native content and tools.

For a smooth incorporation of these tools into an online course, a standard known as Learning Tools Interoperability (LTI) comes into play. It provides a framework for embedding tools into a learning platform, and you can find more about it here.

The Challenge: Third-Party Cookies and IFrames

Most Learning Tools are embedded within the platform's interface, or 'chrome', in an IFrame. This situation inherently turns them into third-party applications as they often operate from a different domain than the host institution.

Many of these tools rely on cookies for a myriad of functions like session management and CSRF protection. However, the increasing restrictions around third-party cookies have forced these applications to adopt less desirable solutions. These range from requesting users to enable third-party cookies to opening the tool in a new window or tab. Both solutions significantly degrade the user experience.

To address this problem, the LTI working group, alongside major platform vendors, proposed an alternative solution that leverages the window's postMessage functionality, explained in more detail here.

However, as promising as this approach might sound, it still poses significant challenges. The primary issue is that it necessitates a significant departure from the conventional use of cookies, requiring a substantial rewrite of tool code. Furthermore, it fails to provide some of the inherent security features that cookies offer, such as HTTP-only flags.

The Promise of Partitioned Cookies/CHIPs

Enter the world of Partitioned Cookies, also known as CHIPs. This innovative solution could potentially revolutionize the way Learning Tools operate within IFrames.

As you know, CHIPs offer a unique type of cookie that developers can opt to place into partitioned storage. What this means is that each top-level site has its own separate 'cookie jar' or storage. This unique storage feature ensures that a third-party cookie is tied specifically to the top-level site where it was initially set, and it cannot be accessed from any other domain.

This nuanced feature of CHIPs allows cookies to be set by third-party services while ensuring they are only readable within the context of the initial top-level site. The primary goal of this arrangement is to successfully block cross-site tracking, a significant concern with traditional third-party cookies, while still allowing for non-tracking uses of these cookies.
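The "separate cookie jar per top-level site" idea can be modelled with a jar keyed by both the top-level site and the embedded host. This is a toy model for illustration, not how any browser implements it; the class and the example sites are hypothetical.

```python
from collections import defaultdict


class PartitionedCookieJar:
    """Toy model: one cookie jar per (top-level site, embedded host) pair."""

    def __init__(self) -> None:
        self._jars = defaultdict(dict)

    def set_cookie(self, top_level_site: str, cookie_host: str, name: str, value: str) -> None:
        self._jars[(top_level_site, cookie_host)][name] = value

    def cookies_for(self, top_level_site: str, cookie_host: str) -> dict:
        # Return a copy so callers cannot mutate the jar through the result.
        return dict(self._jars.get((top_level_site, cookie_host), {}))


jar = PartitionedCookieJar()
# The same tool is embedded by two different learning platforms.
jar.set_cookie("school-a.example", "tool.example", "session", "s-for-a")
jar.set_cookie("school-b.example", "tool.example", "session", "s-for-b")

seen_under_a = jar.cookies_for("school-a.example", "tool.example")
seen_under_b = jar.cookies_for("school-b.example", "tool.example")
seen_under_c = jar.cookies_for("school-c.example", "tool.example")
print(seen_under_a, seen_under_b, seen_under_c)
```

The tool keeps a working session inside each platform, but nothing it stores under one platform is visible when it is embedded under another.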

In essence, CHIPs offer a promising solution that maintains essential functionalities while significantly enhancing privacy protection. This offers a simplified adaptation path for developers working on Learning Tools, while ensuring an uncompromised user experience and security.

The great advantage of this approach is that it requires minimal modifications to existing tools. The partitioned nature of these cookies often doesn't pose an issue, as many tools primarily focus on delivering the embedded experience. Once the IFrame is closed, there's no requirement for persistent state in the user agent.

Conclusion

To sum it up, Partitioned Cookies or CHIPs offer a promising solution for Learning Tools in their ongoing struggle with third-party cookie restrictions. With the potential to deliver a seamless user experience without compromising security, CHIPs could be the easy-to-implement solution that many developers have been eagerly waiting for.

Of course, the future of web development is always in flux, and new standards and technologies are continually emerging. However, for now, CHIPs appear to be an efficient path forward for the enhancement of Learning Tools Interoperability and we'd be thrilled as many others in our Learning Tool community to see it supported more widely.

Peter Franza (42 Lines) Claude Vervoort (Cengage Group)

gogasca commented 10 months ago

Colab Enterprise's main user interface is hosted on Google Cloud Console at console.cloud.google.com

In the authentication flow, the Vertex Inverting Proxy uses a cookie (DATALAB_TUNNEL_TOKEN) which facilitates authentication between the Front End (Browser) and a Virtual Machine (VM) in Google Cloud infrastructure.

To establish connectivity between the Front End and the VM, we rely on the following domains:

Example: https://dua2sz3jlwklw-dot-us-central1.aiplatform-notebook.googleusercontent.com/

We have received reports from some customers who are experiencing connection issues when third-party cookies are blocked (Chrome and Safari browsers).

Colab Enterprise is an embedded (iframe) part of the Google Cloud Console. Colab Enterprise establishes a connection to a *googleusercontent.com domain. This URL is dynamic, as in the example above. Authentication relies on the cookie, which is why we are interested in CHIPS.


CHIPS would be our preferred solution over First-Party Sets + Storage Access API (SAA), but it is not currently supported in Safari.

hober commented 8 months ago

@annevk this came up during last week's Privacy CG call. IIRC there was some cross-browser appetite for reducing the storage limit, though I'm probably misremembering. Have we gotten closer to support?

annevk commented 8 months ago

Indeed, based on discussions with colleagues we'd be happy to fully support this effort (i.e., label this as "position: support") provided we further reduce the overall per-site per-partition limit to 1 KiB and carefully define what happens when this limit is reached. I filed https://github.com/privacycg/CHIPS/issues/74 so that can be properly considered and further discussed.
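What happens when the limit is reached is exactly what still has to be defined, so the sketch below shows only one possible behaviour, chosen as an assumption for illustration: cookies are counted as name plus value length, a cookie that can never fit is rejected, and otherwise the oldest cookies in the partition are evicted. None of this is taken from the linked issue.

```python
from collections import OrderedDict

LIMIT_BYTES = 1024  # the per-site per-partition limit suggested above


class LimitedPartition:
    """Cookies of one site within one partition, under a byte budget (hypothetical policy)."""

    def __init__(self, limit: int = LIMIT_BYTES) -> None:
        self.limit = limit
        self.cookies = OrderedDict()  # insertion order doubles as age

    def size(self) -> int:
        return sum(len(name) + len(value) for name, value in self.cookies.items())

    def set(self, name: str, value: str) -> bool:
        if len(name) + len(value) > self.limit:
            return False  # assumed policy: a cookie that can never fit is rejected
        self.cookies.pop(name, None)  # re-setting a cookie makes it the newest
        self.cookies[name] = value
        while self.size() > self.limit:
            self.cookies.popitem(last=False)  # assumed policy: evict the oldest first
        return True


partition = LimitedPartition()
partition.set("a", "x" * 599)  # 600 bytes
partition.set("b", "y" * 599)  # 600 more bytes: over the limit, so "a" is evicted
too_big = partition.set("c", "z" * 2000)
print(list(partition.cookies), partition.size(), too_big)
```

Other choices, such as rejecting the new cookie instead of evicting an old one, are just as easy to model; the point is only that the behaviour needs to be specified.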

aselya commented 5 months ago

As mentioned in https://github.com/privacycg/CHIPS/issues/40#issuecomment-1883726735, the keying of the CHIPS PartitionKey attribute is changing to include a cross-site ancestor chain bit.
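As a rough sketch of what such a key could look like, the example below pairs the top-level site with a boolean. It assumes the bit is set whenever any frame below the top-level document, down to and including the one making the request, is cross-site to the top-level site. That reading is an assumption made for illustration, and the type and function names are hypothetical; the linked issue is the authority on the actual definition.

```python
from typing import NamedTuple, Sequence


class PartitionKey(NamedTuple):
    top_level_site: str
    cross_site_ancestor: bool


def partition_key(frame_sites: Sequence[str]) -> PartitionKey:
    """Compute a key from the sites of the frames, listed from the top-level
    document down to the frame making the request (assumed definition)."""
    top = frame_sites[0]
    cross_site = any(site != top for site in frame_sites[1:])
    return PartitionKey(top, cross_site)


top_level_a = partition_key(["a.example"])
b_in_a = partition_key(["a.example", "b.example"])
a_in_b_in_a = partition_key(["a.example", "b.example", "a.example"])
print(top_level_a, b_in_a, a_in_b_in_a)
```

Under this assumption, a.example nested inside b.example inside a.example ends up in a different partition than a.example loaded at the top level, even though the top-level site is the same.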