WebKit / standards-positions

WebKit's positions on emerging web specifications
https://webkit.org/standards-positions/

Shared Storage #10

Open marcoscaceres opened 2 years ago

marcoscaceres commented 2 years ago

Request for position on an emerging web specification

Information about the spec

Design reviews and vendor positions

Bugs tracking this feature

Anything else we need to know

On WebKit-dev, Eric Trouton wrote:

Hi Webkit-Dev Team,

We've been working on the Shared Storage API that supports a variety of use cases that may be impacted by 3rd Party Cookie Deprecation. The idea is to provide a storage API (named Shared Storage) that is intended to be unpartitioned. Origins can write to it from their own contexts on any page. To prevent cross-site tracking of users, data in Shared Storage may only be read in a restricted environment that has carefully constructed output gates. Over time, we hope to design and add additional gates.

We would like to hear what you think about it. Chrome is implementing it (available in Chrome Canary on M104), but we are open to evolving the API over time and would appreciate your feedback.

chrome status: https://chromestatus.com/feature/6256348582903808

spec: TBD

Thank you, Eric

hober commented 2 years ago

@beidson & @johnwilander, thoughts?

othermaciej commented 2 years ago

Seems this has a dependency on Fenced Frames, and also on Fenced Frame Opaque Source. We have implemented neither, and it's not clear if we will.

othermaciej commented 2 years ago

The privacy mitigation against leaking bits of entropy from selectURL() to a cross-site iframe is that the info can only be read by a fenced frame. To uphold privacy properties, this then relies on fenced frame protection against exfiltrating info.

The fenced frame explainer's Privacy Considerations sets a goal of preventing user ID flow into the fenced frame, but doesn't talk as much about preventing flow out. "Information channel between fenced frame and other frames" mentions a number of communication channels that have to be blocked, but isn't written as a requirement and does not purport to be exhaustive. "Fenced Frame Privacy Considerations" mentions that fenced frames can navigate the top level or open pop-ups (in "some modes", which are presumably documented in the separate fenced frame modes explainer). That is sufficient power to exfiltrate any bits gleaned from selectURL to an embedding page, and to chain multiple calls to selectURL to generate a cross-site ID.

It is difficult to be sure because the set of Fenced Frames explainers is too vague at this time to do a serious privacy analysis, but it seems like bits of entropy from selectURL can in fact be exfiltrated to the containing document, which defeats the intended privacy properties of Shared Storage.
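To make the chaining concern concrete, here is a minimal sketch of how a few bits per selectURL call could add up to a full identifier. It is a toy model, not the actual API: the 3-bits-per-call figure (8 candidate URLs) and the helper names `split_id` and `rebuild_id` are assumptions chosen for illustration.

```python
import math
from typing import List

BITS_PER_CALL = 3                     # assumption: 8 candidate URLs per call
URLS_PER_CALL = 2 ** BITS_PER_CALL


def calls_needed(id_bits: int) -> int:
    """Number of selectURL-style calls needed to leak an id_bits-wide ID."""
    return math.ceil(id_bits / BITS_PER_CALL)


def split_id(user_id: int, id_bits: int) -> List[int]:
    """Split an ID into per-call URL indices, least significant chunk first."""
    return [(user_id >> (BITS_PER_CALL * i)) % URLS_PER_CALL
            for i in range(calls_needed(id_bits))]


def rebuild_id(indices: List[int]) -> int:
    """Reassemble the ID from the URL indices observed outside the frame."""
    return sum(idx << (BITS_PER_CALL * i) for i, idx in enumerate(indices))
```

Under these assumptions, each call selects the URL whose index encodes one 3-bit chunk of the ID, and whoever collects the chosen indices can reassemble the whole value with `rebuild_id`.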

It's also concerning in general that this sketch-level explainer lies at the top of a stack of several other sketch-level explainers. Besides making it hard to evaluate privacy and security properties, it makes it hard to verify that what's described can be interoperably implemented, or really review anything else about it.

jkarlin commented 2 years ago

Hi folks. Happy to answer any questions and update documentation to add clarity.

You're correct that selectURL hinges on fenced frames to prevent data exfiltration, though the private aggregation API, the second output gate of Shared Storage, does not.

For selectURL, the idea is that a fair bit of first party data (a k-anonymous url), and a few (3) bits of cross-site data wind up in a fenced frame. And we want to prevent the data in the fenced frame from leaking out.

From a platform level, we want to prevent the data from leaking out to other parts of the page (or other tabs). We achieve this by forcing the fenced frame into its own browsing context group and storage partition. We're not pretending that we're going to catch every possible communication channel on first blush. This will take time to get right.

On user activation, the fenced frame is allowed to navigate the top frame or open a new tab/window. At that point, via link decoration, it's possible for the data to be conveyed to the destination page, and combined with the destination page's 1p cookies. And this is where we apply rate limiting. On top of requiring a user gesture to perform the navigation, there is a budget of the overall limit to the amount of cross-site data allowed to leak per-shared-storage-origin, per-user, per day. Once exceeded, selectURL will only return the first url in the list.
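The budget described above can be sketched roughly as follows. This is a simplified model and not Chrome's implementation: the class name `SelectUrlBudget`, the budget value, and the choice to charge the budget at call time (rather than at the user-activated navigation where the leak actually happens) are all assumptions made to keep the example short. The cost of a call is taken to be log2 of the number of candidate URLs, which matches the "few (3) bits" figure for 8 URLs.

```python
import math
from collections import defaultdict

DAILY_BUDGET_BITS = 12.0  # assumption: illustrative value only


class SelectUrlBudget:
    """Toy per-origin, per-day entropy budget for selectURL-style calls."""

    def __init__(self, daily_budget_bits=DAILY_BUDGET_BITS):
        self.daily_budget_bits = daily_budget_bits
        self.spent = defaultdict(float)  # (origin, day) -> bits charged so far

    def select_url(self, origin, day, urls, chosen_index):
        """Return the chosen URL, or the first URL once the budget is spent."""
        cost = math.log2(len(urls))  # bits revealed by picking 1 of N URLs
        if self.spent[(origin, day)] + cost > self.daily_budget_bits:
            return urls[0]           # fallback: default URL, no new bits
        self.spent[(origin, day)] += cost
        return urls[chosen_index]
```

With a 6-bit budget and 8 candidate URLs, the first two calls on a given day return the chosen URL and the third falls back to `urls[0]`; the budget is fresh again on the next day.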

An alternative, which we're considering but don't yet have a model that we're comfortable with, is to keep the destination page within the fenced partition, so that it never gets to combine the cross-site information with its persistent 1p data. The tricky thing with this is that the user is in an unexpected browsing state (say the destination page is a shopping site they frequent, but now it’s not logged in because it’s in this fenced partition, and the user doesn’t understand why).

Finally, we also have a timing attack problem (as does every navigation on the web). That is, the time that the fenced frame is created or resources are fetched can be linked to the time that the embedder created the fenced frame, and the two can collude to share their information. Again, this is not a problem unique to fenced frames, but one that might be more easily addressed in fenced frames due to the fact that they're a new environment with heightened privacy requirements. This can be mitigated if the fenced frame is denied any network access (e.g., loaded via navigable web bundles), or required to retrieve resources from some sort of trusted caching service that promises to only provide logs in aggregate.
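As a rough illustration of the timing attack, the sketch below joins two colluding parties' logs by timestamp. Everything here is hypothetical: the log shapes, the `link_by_time` helper, and the 0.5-second window are assumptions, and a real attack would have to cope with noise and concurrent users.

```python
from bisect import bisect_left


def link_by_time(embedder_events, frame_events, window_s=0.5):
    """Pair each frame event with the closest embedder event within window_s.

    embedder_events: list of (timestamp, embedder_user_id), sorted by timestamp
    frame_events:    list of (timestamp, frame_payload)
    """
    times = [t for t, _ in embedder_events]
    links = []
    for t, payload in frame_events:
        i = bisect_left(times, t)
        # Candidates: the embedder events just before and just after t.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(times[j] - t))
        if abs(times[best] - t) <= window_s:
            links.append((embedder_events[best][1], payload))
    return links
```

If the fenced frame makes no network requests of its own, or its fetches are only visible in aggregate, the frame-side log in this sketch has nothing per-user to join on, which is the intuition behind the mitigations mentioned above.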

I hope this helps, and we're always open for discussions and calls.

othermaciej commented 2 years ago

> On user activation, the fenced frame is allowed to navigate the top frame or open a new tab/window. At that point, via link decoration, it's possible for the data to be conveyed to the destination page, and combined with the destination page's 1p cookies. And this is where we apply rate limiting. On top of requiring a user gesture to perform the navigation, there is a budget of the overall limit to the amount of cross-site data allowed to leak per-shared-storage-origin, per-user, per day.

Is a per-day rate limit sufficient? It seems like sites that users visit often would be able to extract an arbitrary amount of data over time.

> Once exceeded, selectURL will only return the first url in the list.

Seems like this would potentially leak the fact that the daily limit has been exceeded (for example if the first URL is one that is never normally selected).

jkarlin commented 2 years ago

Hey, thanks for your questions/comments Maciej.

> Is a per-day rate limit sufficient? It seems like sites that users visit often would be able to extract an arbitrary amount of data over time.

That is unfortunately the physics of privacy technology. Any budget that resets over time (which seems necessary?) can leak more information over time.
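A back-of-the-envelope sketch of that trade-off, assuming a hypothetical daily budget of 12 bits and an adversary who manages to spend the full budget every day:

```python
import math


def bits_to_distinguish(population: int) -> int:
    """Bits needed to give every member of a population a unique ID."""
    return math.ceil(math.log2(population))


def days_to_leak(id_bits: int, daily_budget_bits: float) -> int:
    """Days needed to leak id_bits when the budget resets every day."""
    return math.ceil(id_bits / daily_budget_bits)
```

For example, `bits_to_distinguish(8_000_000_000)` is 33, and `days_to_leak(33, 12)` is 3, so under these assumptions a frequently visited site could accumulate a globally unique ID in a few days. A smaller budget stretches that time out but does not bound the total.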

> Seems like this would potentially leak the fact that the daily limit has been exceeded (for example if the first URL is one that is never normally selected).

On Fenced Frame click, yes, the fact that the budget was exceeded is leaked to the target site.

othermaciej commented 1 year ago

Also adding the new concerns: dependencies label based on my previous comment that “It's also concerning in general that this sketch-level explainer lies at the top of a stack of several other sketch-level explainers.”

jyasskin commented 1 year ago

Note that Shared Storage and Fenced Frames now have specifications, unlike when you wrote that previous comment. The dependency may still be a concern, of course, and any second implementation is likely to find problems in a specification.