WICG / first-party-sets

https://wicg.github.io/first-party-sets/

cache partitioning impacts perf optimizations that FPS might help recover #35

Open erik-anderson opened 3 years ago

erik-anderson commented 3 years ago

After Edge and Chrome started deploying HTTP cache partitioning, the SharePoint team contacted the Edge team about a perf optimization of theirs that cache partitioning broke.

The scenario is roughly this:

Imagine a folder viewer for a cloud storage drive. It contains Word, Excel, and PowerPoint documents which can be previewed directly from the same page. To do that, they load the viewer for those apps in an iframe.

To speed up the viewer load, they kick off a download of a JS file that they know the iframe will turn around and load (basically think of it as a prefetch). The time between kicking off that download and loading up the iframe is very small.

The domains involved here would be a top-level one like https://microsoft.sharepoint.com and an example of an iframe URL is https://excel.officeapps.live.com. The resource itself would be https://some.cdn.office.net/path/to/script.js.

Since the cache is double keyed now, the top-level page fetching the script doesn't provide any benefit for the iframe which has a different partitioning key.

How FPS might help:

If they could use First-Party Sets to call out that the SharePoint and the Excel origins are part of the same FPS and share the same partitioning key, they would recover this optimization.

One interesting challenge is that the registrable domain restriction might be too coarse. All of those domains are owned by Microsoft, but perhaps an individual owner like "the Office team" would want to configure a set for sharepoint.com+officeapps.live.com while some other part of Microsoft might want to have the rest of live.com in some other FPS. Perhaps the fix there is to have them change the domains their endpoints are on so there's clearer sub-org ownership. Or maybe there's some model we should explore where subdomains of registrable domains can be part of a set.

Resource Hints are one potential way for them to recover some of their optimization, but it still wouldn't address/recover the additional roundtrip.

I'm opening this issue in the hope of discussing further how FPS might help this scenario.

othermaciej commented 3 years ago

I don't think First Party Sets is necessary for this. The behavior described sounds like triple-keying rather than double-keying; otherwise, the page and a frame it embeds would both be getting resources from the same cache partition.

erik-anderson commented 3 years ago

The way Chromium has implemented cache partitioning is described in https://developers.google.com/web/updates/2020/10/http-cache-partitioning.

It's keying based on top-level origin + iframe origin.

The relevant example from that article:

Cache Key: { https://a.example, https://a.example, https://x.example/doge.png }

Now the user comes back to https://a.example but this time the image (https://x.example/doge.png) is embedded in an iframe. In this case, the key is a tuple containing https://a.example, https://a.example, and https://x.example/doge.png and a cache hit occurs. (Note that when the top-level site and the iframe are the same site, the resource cached with the top-level frame can be used.)

Cache Key: { https://a.example, https://c.example, https://x.example/doge.png }

The user is back at https://a.example but this time the image is hosted in an iframe from https://c.example.

In this case, the image is downloaded from the network because there is no resource in the cache that matches the key consisting of https://a.example, https://c.example, and https://x.example/doge.png.

I realize that Safari uses only top-level eTLD+1. Perhaps there's a mismatch in what folks mean when they say double-keying.

I agree that, with Safari's current approach, this perf issue would not exist.
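To make the mismatch concrete, here is a minimal TypeScript sketch of the two keying schemes applied to the SharePoint scenario. The type and function names are hypothetical, and the sites are hard-coded on the assumption that the registrable domains are sharepoint.com and live.com; a real browser derives sites from the Public Suffix List.

```typescript
// Hypothetical model of a fetch; sites are precomputed rather than derived.
interface FetchContext {
  topLevelSite: string; // site of the top-level document
  frameSite: string;    // site of the frame issuing the request
  url: string;          // URL of the resource being fetched
}

// Keying on the top-level site only (the Safari behavior described above).
function topLevelKey(ctx: FetchContext): string {
  return JSON.stringify([ctx.topLevelSite, ctx.url]);
}

// Keying on top-level site + frame site (the Chromium behavior described above).
function topAndFrameKey(ctx: FetchContext): string {
  return JSON.stringify([ctx.topLevelSite, ctx.frameSite, ctx.url]);
}

// A cached entry is reusable only if the lookup produces the same key.
function wouldHit(
  keyFn: (ctx: FetchContext) => string,
  stored: FetchContext,
  lookup: FetchContext,
): boolean {
  return keyFn(stored) === keyFn(lookup);
}

const script = "https://some.cdn.office.net/path/to/script.js";

// The top-level SharePoint page prefetches the script.
const prefetchFromPage: FetchContext = {
  topLevelSite: "https://sharepoint.com",
  frameSite: "https://sharepoint.com",
  url: script,
};

// The embedded Excel viewer iframe then requests the same script.
const loadFromIframe: FetchContext = {
  topLevelSite: "https://sharepoint.com",
  frameSite: "https://live.com", // assumed site of excel.officeapps.live.com
  url: script,
};

const hitWithTopLevelOnly = wouldHit(topLevelKey, prefetchFromPage, loadFromIframe);   // true
const hitWithTopAndFrame = wouldHit(topAndFrameKey, prefetchFromPage, loadFromIframe); // false
```

Under these assumptions the prefetch is reusable with top-level-only keying and wasted once the frame site becomes part of the key, which is the regression the SharePoint team reported.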

annevk commented 3 years ago

This seems not advisable as partitioning is a security boundary and helps to defeat attacks outlined at https://xsleaks.dev/. (I still don't think FPS should exist, but thought I should note this since it was agenda+'d.)

krgovind commented 3 years ago

@annevk Apologies for being a bit pedantic; but I'm trying to tease apart any nuances that I may be missing: The fact that the partition key uses top-frame and current frame "site" and not "origin" makes me think that FPS could be a reasonable replacement? (Since the main premise of FPS is that "site" currently relies on "registrable domain" which is an outdated definition based on the DNS).

Isn't "origin" the agreed upon "security boundary"?

annevk commented 3 years ago

I think the idea that FPS could replace the current meaning of site is misguided. Site as it is defined today is very much a security boundary in its own right (it's what we key agent clusters on, see HTML) and serves as a process boundary in Chrome and soon Firefox. Concretely, you would not want an XSS in youtube.com to be able to (side channel) read accounts.google.com.

davidben commented 3 years ago

FPS should not blanket replace every use of "site" in the platform. @annevk is right that sites are security boundaries elsewhere in the platform. E.g. the process allocation business is a consequence of how document.domain behaves. FPS should leave that alone.

bslassey commented 3 years ago

I think the better way to think of FPS is retaining existing properties of 3rd parties (e.g. cross domain cookie access and shared caching of resources) within a set of domains as we reduce the capabilities of 3rd parties in general (e.g. blocking third party cookies and partitioning the various caches by first party).

erik-anderson commented 3 years ago

The discussion this issue was intended to focus on is very specific: whether FPS should impact cache partitioning.

It wasn't intended to suggest modifying any existing security boundaries (or semi-boundaries) that existed before browsers started shipping keyed cache partitions.

For the specific customer that prompted this issue, it's specific to double keyed cache partitioning (top-level origin + frame origin), though it's worth discussing the more general top-level-origin keying case as well given the broader "common resources living on a CDN" use case.

annevk commented 3 years ago

Right, I'm saying that cache partitioning is a security boundary.

erik-anderson commented 3 years ago

We discussed this issue during today's Privacy CG call.

Information leaks, accidental or otherwise, are a primary concern driving partitioning. Much of it is around privacy, but some of it is security as well.

As the discussion today covered, there are many other APIs with similar concerns beyond cache that aren't effectively isolated within iframes, so it's not a current priority for Mozilla to add the additional level of keying that's in Chromium though it's likely to be desirable.

The primary thing I would like to understand better w.r.t. this specific issue is how strongly the concerns apply when both the top-level window and the embedded frame are explicitly a part of the same FPS. The two sites could presumably choose to explicitly pass context across via whatever cross-site messaging flows get unblocked by FPS (e.g. cookies with a specific attribute or something else) in addition to using existing postMessage flows available to embedding scenarios. Is the outstanding concern, then, that sharing a cache key can have subtle side-channel implications that site developers are unlikely to sufficiently understand?

If FPS (or, perhaps something outside of FPS, e.g. some CORS-like solution) offered a more explicit pattern where the sites could proactively agree to some URL patterns to share cache between them (with the possible scoping of it to "share by using the top-level site's key" rather than "fully share across the two top-level sites"), would that significantly address the concerns?

To map that high-level thought to my original example, if the FPS definition for the sites could include something that says "we desire that all URLs under https://some.cdn.office.net (which may or may not be part of the FPS) get shared", then when a URL is fetched within the subframe context of https://excel.officeapps.live.com, the browser might choose to alter the cache partitioning key from https://microsoft.sharepoint.com+https://excel.officeapps.live.com+https://some.cdn.office.net/resource.js to https://microsoft.sharepoint.com+https://some.cdn.office.net/resource.js (when the top-level URL is on https://microsoft.sharepoint.com).
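As a rough illustration of that key rewrite, here is a TypeScript sketch. The declaration shape, its field names, and the `partitionKey` function are all hypothetical; nothing like this exists in the FPS proposal today, and the sketch keys on origins only because the example above does.

```typescript
// Hypothetical opt-in declaration; not part of any spec or proposal.
interface SharedCacheDeclaration {
  members: string[];           // origins that belong to the same set
  sharedUrlPrefixes: string[]; // URL prefixes the members agree to share
}

const officeSet: SharedCacheDeclaration = {
  members: [
    "https://microsoft.sharepoint.com",
    "https://excel.officeapps.live.com",
  ],
  // Trailing slash so that look-alike hosts such as
  // https://some.cdn.office.net.evil.example do not match the prefix.
  sharedUrlPrefixes: ["https://some.cdn.office.net/"],
};

// Returns the components of the cache partition key for one fetch.
function partitionKey(
  decl: SharedCacheDeclaration,
  topLevel: string,
  frame: string,
  url: string,
): string[] {
  const bothInSet = decl.members.includes(topLevel) && decl.members.includes(frame);
  const urlIsShared = decl.sharedUrlPrefixes.some((prefix) => url.startsWith(prefix));
  // Drop the frame component only when both parties and the URL opted in;
  // the top-level component always stays, so nothing is shared across
  // different top-level sites.
  return bothInSet && urlIsShared ? [topLevel, url] : [topLevel, frame, url];
}

// Opted-in CDN resource fetched from the Excel iframe: frame component dropped.
const sharedKey = partitionKey(
  officeSet,
  "https://microsoft.sharepoint.com",
  "https://excel.officeapps.live.com",
  "https://some.cdn.office.net/resource.js",
);

// Any other resource keeps the full top-level + frame + URL key.
const unsharedKey = partitionKey(
  officeSet,
  "https://microsoft.sharepoint.com",
  "https://excel.officeapps.live.com",
  "https://other.example/resource.js",
);
```

In this sketch `sharedKey` matches the key the top-level page itself would produce for the same URL, which is what would let the prefetch be reused.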

I'm not sure the complexity of such an approach would actually be warranted, but I would like to get some clarity on what folks consider to be a reasonable solution space in an environment where we're worried about side channel data leaks between various frames loaded under the context of a single site.

krgovind commented 3 years ago

Answering @erik-anderson's question about what could be a reasonable solution space (not speaking to the solution proposed above): My understanding is that an explicit pattern would indeed alleviate the security concerns around allowing same-party, cross-domain sharing of the cache.

chrisn commented 3 years ago

We would be interested in potential solutions for the "common resources living on a CDN" use case.

domenic commented 3 years ago

I hope it is OK to discuss non-FPS-related solutions in this thread. But the OP's situation sounds like exactly the sort of thing we're trying to accommodate over in https://github.com/jeremyroman/alternate-loading-modes : providing a privacy- and partition-preserving mechanism for prefetching (and prerendering) content.

We've spent most of our time thinking about whole documents and their subresources, which might not make as much sense in the subresource-focused situation described here. But I suspect many of the mechanisms and underlying spec could be used even for subresources.

Some more details:

To give a high-level flavor of our current thinking for whole-documents: the prerender or prefetch would be done without any credentials, and would put all its resources into a separate "speculative" HTTP cache partition, e.g. with key `{ https://excel.officeapps.live.com, https://some.cdn.office.net/, speculative = true }`. Upon activation, i.e. upon transitioning of the prerendered page on `https://excel.officeapps.live.com` to being a user-visible top-level browsing context, the `{ https://excel.officeapps.live.com, https://some.cdn.office.net/, speculative = true }` partition would get "merged" into the usual `{ https://excel.officeapps.live.com, https://some.cdn.office.net/, speculative = false }` partition. "Merged" here is under active discussion, and might look more like a fallback or memory cache or something, but the basic idea is to allow use of the speculative resources upon activation, and throw them out otherwise.
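A toy TypeScript model of that idea, purely as a sketch: the class and method names are hypothetical, and "merge" is implemented as a simple copy into the regular partition, which is only one of the options mentioned above.

```typescript
// Hypothetical in-memory model of a cache with a "speculative" flag in its key.
class SpeculativeCache {
  // Outer key: JSON of [topLevelSite, speculative]; inner map: URL -> body.
  private partitions = new Map<string, Map<string, string>>();

  private static key(site: string, speculative: boolean): string {
    return JSON.stringify([site, speculative]);
  }

  put(site: string, speculative: boolean, url: string, body: string): void {
    const k = SpeculativeCache.key(site, speculative);
    let partition = this.partitions.get(k);
    if (!partition) {
      partition = new Map<string, string>();
      this.partitions.set(k, partition);
    }
    partition.set(url, body);
  }

  get(site: string, speculative: boolean, url: string): string | undefined {
    return this.partitions.get(SpeculativeCache.key(site, speculative))?.get(url);
  }

  // Activation: copy speculative entries into the regular partition
  // (without overwriting existing entries), then drop the speculative one.
  activate(site: string): void {
    const speculative = this.partitions.get(SpeculativeCache.key(site, true));
    if (!speculative) return;
    speculative.forEach((body, url) => {
      if (this.get(site, false, url) === undefined) {
        this.put(site, false, url, body);
      }
    });
    this.partitions.delete(SpeculativeCache.key(site, true));
  }

  // The prerender was never shown to the user: throw the entries away.
  discard(site: string): void {
    this.partitions.delete(SpeculativeCache.key(site, true));
  }
}

const cache = new SpeculativeCache();
const site = "https://excel.officeapps.live.com";
const resource = "https://some.cdn.office.net/";

cache.put(site, true, resource, "prefetched body");
const beforeActivation = cache.get(site, false, resource); // undefined
cache.activate(site);
const afterActivation = cache.get(site, false, resource);  // "prefetched body"
```

Until activation, the regular partition cannot observe anything the speculative fetch stored, which is the partition-preserving property the mechanism is after.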

Since this is the FPS repo, probably we shouldn't dig too deep into the details of a prerender/prefetch mechanism here, but please feel free to open an issue at https://github.com/jeremyroman/alternate-loading-modes if you think this might be worth exploring...

mostafalarki1368 commented 6 months ago

After Edge and Chrome started deploying HTTP cache partitioning, the SharePoint team contacted the Edge team about a performance optimization that cache partitioning had broken.

The scenario is roughly this:

Imagine a folder viewer for a cloud storage drive. It contains Word, Excel, and PowerPoint documents that can be previewed directly from the same page. To do that, they load the viewer for those apps in an iframe.

To speed up the viewer load, they download a JS file that they know the iframe will turn around and load (basically think of it as a prefetch). The time between starting that download and loading the iframe is very short.

The domains involved here would be a top-level one like https://microsoft.sharepoint.com, and an example of an iframe URL is https://excel.officeapps.live.com. The resource itself would be https://some.cdn.office.net/path/to/script.js.

Since the cache is now double keyed, the top-level page fetching the script provides no benefit for the iframe, which has a different partitioning key.

How FPS might help:

If they could use First-Party Sets to declare that the SharePoint and Excel origins are part of the same FPS and share the same partitioning key, they would recover this optimization.

There might be an interesting challenge in that the registrable domain restriction might be too specific. All of those are owned by Microsoft, but perhaps an individual owner like "the Office team" would want to configure it for sharepoint.com+officeapps.live.com while some other part of Microsoft might want to have the rest of live.com in some other FPS. Perhaps the fix is for them to change the domains their endpoints are on so there is clearer sub-org ownership. Or maybe there is a model in which subdomains of registrable domains can be part of a set, which we should explore.

Resource Hints are one potential way for them to recover some of their optimization, but they still would not address/recover the additional roundtrip.

I'm opening this issue in the hope of discussing further how FPS might help this scenario.
