w3c / webextensions

Charter and administrivia for the WebExtensions Community Group (WECG)

Proposal: API to embed pages in WebExtension bypassing CSP #483

Open OlegWock opened 8 months ago

OlegWock commented 8 months ago

Problem

Some extensions would like to embed third-party sites inside the extension interface (popup / side panel / separate extension page). Currently, this is done using <iframe>, but since most sites restrict embedding them in iframes (using Content-Security-Policy or X-Frame-Options), extensions need to resort to different workarounds to make it work.

Current solution

The common current solution is to use declarativeNetRequest (Manifest V3) or webRequestBlocking (Manifest V2) to intercept requests to the target site and modify (or strip) CSP and X-Frame-Options headers to allow embedding pages in iframes.
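As a rough illustration of the Manifest V3 workaround described above, the sketch below builds a declarativeNetRequest rule that removes both headers for sub-frame loads of one domain. The helper name `buildEmbedRule`, the local `EmbedRule` type, and the `example.com` domain are made up for this example. The rule is registered as a session rule only in the commented-out call, which assumes an extension with the appropriate declarativeNetRequest permission and host access to the domain.

```typescript
// Minimal local types mirroring the part of the declarativeNetRequest
// rule shape used in this sketch (hypothetical names, not an official typing).
type HeaderOp = { header: string; operation: "remove" };

interface EmbedRule {
  id: number;
  priority: number;
  action: { type: "modifyHeaders"; responseHeaders: HeaderOp[] };
  condition: { requestDomains: string[]; resourceTypes: string[] };
}

// Build a rule that strips the embedding-related response headers for one
// domain, restricted to documents loaded in sub-frames.
function buildEmbedRule(id: number, domain: string): EmbedRule {
  return {
    id,
    priority: 1,
    action: {
      type: "modifyHeaders",
      responseHeaders: [
        { header: "content-security-policy", operation: "remove" },
        { header: "x-frame-options", operation: "remove" },
      ],
    },
    condition: { requestDomains: [domain], resourceTypes: ["sub_frame"] },
  };
}

const rule: EmbedRule = buildEmbedRule(1, "example.com");

// Inside an extension, the rule could then be registered like this:
// chrome.declarativeNetRequest.updateSessionRules({
//   removeRuleIds: [rule.id],
//   addRules: [rule],
// });
```

Note that removing the whole Content-Security-Policy header also drops every other protection the site configured through it, which is one reason this workaround is unattractive.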

This approach, however, has a number of problems:

Ideal solution

The ideal solution should:

  1. Allow extensions to embed third-party sites without explicit permission from the site (no opt-in) and without a way for the site to prohibit embedding it in the extension (no opt-out).
  2. Embedded pages should be top-level, and from the perspective of an embedded page, it should look like the user just opened the site in a browser tab (i.e., no window.top or other links to parent context).
  3. An embedded page should share a storage partition with pages from the same origin, as if it were opened in a separate tab. I.e., cookies, localStorage, OPFS and other origin-specific data should be shared.
  4. Existing limitations for extensions should apply:
    • Any registered content scripts matching the URL in the frame should be injected there.
    • If the extension doesn’t have host permission for the page in the frame, it shouldn’t be able to interact with the page or exfiltrate any data from it (e.g. the redirect URL).

Possible solutions

There are a few novel embedding techniques (in addition to the existing iframe). Unfortunately, none of them fully satisfies the criteria from the previous section.

Fenced Frame

Fenced frames allow embedding cross-origin pages and enforce boundaries between embedder and embedded contexts. However, to be displayed in a fenced frame, the site needs to opt in by providing Supports-Loading-Mode: fenced-frame header in the response. And because of this, a fenced frame doesn’t solve the original problem, as it lets site owners control if their site can be displayed in a fenced frame, and I expect most of the sites will prohibit it.
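To make the opt-in concrete, here is a small sketch of how one might check whether a response opts in. The function name `optsIntoFencedFrame` is invented for this example, and the parsing is deliberately simplified (a comma split rather than a full structured-header parser), so treat it as an approximation of the check rather than what a browser actually does.

```typescript
// Returns true if a Supports-Loading-Mode header value lists the
// "fenced-frame" token. Simplified parsing: split on commas and trim.
function optsIntoFencedFrame(headerValue: string | undefined): boolean {
  if (headerValue === undefined) {
    return false; // no header means no opt-in
  }
  return headerValue
    .split(",")
    .map((token) => token.trim())
    .some((token) => token === "fenced-frame");
}
```

A site that never sends the header (the default for nearly every site today) fails this check, which is why fenced frames do not help with the problem above.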

Controlled Frame

Controlled frame is a feature of Isolated Web Apps (IWA) that allows developers to embed a page into their app, bypassing the CSP or X-Frame-Options of the embedded site. However, there are a few important features of controlled frames that aren’t acceptable in a web extension context.

IWA can access and monitor user actions in a controlled frame: extract cookies from the frame, observe activity like keyboard events, etc. This is acceptable for IWA since a site open in a controlled frame gets its own storage partition, so a page in a controlled frame (and so IWA) won’t get access to cookies and other data associated with origin as if it was open in a separate browser tab.

This doesn’t work well for web extensions, as extensions are more integrated into the browser UI than IWAs and users expect to see the same version of the site in the extension and in a separate tab. Having separate storage partitions makes it impossible.

And even if controlled frames in extensions still used a separate partition, allowing extensions to control frames for hosts they don’t have permissions for would also go against the current security model for web extensions.

Portals

Portals are intended for the pre-rendering of content and are even more restrictive than iframes: users can’t interact with content, the embedded page can’t access any API that requires permissions, etc.

Webview

Webview (docs 1, 2) is a special tag that is available for Chrome Apps (which are now deprecated) and allows embedding of third-party sites inside the app. The embedder can control the embedded page to a great extent: inject CSS/JS, listen to events like console messages, or manipulate history. The embedder is also able to control which partition the webview should use to store user data.

Given that it's already implemented in Chrome and closely matches our requirements, webview looks like the most realistic of the current solutions. It's important to note, though, that Chrome Apps required a separate webview permission to use this element. For extensions this behavior should be altered: using webview shouldn't require a separate permission, but to access the site inside a webview, the extension should have the respective host permission.


So far webview seems to be the closest to ideal, however it still requires modifications to work well in extension's context. With this info I'd like to propose 4 different solutions for discussion.

  1. Making <webview> available for extensions and altering its behavior to better match the extension security model.
  2. Altering the behavior of controlled frame when rendered in an extension's context: use the same storage partition and limit the extension's access to the embedded page.
  3. Adding a new extension-only attribute to iframe which, if present, alters iframe behavior so it behaves more like a browser tab (bypassing the frame-ancestors CSP, all cookies are sent to the page, no window.top, etc.).
  4. Adding a new HTML element <isolatedframe> which behaves as described in the 'Ideal solution' section and is available only for extensions.
dotproto commented 8 months ago

Thanks for opening this. We've discussed the possibility of giving developers more control over CSP in declarativeNetRequest, but as I recall Chrome objected to this because CSP is not only controlled by request headers; it can also be set in META tags inside a document's response body. In addition, there were other challenges associated with embedding another page in an iframe that you've already called out. At the time, a suggestion was made to open a new issue to track the more generic request to improve embedding other web pages inside an extension page, but I don't think we actually filed such an issue.

  • declarativeNetRequest doesn’t intercept responses served by the service worker; those responses will contain the original CSP header. The only way for the extension to work around this is to remove any registered service workers for the website (using browsingData API) each time before embedding it into an iframe.

This is also true of webRequest. Both APIs were designed as an abstraction over the browser's network layer rather than as a generic interception mechanism for requests initiated by a page or worker context.


I'd like to add another possible solution to the list

WebViews

The Chrome App platform introduced the concept of a <webview> tag "to actively load live content from the web over the network and embed it in your Chrome App" (docs). This tag is used in chrome:// pages and in Chrome Apps (deprecated). This element is also exposed in Electron (docs). See this Igalia blog post for additional notes about the tag.

There was some discussion about bringing the webview tag to Extensions in issue 422805. This issue was closed as WontFix in 2015.

hanguokai commented 8 months ago

Just add another discussion link. We discussed this problem a few months ago, including webview, Controlled Frame and Fenced Frames.

oliverdunk commented 8 months ago

Thanks so much for filing this @OlegWock. It is by far the most comprehensive summary of the situation that I've seen! I suspect we are going to need quite a bit of discussion here but this is definitely a good starting point.

OlegWock commented 8 months ago

@dotproto thanks, I updated original comment to include info about webview (& fixed some typos)

I didn't know about <webview> but it looks really promising

dotproto commented 7 months ago

Thinking out loud, what if we tweaked how browsers handle CSP and X-Frame-Options? If an extension has host permissions for a site and a specific (new?) permission, we could treat top-level browsing contexts on the extensions origin as an allowed ancestor for that site. For CSP, perhaps the extension's origin could be implicitly included in the ancestor-source-list for each frame-ancestors directive in the page's CSP list. For X-Frame-Options, perhaps we could simply bypass the frame ancestor check.
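A minimal sketch of the frame-ancestors half of this idea, under some assumptions: the function name, its parameters, and the handling of `'none'` are all invented here for illustration. It models "implicitly including the extension's origin" as producing an effective source list from the declared one, and does not reflect any browser's actual implementation.

```typescript
// Compute the effective frame-ancestors source list for a page, given the
// sources the page declared and what the embedding extension has been granted.
function effectiveFrameAncestors(
  declaredSources: string[],
  extensionOrigin: string,
  hasHostPermission: boolean,
  hasEmbedPermission: boolean, // the hypothetical "specific (new?) permission"
): string[] {
  if (!hasHostPermission || !hasEmbedPermission) {
    return declaredSources.slice(); // nothing changes without both grants
  }
  // Assumption: 'none' cannot coexist with another source, so it is dropped
  // once the extension origin is added.
  const withoutNone = declaredSources.filter((source) => source !== "'none'");
  return withoutNone.concat([extensionOrigin]);
}
```

For example, a page declaring `frame-ancestors 'self'` would be treated as if it had declared `'self'` plus the extension's origin, but only for an extension holding both grants.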

OlegWock commented 7 months ago

How would this work with frame busters? Currently, even if CSP is patched, page can figure out if it's displayed in iframe and refuse to load, for example.
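For context, the classic frame-buster check is just a comparison of `window.top` and `window.self`. The sketch below isolates that check in a function taking a window-like object (the `WindowLike` type and function name are invented for this example) so that it can be read and exercised outside a browser.

```typescript
// Only the two properties the check needs.
interface WindowLike {
  top: unknown;
  self: unknown;
}

// A page is framed when its top-level window is not its own window.
function isFramed(win: WindowLike): boolean {
  return win.top !== win.self;
}

// In a real page this would be called as isFramed(window), typically followed
// by refusing to render or by navigating the top-level window away.
```

Because this runs in the page's own script, relaxing response headers does nothing to stop it.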

And if I understand correctly, with this approach, browser won't pass SameSite=Lax/Strict cookies to the site?

oliverdunk commented 7 months ago

How would this work with frame busters? Currently, even if CSP is patched, page can figure out if it's displayed in iframe and refuse to load, for example.

An open question we have is what the scope of this change should be. Bypassing CSP and X-Frame-Options works in most cases where a site is just trying to avoid other sites embedding it, and would be significantly easier than creating a true frame-buster proof mechanism. While I can see the value in both it may be worth pursuing the former first since something is better than nothing.

And if I understand correctly, with this approach, browser won't pass SameSite=Lax/Strict cookies to the site?

I think this could be solved separately. There are similar concerns with third-party cookie deprecation that we may be able to solve with host permissions (see where this link goes and the storage section in the same doc): https://developer.chrome.com/docs/extensions/mv3/storage-and-cookies/#cookies-partitioning

oliverdunk commented 3 months ago

We discussed this at our in-person meeting in San Diego. There are two related goals:

  1. The ability to embed a website which does not want to be embedded in general, but is not actively trying to block extensions.
  2. The ability to embed a website such that it cannot tell it is being embedded.

We agreed that it is desirable to solve both but that (1) is much easier than (2), and would solve a significant number of cases we have heard from developers in a much shorter time frame, so we'd like to start with that.

Our preferred approach for this is to simply ignore restrictions like CSP, X-Frame-Options and COEP. This would require host permissions, and initially we will only implement this if the iframe is in a top-level extension frame, to avoid possible attacks that involve an extension page being embedded in a third-party site (with a.com -> extension page -> b.com, a.com can navigate b.com unexpectedly). We think it makes sense to have this behavior without needing to opt in, for simplicity.
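The conditions above can be sketched as a single predicate. This is only an illustration of the rule as described in this comment: the function name and the representation of the ancestor chain as a list of origins (top-level frame first) are assumptions, not part of any proposal text.

```typescript
// Decide whether embedding restrictions should be ignored for an iframe.
// ancestorOrigins lists the origins of the frames above the iframe,
// ordered from the top-level frame down to the iframe's direct parent.
function shouldIgnoreEmbeddingRestrictions(
  ancestorOrigins: string[],
  extensionOrigin: string,
  hasHostPermission: boolean,
): boolean {
  // The extension must hold host permissions for the embedded site.
  if (!hasHostPermission) {
    return false;
  }
  // The iframe's only ancestor must be a top-level extension page, which
  // rules out chains like a.com -> extension page -> b.com.
  return ancestorOrigins.length === 1 && ancestorOrigins[0] === extensionOrigin;
}
```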

I'm going to take the action item of writing a more formal proposal for this.

yankovichv commented 3 months ago

It seems like today is Christmas :)

The first option is very good. It will solve most of the difficulties.

However, will there be problems with cookies because the site frame is in the extension frame?

However, the clever Spotify developers make me ask another question: are there any plans for a second option? Will it develop, or most likely not?

oliverdunk commented 3 months ago

@yankovichv, glad this sounds good 🎉

However, will there be problems with cookies because the site frame is in the extension frame?

At least in Chrome, frames rendered immediately below a top-level chrome-extension:// page always get cookies as though they were the top frame. There's some more on this here.

However, the clever Spotify developers make me ask another question: are there any plans for a second option? Will it develop, or most likely not?

Everyone seemed supportive. We also all agreed it was a lot trickier though, so I'm not sure if anything will happen in the short term.

hanguokai commented 2 months ago

I stumbled across a Chromium source code comment and a bug.

// Extensions can load their own internal content into the document. They
// shouldn't be blocked by the document's CSP.
//
// There is an exception: CSP:frame-ancestors. This one is not about allowing a
// document to embed other resources. This is about being embedded. As such
// this shouldn't be bypassed. A document should be able to deny being embedded
// inside an extension.
// See https://crbug.com/1115590
bool ShouldBypassContentSecurityPolicy() {}

So you are suggesting to apply CSP for frame-ancestor no matter the extension?

Yes.

Mike suggested to do it only if the extension has access to the embedded document. I am not sure to see what it really means. Is there any meaningful context where we can say that?

Yes. There is a concept of host permissions which an extension can specify and a user can modify (https://developer.chrome.com/extensions/runtime_host_permissions). IIUC the suggestion was for the extension to be able to embed a frame if it had access to a frame regardless of X-Frame-Options/frame-ancestors. I think this is a good thing to do given that:

  • There are legit use cases for an extension to be able to embed a frame and if it has host permissions to a frame then its reasonable to bypass the frame-src and X-Frame-Options restriction.
  • Currently if an extension has access to a page, it can already modify its X-Frame-Options and CSP header using the web request API to make this possible. However this is not ideal and in Manifest V3, we are hoping to prevent the extension from relaxing the CSP. If we automatically allow the extension to embed frames to which it has permission, it won't need to modify these headers.

In my understanding, this is consistent with what Oliver said earlier. At present (after that bug was fixed), extensions cannot bypass a web page's X-Frame-Options/frame-ancestors. But ideally, if the extension has host permission for the page, embedding it in an iframe should be allowed automatically. This would save developers from forcibly relaxing CSP.

dscham commented 2 months ago

Hi, I stumbled on this discussion because CSP and X-Frame-Options just stopped my idea in its tracks.

I get why those are useful for normal web use. I even get why they should work for extensions. On the other hand, since you could bypass them anyway, by changing the response in a worker, I don't see why to obey them at all in an extension. Of course, you could see it the other way around and block being able to change those at all. But that also limits some use cases.

Mine is that I have hundreds of tabs, some of which I probably won't need anymore. So I wanted to gamify cleaning them up. And for that, I wanted to preview each tab's contents in my extension by opening the tab's URL in an iframe, one tab at a time.

And I know, I'm not the only person that uses their tabs as reminders or notes, of what to look into. But at some point realize, they have a lot of tabs they don't really have an overview of anymore and need to clean them up in a fun way.

I guess I'll have to bury that idea for now.