Open therealglazou opened 3 years ago
Side note: even if <embed>
is deprecated, some sites still use it (yeah, fools...). Example: https://www2.rxnt.com/phr/
Getting the frameID for a document embedded through <embed>
is even more complicated because it's not possible to postMessage()
the inner window.
… it's even too complicated to retrieve a frameID at all since the only existing way to do that ATM is …
Another way to do this is to use the webNavigation.getAllFrames(), but this requires the webNavigation permission. Offhand I'm not sure what permission prompt other browsers display for this, but Chrome warns users that this permission allows extensions to "Read your browsing history."
Can you share a bit more about why the content script needs to know the ID of its host frame? As opposed to, say, the background script using this data.
Exposing a way for the host frame to get it's own frame ID sounds reasonable. We may also want to investigate a more ergonomic way of getting all frame IDs for a given tab than the current webNavigation approach, as that grants access to much more data than necessary to, say, properly target script injection for a given frame.
Can you share a bit more about why the content script needs to know the ID of its host frame? As opposed to, say, the background script using thi data.
We're a Password Manager. When a form contains iframes themselves containing form fields (well-known example visible at https://account.protonmail.com/signup?language=en but many banks do that too), we need to "replace" an iframe in such a form by its inner form fields when we "recognize" forms at the outermost level. Reaching the frameElement is often impossible and a direct postMessage() to the iframe's window can be unreliable if the contents of the iframe rely themselves on messages and do not accept easily third-party messages (well-known example is Office365, ahem...). So one of the only reliable things to associate an iframe element and its inner contents is the frameID. All Password Managers share the same burden.
As of today, of course, we ping/pong the background to get the frameID but that feels like a hack: in short you send a dummy message to the background to grab the originating frameID from the message and send it back. getAllFrames()
is hardly usable because of the permission constraints it adds on our webextension. Maybe getAllFrames()
should not live inside webNavigation
. And even if you can capture the frameID that way, you can capture from within the iframe, it's still not directly associated to the iframe element at the upper level...
100% agree with this suggestion, this change would make it easier to implement secure messaging between content scripts in cross domain iframe environments. It is currently very difficult with the messaging hack mentioned above being one of the ways to do it.
We face exactly the same problem at 1Password, working on our own password manager. Some sites split username and password fields in login forms across frames, and for the purposes of filling, we need a reliable way of knowing where each is located.
Additionally, some of the UI we display is added to the DOM of the top frame, but positioned based on the location of a field within an iframe. For this, we need to add the offset of the field within the frame to the offset of the frame itself in the page. Doing so requires that we identify the particular iframe we are showing UI for. Workarounds include comparing the size of an iframe with the size of the window within it. Ideally, however, this would be as simple as using the frame ID that sent us the message.
As of today, of course, we ping/pong the background to get the frameID
Dark Reader does this too.
Maybe
getAllFrames()
should not live insidewebNavigation
.
It would be great solution. Perhaps, it can live under tabs
? Also, "tabs"
permission could gate some information (like URL)like it already does Tab.url
, Tab.title
, and Tab.faviconUrl
.
I tried to design something to address all mentioned use cases, but still keep it minimal to enable quick and easy adoption by other implementations. My assumptions are:
frameId
of the current frame.frameId
for all descendant <iframes>
.frameId
for parent and ancestor frames.webNavigation
namespace.So here's the proposal:
browser.runtime.getFrameId(target: WindowProxy | HTMLIFrameElement): number
This returns the frameId
from the passed Window or container element, or for the current (caller) frame if omitted. Since the Browsing Context subtree already needs to be available in the process, it should be possible to make this a sync method in all implementations.
You can pass the <iframe>
directly if you're interested in positioning inside the page, window.parent.parent...
to get all ancestor frames, or any descendant WindowProxy
from the postMessage()
event.source
.
One potential issue is that AFAIKT this would be the first extension API that receives a DOM element as an argument. I had to work around a minor issue (for cross-origin values), and it might be a larger issue in other engines.
I've put together a quick prototype, you can test it yourself if you're able to build Firefox locally.
(That patch also includes an alternative design for getFrameInfo
I was considering, which enables enumerating all child frames, but it turned out overall more complex and less capable, and it's not clear that's a real use case).
- (probably nice to have) Get
frameId
for parent and ancestor frames.
Not only "probably nice to have" but absolutely necessary. Let me explain....
"*"
) to postMessage()
a parent frame because otherwise the messages will always trigger an origin error.postMessage()
to communicate between frames is totally unusable in WebExtensions.<frame>
or <iframe>
element to a frameId goes through finding the index of a frame's window into its window.parent.frames
array. BUT window.frames
and document.querySelectorAll('frame,iframe')
are NOT in sync, contrarily to what developer docs are saying. Frame windows seem to be added to that array at creation time while the other list is in tree traversal order.... It's then completely impossible to reliably use that path.sendMessage()
is reliable between frames inside WebExtension code, postMessage()
just cannot be used, it breaks websites; 2. it's crucial for each frame to have its frameId and its parentFrameId (communication upwards); 3. it's crucial for extension code to be able, for any arbitrary <frame>
or <iframe>
element, to get its frameId (communication downwards)This is doable but only through webNavigation
permission (that many extensions are reluctant to ask for given the scary message browsers show) or non-blocking webRequest
. In both cases, that's hacky and complicated for something that should be trivial. And we have absolutely nothing to fix the conclusion3 issue ATM.
you can test it yourself if you're able to build Firefox locally.
I forgot you can actually download builds from our CI infra for Linux, OSX or Windows -- though I only tested the last one, please ping me if something doesn't work for you.
Not only "probably nice to have" but absolutely necessary. Let me explain....
Thank you for the clarification.
I didn't notice that requirement mentioned in above use cases, but while trying to work through them, I realized it's probably important, so thanks for confirming.
browser.runtime.getFrameId(target?: WindowProxy | HTMLIFrameElement): number
That would perfectly fit our needs. Thanks a lot @zombie !!!
This works for the Safari team.
One point of feedback on the API design.
The parameter being a mixed type and potentially optional may be problematic. Consider the following example:
let frameId = chrome.runtime.getFrameId(window.opener);
The intention behind this code is to get the frameId of the window that opened this page.
However, window.opener
can be null
, in which case the API would return the current frame.
This unexpected result can potentially cause the extension to use the wrong frameId, and therefore the wrong frame/window.
This could be resolved by disambiguating the parameters, by being explicit about which frame is meant.
For example by always requiring a parameter instead of null
(i.e. getFrameId(self)
aka getFrameId(window)
to get the current frame's windowId).
Or by being even more explicit, taking a dictionary that does not have "unsafe" fallbacks (i.e. overlap between optional parameters and different results). For example:
// Get frameId of current window.
let frameId = chrome.runtime.getFrameId({
type: "self",
});
let frameId = chrome.runtime.getFrameId({
type: "windowProxy",
windowProxy: window, // or parent / opener / top / frames[x] / iframe.contentWindow / event.source, ...
});
let frameId = chrome.runtime.getFrameId({
type: "htmlElement",
htmlElement: iframe, // One of: <iframe> / <frame> / <embed> / <object>
});
I wonder if the dual type argument, WindowProxy | HTMLIFrameElement
, is really necessary. Since chrome.runtime.getFrameId(frameElement)
and chrome.runtime.getFrameId(frameElement.contentWindow)
should be strictly equivalent, maybe WindowProxy
is enough.
I wonder if the dual type argument,
WindowProxy | HTMLIFrameElement
, is really necessary. Sincechrome.runtime.getFrameId(frameElement)
andchrome.runtime.getFrameId(frameElement.contentWindow)
should be strictly equivalent, maybeWindowProxy
is enough.
As mentioned in the example in my last comment, there are more ways to embed a document other than <iframe>
. For frames (<iframe>
and <frame>
), it's indeed possible to get the window with iframe.contentWindow
. It's even possible to do the same for <object>
. For <embed>
, that's not possible. Interestingly, documents from embeds do appear in the window.frames
object.
<embed>
used to be used to embed plugins in web pages. Plugins can expose a scripting interface, so up until a few years ago, it was not really safe to expose more properties. Now that support for plugins have been dropped from browsers, it may be feasible to add properties such as contentWindow
and contentDocument
to <embed>
, which would indeed make the htmlElement
kind of parameter redundant. I filed an issue with a request to add .contentWindow
to <embed>
at https://github.com/whatwg/html/issues/7140
If that issue gets resolved, then getFrameId
only needs to take a window
.
Ok, thanks a lot @Rob--W .
For example by always requiring a parameter instead of
null
(i.e.getFrameId(self)
akagetFrameId(window)
to get the current frame's windowId).
Good callout, the <no params to get current frame's id> is a leftover from when the method only worked with EmbedderElements
, and since you suggested including WindowProxy
, it's not really needed anymore (I've edited the original proposal to make it clearer).
@dotproto Can you please check with the Chrome team if this would present implementation difficulties, or if they have any other feedback to this proposal?
This discussion sounds similar to what has been done for closed shadow roots. Closed shadow roots are meant to be inaccessible from main world, and it has been requested to allow extensions to access them. It is now available on chrome.dom.openOrClosedShadowRoot
from Chrome's side, and Element.openOrClosedShadowRoot
from Firefox's side.
I mostly agree with spec writers' rationale for not wanting to expose contentWindow
on HTMLEmbedElement
. I propose that browser venders should implement a corresponding chrome.dom.contentWindow
/ HTMLEmbedElement.contentWindow
API to allow content scripts to access content window of <embed>
nodes.
Previous proposals add unnecessary overhead to the API call just for one edge case: Making API polymorphic will likely require browser's wrapper code to introduce a type check, and requiring consumers to pass a one-off dictionary seems bad for garbage collection.
Moreover, once we have an API for accessing contentWindow
of <embed>
nodes, developers may find other good uses of it.
For completeness's sake, below is one possible API design that I just came up with.
chrome.dom.contentWindow(element:HTMLIFrameElement | HTMLFrameElement |
HTMLObjectElement | HTMLEmbedElement): Window | null
interface HTMLEmbedElement {
contentWindow: Window | null
}
For extending existing HTMLEmbedElement
, one may consider using a different property name to signal developers that this is an extension-only API.
As mentioned here: https://github.com/w3c/webextensions/issues/77#issuecomment-938053187
@dotproto created a crbug for the proposal here: https://bugs.chromium.org/p/chromium/issues/detail?id=1256872
Chiming in from the Chromium side.
I was able to bring this up with our security team, and there were a few different considerations. First off, to explain a few of our primary considerations:
First, we often try to assume that almost any renderer process can be compromised (and, it turns out, that this is pretty much true : )). Because of these, we try to limit the amount of information and capabilities we give to renderer processes. In particular, it ideally shouldn’t be possible for a compromised renderer process to access or affect other renderer processes.
Second, the boundary between content scripts and the main page is also weaker, since it’s only a JS context boundary (and not a hard process boundary). In either the case of a compromised renderer or just a “pierced” boundary between JS contexts, an attacker can potentially gain access to any power or knowledge that a content script has.
These are the reasons that the vast majority of extension APIs aren’t accessible from a content script context and that we’ve adjusted the cross-origin fetching capabilities of content scripts, and that we are generally opposed to expanding those capabilities.
In this case, though, it’s a bit of a blurred line. For one thing, I can absolutely see the benefit and desire here for extension developers - dealing with frame IDs is painful, and if there’s frame-specific logic you need in content scripts, it’s even more so (and sometimes impractical, if you need it synchronously for a script running at document_start). There's also an argument that, at least for retrieving the current frame's ID, it's not really an "extra" capability (though this is somewhat inaccurate for Chromium; discussed below). From a platform viewpoint, I’m supportive of exposing this information to ease development.
Now, though, we get into the nitty gritty. In Chromium, frame IDs (apart from the main frame, which is always 0) are currently associated with something called the Frame Tree Node ID. This a globally-unique, monotonically-increasing identifier given to a frame. If we expose this information to a compromised renderer, it can give away information about the current state of the browser, as well as being able to detect happenings in other tabs (by adding new frames and seeing the delta from the previous ones). This type of information can then help attackers mount timing attacks, among others. A second risk here is that if an attacker has control over a process and the knowledge of the frame tree node IDs within it, that attacker could try and leverage an inter-process message to the browser targeting different frame tree node IDs that are likely to exist.
It’s true that these are largely only security concerns when one (or frequently, more than one) thing already goes wrong, but we very much strive to have defense-in-depth. We’ve historically seen many vulnerabilities that chain together multiple things going wrong.
Now, for the path forward. On the Chromium side, I think many of these concerns would be assuaged if we had an unguessable identifier associated with frame IDs. I think this is an approach that we could take (I’ll investigate it), but we’d likely block runtime.getFrameId() on that being completed. Even then, there could be a slight (though significantly mitigated) concern around allowing frames to retrieve the unique identifier of other frames in the hierarchy, especially ones that might be cross-origin and/or cross-process; I’m continuing the discussion there to see if, with an extension-specific identifier here, that’s a reasonably low risk.
I’m curious for the other browsers’ implementations here - are frame IDs (apart from the main frame) already random? Or are any of these also concerns for other implementations?
Thank you Devlin for the Chromium perspective. I think your proposal for using random frame IDs is compatible with the getFrameId()
design, and something we could implement if/when we decide to (though it might not be a priority for us, see notes below for mitigating circumstances in Gecko).
Since additional exposure seems marginal (at least for Firefox), and using random IDs would be an implementation-level change transparent to extension, it doesn't look like a blocker for the proposed design. I think we can proceed with shipping this as an experimental API to get it into the hands of developers, to test it and provide any potential feedback.
A second risk here is that if an attacker has control over a process and the knowledge of the frame tree node IDs within it, that attacker could try and leverage an inter-process message to the browser targeting different frame tree node IDs that are likely to exist.
I'm somewhat confused by this. My presumption based on talking to Gecko's site isolation team is that these frame node IDs (or Browsing Context IDs I mentioned in the proposal above) would already be available in each render process, since we already need to support WindowProxy
references to each frame in the same subtree (or perhaps even the whole Browsing Context Group), including cross-origin ones. In case of a compromised render process you mention, these IDs would already be in the process's memory and available to the attacker.
Of course this could be mitigated by parent process generating a per-process mapping of frame node IDs, or maybe with something similar to what you're proposing about randomized IDs exposed to extensions, and Chromium might already be doing something along those lines, but at least Gecko isn't (perhaps partially because of the second point below).
I’m curious for the other browsers’ implementations here - are frame IDs (apart from the main frame) already random? Or are any of these also concerns for other implementations?
They are not random in Gecko either, but they're (mostly) monotonically-increasing only per-process. At least for the case you described, a compromised child process wouldn't be able to observe the creation of frames initiated from other processes.
Quick update, we landed this in Firefox 96, which is currently in beta or dev edition, and will be released on January 11th.
Not documented on MDN yet, in case we need to adjust the design based on feedback from developers or other implementations, so please try it now and report any issues or bugs.
And for clarity, since there's no documentation, this is the implemented API:
browser.runtime.getFrameId(target: WindowProxy | HTMLIFrameElement): number
Should be fairly obvious, returns the frameId for any passed Window
reference or an embedder element, and throws if passed anything else.
Quick update, we landed this in Firefox 96, which is currently in beta or dev edition, and will be released on January 11th.
Not documented on MDN yet, in case we need to adjust the design based on feedback from developers or other implementations, so please try it now and report any issues or bugs.
This is wonderful ! Thanks a lot @zombie, we'll make sure to test it ASAP.
Chrome, the ball is now in your hands. 😄
Hey @zombie! We jumped on this at 1Password and I'm happy to share that I was able to use the new API as a drop-in replacement for our existing matching logic. That involved both getting the id of a window
as well as an iframe
from the parent frame. I don't think I have any extra feedback at the moment - a huge thanks for implementing this and we look forward to it reaching other browsers.
@zombie we tested it too at Dashlane inside our browser extension and it works absolutely fine! Thanks a zillion again and we're waiting for Chrome and Safari implementations now.
@dotproto As a follow up to the previous WECG meeting, I intended to ask if Chrome's position has changed about the security implications of this proposal (and Mozilla's proposed implementation), and also in reference to devlin's comment earlier (linked). It would be nice to have this feature in all major browsers. Thanks!
+1
This would be great to see in MV3, supported across all browsers.
We need to check the allow policy set on an <iframe allow="monetization">
to determine if a request initiated in the content script of an iframe is allowed.
We are currently using webNavigation
apis and a bunch of messaging of correlation ids between content scripts and the background page to be able to get the id of an iframe. It's complicated enough that I can't recall it all offhand now a few years after it was written.
Could this be expanded to include the tab ID as well?
browser.runtime.getTarget(window) // {tabId: 1, frameId: 0}
browser.runtime.getTarget(iframe) // {tabId: 1, frameId: 63748}
It's a pretty common request to know "this" tab ID: https://stackoverflow.com/q/6202953/288906 (123 upvotes)
Currently this requires a ping to the runtime, which may or may not be "quick".
Including the recently-added documentId
would be great. There's a lot of potential here so limiting it to just a frameId
seems overly specific; returning an object makes it more flexible for future usage without having to create specific APIs.
I agree with @fregante. It would be convenient
Following up on this after the 2022-11-10 meeting. @rdcronin's last comments still reflect Chrome's current position. We still need to resolve outstanding questions with our security counterparts before we can proceed with this feature request.
Hi @dotproto
@rdcronin's last comments still reflect Chrome's current position.
still ?
@oliverdunk is the current Google representative in this group.
@Rob--W
Thanks, I saw dotproto with the most recent action here, and got it confused.
I haven't heard of any new conversation around this and I don't think anything has changed which would influence our decision. That said, I can try to bring this up again.
<frame>
or <iframe>
element, to get its frameId (communication downwards)
This is doable but only through
webNavigation
permission (that many extensions are reluctant to ask for given the scary message browsers show) or non-blockingwebRequest
. In both cases, that's hacky and complicated for something that should be trivial. And we have absolutely nothing to fix the conclusion3 issue ATM.
@therealglazou can you elaborate on how you use webNavigation(.getAllFrames?) to retrieve a frameID for any
@addeective
I did this at a previous gig and started to put some rough documentation together before I left. There may be simpler ways to do it, and it may not be all that clear, as I didn't get around to finishing it, but it should serve as a pointer in a direction that worked:
https://github.com/interledger/web-monetization-projects/pull/3687/files
Where it says:
The request handler in the background page consults the BackgroundFramesService
You can just "consult" webNavigation.getAllFrames
@sublimator I used iframe.contentwindow.postMessage() but the listener registered in the content script hosted by the iframe is either always receiving the message by some iframes, either never received by other iframes. I have verified that the listener is active at the time of the postMessage by setting the name of the target iframe window (inside that same content script) , and then logging the name of the contentWindow at the time of the postMessage. Hence my question: is there a theoretical constraint inherent to an iframe preventing the reception of the message ?
I don't recall coming across any issues like that, but it's been many years since I gave it any real attention. Other than those notes, I don't have a lot else to help you with.
It's probably not timing issues, if you are getting consistent results. Cross origin restrictions? Do you see any commonality between the pages that don't work?
What did you set as targetOrigin ?
@sublimator
What did you set as targetOrigin ? targetOrigin '*' (unfortunately I need to communicate from any origin.)
It's probably not timing issues, if you are getting consistent results. Cross origin restrictions? Do you see any commonality between the pages that don't work? I'm on a blank src commonality lead at the moment.
If the iframe has other message listeners, they could be calling stopPropagation/stopImmediatePropagation and preventing your listener from being called. To make it work reliably you would need to make sure your content script runs before anything else on the page (document_start), and use "capture" in the event listener.
Yeah, we used document_start iirc
I'm on a blank src commonality lead at the moment.
an iframe without a @src set ? curious :)
@hamax here is my manifest:
{ "matches": [ "
"all_frames": true,
"run_at": "document_start"
}
@sublimator I did not introduce the scope of the project but I'm working on a research program which wants to track ads inserted in webpages. I assume those empty urls iframes are dynamic frames built by some advertisor javascript
I was about to say, "Have you tried setting match_about_blank: true", but I see ... Shrug, try setting up a controlled context with your own blank iframe and see if you can send it a message from the dev tools?
I was about to say, "Have you tried setting match_about_blank: true", but I see ... Shrug, try setting up a controlled context with your own blank iframe and see if you can send it a message from the dev tools?
@sublimator yes I need to dig deeper thanks!
my life would be so easier if we had the chrome.runtime.getFrameId ready :)
I hear ya!
@oliverdunk any chance you can give a kick to this feature? everybody has it except chrome since quite a while...
One of the problems we faced at Dashlane is dealing with frameIDs. It's currently super tricky to associate a frameID to a given
I think there should be an easy way to retrieve a frameID for a given iframe element, from the enclosing context of that iframe element. Similarly, from the JS context inside an iframe, there should be an easy way to retrieve the current frameID without having to ping-pong the background.