Open kzar opened 4 months ago
This is tricky to get right, since as soon as the page's scripts have run, you can no longer trust much. You end up having to keep references to any API you might use in the future, in case they are messed with later
This is most likely a problem better addressed by https://github.com/tc39/proposal-get-intrinsic. With ShadowRealm you might be able to capture the original ShadowRealm constructor if your script runs first, but any interaction across realms still requires code executing on each side, and the ShadowRealm's global only has very limited support for most Web APIs.
This is an ongoing issue, especially since websites can sometimes use tricks to create an iframe that some browsers won't run the content script for.
You might be interested in https://github.com/WICG/proposals/issues/144
I am also interested in similar "init scripts" mechanisms but at the TC39 level.
- Browser extension content scripts can be run for ShadowRealms
I would very much hope that extensions do not gain that capability (the same way they don't for Workers, afaik).
- Methods can't be passed out of the ShadowRealm to the parent for use by the parent.
You can pass functions, which get wrapped at the callable boundary; the boundary only allows primitives and other functions as arguments and return values. There is no way to "apply" these functions to the current realm, but they can be called as normal, effectively allowing the use of any capability exposed by the ShadowRealm.
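A minimal sketch of the callable boundary described above, assuming an engine that already implements the ShadowRealm proposal. Most engines do not ship it yet, so the helper (whose name `demoCallableBoundary` is made up for illustration) returns null when the global is missing.

```javascript
// Hypothetical helper; returns null where ShadowRealm is not implemented.
function demoCallableBoundary() {
  if (typeof ShadowRealm !== "function") return null;

  const realm = new ShadowRealm();

  // A function coming out of the realm arrives as a wrapped function.
  const add = realm.evaluate("(a, b) => a + b");
  const sum = add(2, 3); // primitives pass through in both directions

  // Functions can be passed in too; they are wrapped on the way across.
  const callWith = realm.evaluate("(fn, x) => fn(x)");
  const doubled = callWith((n) => n * 2, 21);

  // Plain objects cannot cross: the call is expected to throw
  // (a TypeError, per the proposal).
  let objectRejected = false;
  try {
    add({}, 1);
  } catch (err) {
    objectRejected = true;
  }

  return { sum, doubled, objectRejected };
}
```

The wrapped functions are called like any other function, but they always run against the ShadowRealm's own globals, which is why they cannot be used to reach unwrapped APIs of the calling realm.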
2. ShadowRealms can't directly do things like open WebSocket connections or read/write cookies.
These Web APIs are currently excluded from being exposed in ShadowRealms
such extensions will be stuck trying to wrap the ShadowRealm API as well,
That honestly may be your safest bet here. The biggest problem you'll encounter is how to synchronously evaluate some code inside the realm to "repair" the ShadowRealm global. Unfortunately the current evaluate too easily trips up CSP rules.
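As a rough sketch of what wrapping the constructor could look like: the helper below returns a replacement constructor that evaluates some repair code in every realm it creates. The names `wrapRealmConstructor` and `repairSource` are hypothetical, and the sketch assumes the original constructor was captured before any page script ran.

```javascript
// Hypothetical helper: every realm created through the returned constructor
// gets `repairSource` evaluated in it before the caller sees the realm.
function wrapRealmConstructor(OriginalRealm, repairSource) {
  // Stash what the wrapper needs; the page may replace these globals later.
  const { construct, apply } = Reflect;
  const originalEvaluate = OriginalRealm.prototype.evaluate;

  const wrapped = new Proxy(OriginalRealm, {
    construct(target, args, newTarget) {
      const realm = construct(target, args, newTarget);
      // Synchronous repair. This is the step a strict CSP
      // (one without 'unsafe-eval') is likely to block.
      apply(originalEvaluate, realm, [repairSource]);
      return realm;
    },
  });

  // Otherwise `realm.constructor` would hand back the unwrapped original.
  OriginalRealm.prototype.constructor = wrapped;
  return wrapped;
}
```

In an engine that has ShadowRealm this would be installed with something like `globalThis.ShadowRealm = wrapRealmConstructor(ShadowRealm, source)`. The repair code itself would have to wrap the ShadowRealm constructor inside the new realm as well, since realms can create further realms.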
Thanks, that's very helpful. It sounds like there's not anything that needs to be adjusted in ShadowRealm for the use cases I was thinking about here.
Thanks for the links as well, Reflect.getIntrinsic in particular sounds interesting. Getting off topic here slightly, but do you know if the Reflect.getIntrinsic API itself could be wrapped by a page/extension script? FWIW I'd prefer that it could be, so that the extension can stash a reference but then wrap it as necessary to avoid circumvention.
Any first-run script that wants to virtualize / repair an environment would need to wrap getIntrinsic. One thing we'll need to look into is the difficulty that ShadowRealm creates in applying repairs like that in any new realm. Transitively wrapping the ShadowRealm constructor is tedious and, as I mentioned, raises problems with CSP.
@kzar, note that running a content script inside a same-origin iframe won't help you because the embedder has direct synchronous access to the iframe's contentWindow right after the element is added to the DOM and before the content scripts run inside at document_start. This eternal architectural bug affects all browsers, AFAIK, and was recently re-reported for Chrome in https://crbug.com/40202434. Resisting it is hard, especially now that synchronous mutation events are disabled in Chrome. You'd have to patch the HTMLIFrameElement.prototype.contentWindow/contentDocument getters to handle the case of an iframe inside a closed shadow DOM that doesn't expose the child window as window[0], but these getters won't help with light DOM as the page can just read window[0] (it's not using a getter), so I guess you'd have to also patch all DOM prototype methods that can add an iframe, like appendChild, append, prepend, before, after, insertAdjacentElement/HTML and the innerHTML/outerHTML setters.
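To make the getter patching concrete, here is a minimal sketch. `wrapGetter` and `injectWrappers` are hypothetical names, and the browser-specific part is guarded so the snippet also loads outside a page. As noted above, this only covers access through the getters, not window[0].

```javascript
// Hypothetical helper: replace an accessor's getter so that `onAccess`
// sees every non-null value before the caller does.
function wrapGetter(proto, prop, onAccess) {
  const desc = Object.getOwnPropertyDescriptor(proto, prop);
  if (!desc || typeof desc.get !== "function") return false;

  const originalGet = desc.get;
  const apply = Reflect.apply; // stashed in case the page replaces it later

  Object.defineProperty(proto, prop, {
    ...desc,
    get() {
      const value = apply(originalGet, this, []);
      if (value != null) onAccess(value);
      return value;
    },
  });
  return true;
}

// Browser-only usage. `injectWrappers` stands in for the extension's own
// hardening code and is left empty here.
if (typeof HTMLIFrameElement === "function") {
  const injectWrappers = (childWindowOrDocument) => {
    /* patch the APIs of the child window / document here */
  };
  wrapGetter(HTMLIFrameElement.prototype, "contentWindow", injectWrappers);
  wrapGetter(HTMLIFrameElement.prototype, "contentDocument", injectWrappers);
}
```

The same pattern extends to the DOM insertion methods listed above, except that those are data properties holding functions, so they would be wrapped by replacing the function rather than the getter.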
Yea, the contentWindow/contentDocument issue is a pain for extensions as well because websites sometimes abuse it to access unwrapped APIs. In the Adblock Plus injected wrapping code we took care to also wrap contentWindow/contentDocument, so that our injected code could recursively inject itself into frames as contentWindow/contentDocument were accessed. There were likely more ways around it that we missed though!
BTW, in addition to all the DOM methods I listed above, you'll also need to spoof window.open for same-origin documents, because the creator can get the original stuff directly from the new window object and save it inside that window to be used later inside and/or outside.
Really interesting you all mention these, because that's exactly what we've been attempting to address with the Snow project - tap into all the JS APIs that can grant you access to new same origin realms (iframes, tabs, etc) and tame them to your liking.
The easiest way to get a sense of what I mean is to visit the Snow demo app - give it a go.
I'm telling you this because after working on Snow for over 2 years, we arrived at the conclusion that implementing a solution to this virtually is somewhere between super hard and impossible (check out some of the open issues against the Snow project), and instead we are now trying to convince the industry to make this a builtin solution in the browser.
I'd hate to take this discussion in the wrong direction, so let me just mention that if you care about this problem too, feel free to participate in https://github.com/WICG/proposals/issues/144 - the more hands we get, the more likely the community will adopt this effort!
By reading all comments starting with https://github.com/tc39/proposal-shadowrealm/issues/406#issuecomment-2155407656 - I'm pretty sure https://github.com/WICG/proposals/issues/144 will address your concerns.
Thanks, you're right, the RIC proposal does look interesting. I've left a quick comment about this use-case there.
The ShadowRealm API is new to me, sorry if any of these points are obvious or wrong. But I thought I'd write down my thoughts from the perspective of an extension developer in case it helps. I've been thinking about both how I might use the API from an ad blocker or privacy protecting extension, but also how websites might use the ShadowRealm API to circumvent such extensions.
Using the API
Sometimes these extensions need to run a content script in the page, before the page scripts run, in order to wrap troublesome APIs in an attempt to stop the website doing something. As an example, Adblock Plus needed to wrap the WebSocket API like this when it was new, before the chrome.webRequest API supported blocking such requests directly. This is tricky to get right, since as soon as the page's scripts have run, you can no longer trust much. You end up having to keep references to any API you might use in the future, in case they are messed with later. You even have to consider methods that might be implicitly called by your code. As an example, check out the old WebRTC wrapping code we wrote for Adblock Plus, to prevent websites using the API to load ads.
Perhaps ShadowRealm could help with this kind of situation? If we could create a ShadowRealm at the start of our content script, perhaps most of the logic could go in there and only the messaging and code exposed to the page would need to be hardened? For this to be much use I think we'd often need a way to synchronously communicate between the ShadowRealm we created and the page.
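The wrapping approach described above can be sketched roughly as follows. This is not the actual Adblock Plus code; `wrapSocketConstructor` and `isBlocked` are hypothetical names, and the point is only to show references being stashed before the page can tamper with them.

```javascript
// Hypothetical helper: wrap a WebSocket-like constructor so that
// `isBlocked(url)` is consulted before any connection is created.
function wrapSocketConstructor(Original, isBlocked) {
  // Stash every global the wrapper will need later; the page may
  // replace them after this script has run.
  const construct = Reflect.construct;
  const StashedString = String;
  const StashedError = Error;

  const wrapped = new Proxy(Original, {
    construct(target, args, newTarget) {
      const url = StashedString(args[0]);
      if (isBlocked(url)) {
        // The error type and message are arbitrary choices for this sketch.
        throw new StashedError("connection blocked");
      }
      return construct(target, args, newTarget);
    },
  });

  // Otherwise `socket.constructor` would hand back the unwrapped original.
  Original.prototype.constructor = wrapped;
  return wrapped;
}
```

In a page this would be installed with something like `window.WebSocket = wrapSocketConstructor(window.WebSocket, isBlocked)`. Note that `isBlocked` itself has to avoid implicitly calling anything the page could have replaced, such as String.prototype methods, which is exactly the difficulty described above.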
Websites abusing the API
For privacy protecting and ad blocking extensions to be effective, we need our content script to run for all frames. Otherwise the page can make use of unwrapped APIs (e.g. for fingerprinting the user) by creating an iframe and then using the API from there. Sometimes websites will also pass a prototype's method back out from an iframe if they suspect we wrapped it, so that they can use it from the parent to try and get access to something in the parent. This is an ongoing issue, especially since websites can sometimes use tricks to create an iframe that some browsers won't run the content script for.
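To prevent websites using these tricks from ShadowRealms I would hope that:
- Browser extension content scripts can be run for ShadowRealms
OR
- Methods can't be passed out of the ShadowRealm to the parent for use by the parent.
- ShadowRealms can't directly do things like open WebSocket connections or read/write cookies.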
Otherwise, such extensions will be stuck trying to wrap the ShadowRealm API as well, which is probably bad news for everyone involved 😅.
Hope that helps and shout if I can clarify anything! Dave