w3c / webextensions

Charter and administrivia for the WebExtensions Community Group (WECG)
Other
578 stars 50 forks source link

Proposal: RegisteredContentScript.func and RegisteredContentScript.args (similar to ScriptInjection) #536

Open hackademix opened 5 months ago

hackademix commented 5 months ago

(As discussed in #103)

In order to reliably parametrize scripts at document_start avoiding races with page-provided scripts, RegisterdContentScript needs to support two additional properties, similar to the func and args properties of the detail argument in browser.scripting.executeScript().

Therefore, in addition to the RegisteredContentScript properties already specified, we propose to add:

ghostwords commented 4 months ago

Hi @hackademix! It might help to provide a code example that shows the general pattern of how one might use func and args to pass dynamic configuration into before-anything-else happens page context (MAIN_WORLD) scripts.

Rob--W commented 4 months ago

This is a concrete proposal to address #284. Typical usage is to declaratively register a content script with the config. That is currently not possible because only files can be passed, not code nor anything else.

The func+args parameters (that the scripting.executeScript API already offers) has sufficient flexibility for dynamic code execution, and also the desired characteristic of the parameters being visible to the executed code only, even in the main world, because a function automatically has a closure with local variables that are not accessible to other code in the execution environment. That is, until the code invokes methods or getters/setters/proxy traps on the prototype, because other code in the (main) execution environment may have spoofed them.

A minor issue with the func+args option is that it requires the code to be declared as a function in the background script (or any other extension document). The "args" feature does not work with "files", because when a file executes, there is no closure with access to local variables. Because any JS file can trivially be inlined and wrapped in a function literal to pass to the "func" property, this disadvantage is not that big of a deal.

A very interesting question is how func+args should work with the persistAcrossSessions:true option. For simplicity, we could reject this combination of options, until the desired behavior (of persistence across extension updates) is agreed upon and defined.

func+args is a small improvement over the status quo. It doesn't solve the full problem of configurable main world execution, because func+args are static and don't offer a way for the main world script to communicate privately with the extension. To support that capability, it would be necessary to offer a private handle whose prototype cannot be polluted by the main world, that can communicate with the background script or (another) isolated world.

hackademix commented 4 months ago

It might help to provide a code example that shows the general pattern of how one might use func and args to pass dynamic configuration into before-anything-else happens page context (MAIN_WORLD) scripts.

The func would be the immutable code to be recycled and whose behavior changes depending upon the configuration, while args can pass different configuration data which might vary depending on the document URL (via match and excludeMatch) or on the tabId.

A very interesting question is how func+args should work with the persistAcrossSessions:true option.

Since both func and args are already required to be JSON-serializable, it seems to me that supporting the current persistAcrossSessions semantics no matter if js, func or both are defined would not be too complicated, would it?

func+args is a small improvement over the status quo. It doesn't solve the full problem of configurable main world execution

Even if not perfect, maybe a dynamic per-session "secret" shared with ISOLATED_WORLD via func+args as well might help in inter-world messaging, like hinted in this example:

const mainWorldFunc= (secretEventSuffix, configObj) => {
  // 1. CustomEvent configuration: listening of `isolated${secretEventSuffix}`
  //     and  dispatching of  `main${secretEventSuffix}`
  // 2. Do DOM / JS environment stuff based on configObj
}
const isolatedWorldFunc = (secretEventSuffix, configObj) => {
  // 1. CustomEvent configuration;  listening of `main${secretEventSuffix}`
  //     and  dispatching of  `isolated${secretEventSuffix}`
  // 2. Do semiprivileged ISOLATED_WORLD extension stuff based on configObj
}

const scriptTemplate = {
  run_at: "document_start",
  allFrames: true,
  matchOriginAsFallback: true,
  excludedTabIds: await getDisabledTabs(),
  persistAcrosSessions: false,
};

// create or retrieve from session storage 

const secretEventSuffix = await createOrRetrieveSecretSuffix();
const settings = await getSiteSpecificSettings();
const scripts = settings.map(({settingsId, match, configObj}) => {
  let args = [secretEventSuffix, configObj];
  return [
   { world: 'MAIN', func: mainWorldFunc, },
   { world: 'ISOLATED', func: isolatedWorldFunc, }, 
 ].map(script => Object.assign(script, scriptTemplate, {id: `${script.world}_${settingsId}`, args,}))  
}).flat();

await browser.scripting.registerContentScripts(scripts);
Rob--W commented 4 months ago

A very interesting question is how func+args should work with the persistAcrossSessions:true option.

Since both func and args are already required to be JSON-serializable, it seems to me that supporting the current persistAcrossSessions semantics no matter if js, func or both are defined would not be too complicated, would it?

code was dropped fro the tabs API in MV3 / scripting API replacement because it would enable execution of arbitrary code. If we were to persist func across sessions, then unrelated code from a previous extension update can carry over to a next update. That makes it harder if not impossible to rule out unwanted behavior.

I therefore recommend to disallow persistAcrossSessions: true, or at least clear it on extension updates.

hackademix commented 4 months ago

unrelated code from a previous extension update can carry over to a next update. I therefore recommend to disallow persistAcrossSessions: true, or at least clear it on extension updates.

I understand and support the reasoning. Obviously I would prefer the persistence to be still guaranteed across browser restarts without updates, which would suffice to address this concern as you suggest.

Related and more general question: are there any guarantees that scripts dynamically registered from the service worker's top level at browser startup (or extension updates) - and blocking webRequest listeners in Firefox - are ready to reliably run without race conditions when the first web content page loads?

oliverdunk commented 1 day ago

I spoke to @rdcronin about this. We have previously discussed having a way to set parameters that can be accessed synchronously from a content script, e.g:

// In your service worker
chrome.scripting.setParams({ tabId, data: { message: "Hello World" } })`;

// In your content script
console.log(chrome.params.message);

There was also some early work that was never finished to expose a globalParams object (here). This proposal is definitely different to those but there is certainly some overlap in the problem space.

This proposal has some notable benefits, including that it would work in the main world. Anything based on globals wouldn't since we wouldn't expose the params on any globals in the main world.

There are also some drawbacks - for example to set different args for each tab or origin, you would need to register one script for each possible configuration. There are some workarounds here, e.g for origin based configuration you could store it all in params and do a lookup based on window.location. That's not ideal though.

On the Chrome side, we'd rather not implement two ways of doing the same thing, and ultimately think a way of registering params separate to the scripts themselves would be better. Given that we don't plan to implement this, and in fact would be somewhat against it being a part of the platform (given we hope there will be another, preferred way to do it in the future), I'm going to mark us as opposed to this proposal.

I do want to say that this does not in any way indicate we don't want to solve the problem. In fact, given how often this comes up, I personally want to see if we can build a good case for prioritizing the implementation of something that addresses these needs.