w3c / webextensions

Charter and administrivia for the WebExtensions Community Group (WECG)

API for extensions to exclusion/deny list their content scripts #653

Closed Robbendebiene closed 2 months ago

Robbendebiene commented 5 months ago

Basic Use Case: Users often want an exclusion/allow list functionality to disable add-ons on certain URLs.

Goal:

Attempts:

Block the content script injection:

Approach:

scripting.registerContentScripts() seems like a good fit for this, because it provides the excludeMatches option, so no knowledge about tab URLs has to be exposed to the extension.

  1. Store the exclusions in browser.storage.
  2. Retrieve the exclusions in the background script.
  3. Inside the background script call scripting.registerContentScripts() (or scripting.updateContentScripts()) with the exclusions as excludeMatches to register the content script.
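The three steps above could be sketched roughly as follows. This is only an illustration under assumptions: the storage key "exclusions", the script id "main-content-script" and the file name "content.js" are hypothetical, and the extension namespace is passed in as `api` (browser or chrome) so the function can be exercised with a stub.

```javascript
// Hypothetical id for the dynamically registered content script.
const SCRIPT_ID = "main-content-script";

// Registers the content script everywhere except on the stored exclusions.
// `api` is the WebExtension namespace (browser or chrome).
async function syncContentScript(api) {
  // Steps 1 and 2: read the exclusion match patterns from storage
  // (the key "exclusions" is an assumption of this sketch).
  const { exclusions } = await api.storage.local.get({ exclusions: [] });

  const script = {
    id: SCRIPT_ID,
    matches: ["<all_urls>"],
    js: ["content.js"], // hypothetical file name
  };
  if (exclusions.length > 0) {
    script.excludeMatches = exclusions;
  }

  // Step 3: replace any earlier registration with the current one.
  const existing = await api.scripting.getRegisteredContentScripts({
    ids: [SCRIPT_ID],
  });
  if (existing.length > 0) {
    await api.scripting.unregisterContentScripts({ ids: [SCRIPT_ID] });
  }
  await api.scripting.registerContentScripts([script]);
  return script;
}
```

A background script would typically call this once on startup and again from a storage.onChanged listener. Tabs that are already open are not affected by the new registration.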

Problems:

Drawbacks:

Block the code inside the content script:

Approach:

  1. Register the content script via the content_scripts manifest key
  2. Store the exclusions in the browser.storage.
  3. Retrieve the exclusions in the content script.
  4. Inside the content script, check whether the tab's URL matches any exclusion, and only run the main content script code if it does not.
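A minimal sketch of the check in step 4, assuming the exclusions are stored as match patterns under a hypothetical "exclusions" key. The matcher is deliberately simplified: it only handles http/https patterns without ports and is not a full implementation of the browsers' match pattern rules.

```javascript
// Escape characters that have a special meaning in regular expressions.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Simplified match pattern test (http/https only, no ports).
function matchesPattern(pattern, url) {
  const { protocol, hostname, pathname, search } = new URL(url);
  const scheme = protocol.slice(0, -1);
  if (scheme !== "http" && scheme !== "https") return false;
  if (pattern === "<all_urls>") return true;

  const parts = /^(\*|https?):\/\/(\*|(?:\*\.)?[^\/*:]+)(\/.*)$/.exec(pattern);
  if (!parts) return false; // malformed or unsupported pattern
  const [, pScheme, pHost, pPath] = parts;

  const schemeOk = pScheme === "*" || pScheme === scheme;
  const hostOk =
    pHost === "*" ||
    (pHost.startsWith("*.")
      ? hostname === pHost.slice(2) || hostname.endsWith(pHost.slice(1))
      : hostname === pHost);
  // "*" in the path matches any run of characters.
  const pathRe = new RegExp(
    "^" + pPath.split("*").map(escapeRegExp).join(".*") + "$"
  );
  return schemeOk && hostOk && pathRe.test(pathname + search);
}

function isExcluded(url, exclusions) {
  return exclusions.some((pattern) => matchesPattern(pattern, url));
}

// Content script entry point: runs `main` unless the page is excluded.
// `api` is the WebExtension namespace; "exclusions" is a hypothetical key.
async function runUnlessExcluded(api, url, main) {
  const { exclusions } = await api.storage.local.get({ exclusions: [] });
  if (isExcluded(url, exclusions)) return false;
  main();
  return true;
}
```

In a content script this might be called as runUnlessExcluded(browser, location.href, main). Note that the script is still injected on excluded pages; it merely decides not to do anything.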

Problems:

Drawbacks:

Conclusion:

So far only the latter approach really works, but for seemingly simple functionality such as whitelisting/blacklisting it is cumbersome to implement. It also feels wrong to rely on content script code when the goal is to avoid any content script injection.

Possible solutions:

fregante commented 5 months ago

My solution for this has been this package for the past 9 years:

In short: it listens to new host permissions and then it registers the manifest scripts on these new hosts. You don't need the tabs permission for what you're asking, but only scripting and the specific host you want to inject into.
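As a rough illustration of that idea (not the package's actual implementation), the sketch below listens for newly granted host permissions and registers the manifest's content scripts on the new origins. The id scheme and the helper names are hypothetical, and `api` stands for the browser or chrome namespace.

```javascript
// Convert manifest content_scripts entries into the camelCase shape that
// scripting.registerContentScripts() expects, targeting `origins` only.
function toRegisteredScripts(manifest, origins) {
  return (manifest.content_scripts ?? []).map((entry, index) => {
    const script = {
      // Hypothetical id scheme: one registration per manifest entry and grant.
      id: `manifest-${index}@${origins.join(",")}`,
      matches: origins,
    };
    if (entry.js) script.js = entry.js;
    if (entry.css) script.css = entry.css;
    if (entry.run_at) script.runAt = entry.run_at;
    if (entry.all_frames !== undefined) script.allFrames = entry.all_frames;
    return script;
  });
}

// Register the manifest scripts whenever new host permissions are granted.
function registerOnNewHosts(api) {
  api.permissions.onAdded.addListener(async ({ origins = [] }) => {
    if (origins.length === 0) return;
    const scripts = toRegisteredScripts(api.runtime.getManifest(), origins);
    await api.scripting.registerContentScripts(scripts);
  });
}
```

This covers the allow list direction only: scripts run where the user has granted access, and a matching permissions.onRemoved listener would be needed to unregister them again.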

However this has been a pain point for me forever. I've been asking for a way to declare content scripts as "injectable on any websites" and let the browser handle permissions, registration and injection.

Extensions in Safari actually are awfully close to this if you declare a content script with *://*/* but then it bugs the user on every website because "this extension is requesting access to this website." I wish the notice was changed to "this extension can be enabled on this website"

Chrome even has UI for this already in place, but it's disabled/unclickable:

[screenshot]

fregante commented 5 months ago

Regarding your specific requests:

  • Allow scripting.registerContentScripts() to inject into existing tabs and also unloading it when it becomes excluded via excludeMatches

You can't unload JavaScript once it's been run. CSS code is already unloaded in Firefox and Safari in this case (I think not in Chrome)

  • Provide the tab url regardless of the tabs permission if the add-on has host permissions for the respective tab

That already happens. You will receive the tab URL if you have permission to access the website. See the output of chrome.tabs.query({}) in an extension that only has access to Github.com: screenshot

Robbendebiene commented 5 months ago

My solution for this has been this package for the past 9 years:

In short: it listens to new host permissions and then it registers the manifest scripts on these new hosts. You don't need the tabs permission for what you're asking, but only scripting and the specific host you want to inject into.

However this has been a pain point for me forever. I've been asking for a way to declare content scripts as "injectable on any websites" and let the browser handle permissions, registration and injection.

Thanks for your answer. That looks like a decent solution for allow list scenarios. Unfortunately, I'm more interested in the blocklist/exclusion scenario.

I would also prefer it if this could be built on the browser's host permissions, like your approach does. In fact, the idea was raised some time ago for Firefox: https://bugzilla.mozilla.org/show_bug.cgi?id=1745823 However, it is still in a pretty undefined state with multiple open questions. Enhancing scripting.registerContentScripts() therefore sounded much easier to me, also because at least one piece of functionality I'm requesting has been requested multiple times and even seems to be something add-on developers "naturally" expect from the API.

You can't unload JavaScript once it's been run. CSS code is already unloaded in Firefox and Safari in this case (I think not in Chrome)

I'm not sure to what extent this is true. At least in Firefox, if I remove/uninstall an extension, the content script seems to be removed. This may be possible due to Firefox's Xray vision, but that is beyond my knowledge.

That already happens. You will receive the tab URL if you have permission to access the website. See the output of chrome.tabs.query({}) in an extension that only has access to Github.com: screenshot

You are right. My apologies. My test case had an error.

Rob--W commented 4 months ago

As discussed in the meeting, it is infeasible to undo the injection in existing tabs. While browsers may be able to partially clean up the script, it is impossible to undo the DOM changes that a content script might have done. Cleaning up logic would therefore be the responsibility of an extension. Without this part, the most that we could read in this issue is the ability to customize excludeMatches of content scripts in manifest.json (so you aren't forced to use registerContentScripts). When interpreted in this way (which matches the title of the issue), at least Firefox and Safari are supportive of the capability (https://github.com/w3c/webextensions/labels/supportive%3A%20safari and https://github.com/w3c/webextensions/labels/supportive%3A%20firefox).

About the other parts of the request:

Browsers are currently inconsistent in whether scripts run in existing tabs. There is a feature request to control injection behavior at:

Note: even without first-class support for disabling content scripts, it may be possible to implement "skip content script functionality" if there is a way to customize parameters for content scripts (previously referred to as "globalParams"), e.g. as described at https://github.com/w3c/webextensions/issues/536#issuecomment-2200692043

fregante commented 4 months ago

I think that in practice this can be implemented in two ways:

I think the latter closely matches what has been asked here. I have two proposed APIs.

Block list (preferred)

browser.scripting.blockContentScripts({
    matches: ['https://example.com/*']
})

This would disable all content scripts on the specified matches. The request could be undone via an id attribute:

browser.scripting.unblockContentScripts({
    ids: ['previously-specified-block-id']
})

This would work as an additional "block list" over the existing content scripts.
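For comparison, a similar effect can be approximated today for dynamically registered scripts only, by appending the blocked patterns to each script's excludeMatches. This is a sketch with hypothetical helper names, not the proposed API; `api` is the browser or chrome namespace. Manifest-declared scripts are not returned by getRegisteredContentScripts(), which is exactly the gap the proposal targets.

```javascript
// Build update objects that add `blocked` patterns to every script's
// excludeMatches, without duplicating patterns that are already present.
function withBlockedMatches(scripts, blocked) {
  return scripts.map((script) => ({
    id: script.id,
    excludeMatches: [
      ...new Set([...(script.excludeMatches ?? []), ...blocked]),
    ],
  }));
}

// Apply the block list to all dynamically registered content scripts.
async function blockDynamicContentScripts(api, blocked) {
  const scripts = await api.scripting.getRegisteredContentScripts();
  const updates = withBlockedMatches(scripts, blocked);
  if (updates.length > 0) {
    await api.scripting.updateContentScripts(updates);
  }
  return updates;
}
```

For example, blockDynamicContentScripts(browser, ['https://example.com/*']) would keep every dynamic script off example.com for future page loads.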

unregisterContentScripts support for manifest scripts

  1. Allow adding an id to content scripts specified in the manifest
  2. Allow de-registration via the existing scripting.unregisterContentScripts() API

The advantage of this API is that you don't need to manage an additional "block list". The con is that creating a blocklist feature becomes quite verbose:

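// Hypothetical: assumes manifest scripts could carry an id and be unregistered.
await browser.scripting.unregisterContentScripts({
    ids: ['manifest-content-script', 'another-dynamic-content-script']
});

// Note: manifest entries use snake_case keys (exclude_matches), while
// registerContentScripts() expects camelCase (excludeMatches), so the
// manifest entries would still need converting before re-registration.
const manifest = browser.runtime.getManifest().content_scripts;
(manifest[0].exclude_matches ??= []).push('https://example.com/*');

const dynamic = getMyDynamicScript();
(dynamic[0].excludeMatches ??= []).push('https://example.com/*');

await browser.scripting.registerContentScripts([...manifest, ...dynamic]);
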
fregante commented 4 months ago

An additional solution would be for the browser to expose the block-list and let the user deal with it directly. This would work similarly to how browsers block the execution of scripts on specific pages (like the respective extension stores, etc)

For example in Safari it might extend the existing host list:

[screenshot]

Robbendebiene commented 4 months ago

Thanks for the update.

the ability to customize excludeMatches of content scripts in manifest.json (so you aren't forced to use registerContentScripts)

How about allowing the id property to be set in the manifest, like so:

  "content_scripts": [
    {
      "id": "my_special_cs",
      "matches": ["<all_urls>"],
      "run_at": "document_start",
      "js": [
        "script.js"
      ]
    }
  ],

Then allow updating it via scripting.updateContentScripts like so:

scripting.updateContentScripts([{
   id: "my_special_cs",
   excludeMatches: [...]
}]);

If the developer wants to know which sites are currently blocked:

const [script] = await scripting.getRegisteredContentScripts({
  ids: ["my_special_cs"]
});
script.excludeMatches;

I have to admit, though, that this whole concept of persistently updating the manifest content scripts is probably undesirable, as it becomes less and less clear what the manifest is actually doing after multiple calls to scripting.updateContentScripts. Also, I suppose browsers read the manifest from scratch on a fresh start and do not keep a persistent internal representation of it.

xeenon commented 4 months ago

@fregante Safari will offer to configure any open tab in a "Currently Open Websites" section, so you can block/allow sites individually. Allowing input of a pattern is likely too advanced for most users.

[screenshot]

oliverdunk commented 2 months ago

We discussed this during an in-person meeting at TPAC. There was general alignment that this is a use case we want to support and have seen developers solve in various ways. That said, there are some limitations to what we can do. For example, uninjecting a script isn't possible because there is no way to revert any changes it may have made to a page.

The main path forward we saw was adding a way to specify that you would like scripts to inject in certain conditions even if a page is already loaded. That matches the request in https://github.com/w3c/webextensions/issues/617 so we're going to continue discussion there, and I'm going to close this in favor of that issue.
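Until such an option exists, extensions commonly approximate "inject into already loaded pages" themselves. A minimal sketch, assuming the scripting permission and a hypothetical file list; `api` is the browser or chrome namespace. Injection is attempted in every open tab, and tabs the extension has no access to simply fail and are skipped.

```javascript
// Try to inject `files` into every open tab. Tabs without host access
// (or otherwise restricted pages) reject and are ignored.
// Returns the number of tabs where injection succeeded.
async function injectIntoOpenTabs(api, files) {
  const tabs = await api.tabs.query({});
  const results = await Promise.allSettled(
    tabs
      .filter((tab) => tab.id !== undefined)
      .map((tab) =>
        api.scripting.executeScript({ target: { tabId: tab.id }, files })
      )
  );
  return results.filter((result) => result.status === "fulfilled").length;
}
```

Because a tab may already contain the script, the content script itself should guard against running twice, for example by checking a flag it sets on first run.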

fregante commented 2 months ago

I’m really confused about the progression of this issue. I don’t see anyone suggesting the “unloading” of scripts.

What was suggested here was a way to disable scripts that are declared in the manifest. #617 is completely unrelated to that.

Rob--W commented 2 months ago

I’m really confused about the progression of this issue. I don’t see anyone suggesting the “unloading” of scripts.

The request to unload is part of the feature request ("and also unloading it when it becomes excluded"), and you already called out before that this is not possible. During the meeting where this was discussed, I also said that unloading is not possible.

What was suggested here was a way to disable scripts that are declared in the manifest. #617 is completely unrelated to that.

#617 relates to the remaining part of the feature request, "inject into existing tabs". The request there is for a manifest key, but any such content script specific feature would likely be ported to dynamically registered content scripts.

Note: when this issue was discussed first (https://github.com/w3c/webextensions/blob/main/_minutes/2024-07-18-wecg.md#meeting-notes), I noted that a way to support this would be to enable statically registered scripts to be updated. This idea resonated with some. If one is interested in having this idea pursued, please create a new issue focused on that specific solution/feature.