w3c / webextensions

Charter and administrivia for the WebExtensions Community Group (WECG)

User scripts in Manifest V3 #279

Open dotproto opened 1 year ago

dotproto commented 1 year ago

PLACEHOLDER: We have a couple of issues related to user scripts support in Manifest V3, but nothing that currently feels like a good tracking issue. This issue intends to fill that gap by consolidating discussion of Manifest V3 API considerations in one place.

Rob--W commented 1 year ago

@dotproto (on behalf of Google / Chrome) announced the intent to support user script managers in the WECG meeting at TPAC (#232). The meeting notes have been merged in #277 and this topic can be found under "User script manager support in MV3" at https://github.com/w3c/webextensions/blob/main/_minutes/2022-09-15-wecg-tpac.md

I (on behalf of Mozilla / Firefox) previously stated that we'd like to support user script managers, ideally in a more secure way, at https://blog.mozilla.org/addons/2022/05/18/manifest-v3-in-firefox-recap-next-steps/#comment-227473.

EDIT: The API design is ongoing. Significant updates are listed here:

Rob--W commented 1 year ago

To recap from https://github.com/w3c/webextensions/blob/main/_minutes/2022-09-15-wecg-tpac.md, the proposed API is:

My concern with this proposal is that it re-introduces the ability to execute arbitrary code in a (somewhat) privileged execution context, against all efforts to remove this from MV3. In practice, given the absence of anything better, user script managers already execute (third party) user script code in a content script, which shares the extension APIs and some privileges with extensions. This is a suboptimal design.

I propose the following API to support user script managers, which is more secure by design:

Not security-related, but still relevant to consider in the API design:

jonathanKingston commented 1 year ago

My concern with this proposal is that it re-introduces the ability to execute arbitrary code in a (somewhat) privileged execution context, against all efforts to remove this from MV3.

I agree with this concern.

chrome.scripting.globalParams was added to solve similar use cases; however, it is never exposed to main-world scripts. I'd instead propose a way to pass these variables into the script (perhaps cloned so that modification isn't possible). Some avenues to inject them would be:

  1. document.currentScript.params
  2. import.meta.params
  3. Inject a script-local variable prior to execution of the script. (I believe all JS engines can easily register script-local properties programmatically.)

I'd not expose these on globalThis/window as ideally the page would have no visibility of these parameters.
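A minimal sketch of option 3, assuming parameters are JSON-serializable. `runWithLocalParams` is a hypothetical helper, not part of any proposal; a browser would do this natively, and `new Function` is used here only to illustrate the scoping (it would itself be blocked by a CSP without 'unsafe-eval').

```javascript
// Hypothetical helper: run `code` with `params` visible only as a local variable.
function runWithLocalParams(code, params) {
  // Deep-copy via JSON so the script never holds a reference to the caller's object,
  // then freeze the top level of the copy (nested objects stay mutable, but are copies).
  const copy = Object.freeze(JSON.parse(JSON.stringify(params)));
  // `new Function` compiles `code` as a function body whose only parameter is `params`.
  const fn = new Function('params', '"use strict";\n' + code);
  return fn(copy);
}

const original = { token: 'abc', nested: { n: 1 } };
// The script sees `params` locally, while nothing is added to globalThis.
const result = runWithLocalParams(
  'return params.token + ":" + typeof globalThis.params;',
  original
);
```

Because the value is passed as a function argument, the page has no global name to probe for.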

derjanb commented 1 year ago

@dotproto @Rob--W thanks for addressing this now.

Require explicit user consent to allow extensions to use this feature

Will this be automatically set to on for all existing Manifest v2 installations on update to the mv3 version?

Introduce a new world, "USERSCRIPT"

This sounds promising and will make securing a lot of functions and prototypes that can be altered by the page obsolete.

The USERSCRIPT is an isolated world (in the WebKit / Chromium sense) that shares the DOM with the web page. This means that code from both worlds cannot directly interact with each other, except through DOM APIs.

This is a problem. I'd say more than half of all userscripts directly interact with the page's JavaScript objects and functions.

Tampermonkey at Firefox (BETA for now - the stable release will follow soon) for example uses Firefox's userScripts API which creates a context separated from the page (and the extension), but still allows access to the main world (unsafeWindow).

When multiple scripts match and have the same runAt schedule, scripts targeting ISOLATED are executed before USERSCRIPT.

:+1:

Open question: Which CSP to choose for this USERSCRIPT world?

Userscripts should not be limited by a "foreign" CSP (extension or main world). They are made to do things that neither Tampermonkey nor the page expect. Ideally we could add a csp property in addition to the code property, and Tampermonkey could either introduce a @csp userscript header that is forwarded to scripting.registerContentScripts or use existing tags to create a CSP specific to this userscript.
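A rough sketch of how a manager might forward such a header, assuming the usual `// ==UserScript==` metadata block. Both the `@csp` header and the `csp` registration property are hypothetical (they are exactly what is being suggested here, not something any browser supports), and `parseMetadata` / `toRegistration` are illustrative names.

```javascript
// Parse the `// ==UserScript==` metadata block into { key: [values] }.
function parseMetadata(source) {
  const block = /\/\/ ==UserScript==([\s\S]*?)\/\/ ==\/UserScript==/.exec(source);
  const meta = Object.create(null); // no inherited keys such as "constructor"
  if (!block) return meta;
  for (const line of block[1].split('\n')) {
    const m = /^\s*\/\/\s*@(\S+)\s*(.*?)\s*$/.exec(line);
    if (!m) continue;
    if (!meta[m[1]]) meta[m[1]] = [];
    meta[m[1]].push(m[2]);
  }
  return meta;
}

// Build a registration object; `csp` is the HYPOTHETICAL property discussed above.
function toRegistration(id, source) {
  const meta = parseMetadata(source);
  return {
    id,
    matches: meta.match || [],
    code: source,
    csp: meta.csp ? meta.csp[0] : undefined,
  };
}

const sample = [
  '// ==UserScript==',
  '// @name  Demo',
  '// @match https://example.com/*',
  "// @csp   script-src 'self'",
  '// ==/UserScript==',
  'console.log("hi");',
].join('\n');
const registration = toRegistration('demo', sample);
```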

Sxderp commented 1 year ago

I second @derjanb concern with limiting interaction via DOM APIs. This limits what can be done in terms of userscripts. Being able to mess with the webpage JS is a key point.

I also liked how Firefox implemented the userScript API. A separate context was created but you could inject additional functions into it via a callback. If the communications between USERSCRIPT and ISOLATED is also going to be limited to DOM APIs isn't that also risky? Under this, I'm assuming that extension specific APIs are not available. How can a pre-shared secret between the userscript manager and userscript be used? The secret would need to be shared to all userscript authors? Maybe I'm misunderstanding but the more I look at this the more I think "can't webpages hi-jack this?"

Illya9999 commented 1 year ago

@Sxderp

can't webpages hi-jack this?

Yes. Webpages can run code before userscript managers are able to, by using iframes or popup windows. This has been used maliciously in the past and is now partially fixed in some userscript managers; however, there is no complete fix. The issue of webpages being able to execute code before extensions was reported to Chromium some time ago and was marked as wontfix, IIRC.

muzuiget commented 1 year ago

Currently we can execute arbitrary code in the MAIN world on Chrome 107 with this code:

// background.js
chrome.scripting.registerContentScripts([
    {
        id: '0',
        matches: [
            'https://www.example.com/*',
        ],
        js: [
            'main-world-script.js',
        ],
        world: 'MAIN',
        runAt: 'document_idle',
    },
]);
// main-world-script.js
window.addEventListener('message', (event) => {
    // message from ISOLATED world content scripts
    const message = event.data;
    if (message.name === 'user-script') {
        const script = document.createElement('script');
        script.textContent = message.text;
        document.head.appendChild(script);
        script.remove();
        return;
    }
});
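For completeness, a sketch of what the sending side in the ISOLATED world might look like. `sendUserScript` is a hypothetical helper; the window is passed in as a parameter only so the function can be exercised outside a browser. Note that the listener above does not authenticate the sender, so any page script could post the same message.

```javascript
// content-script.js (ISOLATED world): hypothetical counterpart to main-world-script.js.
function sendUserScript(targetWindow, text) {
  const message = { name: 'user-script', text };
  // Restrict delivery to the page's own origin rather than using '*'.
  targetWindow.postMessage(message, targetWindow.location.origin);
  return message;
}

// In a real content script this would be called as:
//   sendUserScript(window, 'console.log("hello from a user script")');
```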

Most user scripts do not need to be as complex as Greasemonkey-like scripts:

It is just like users executing scripts in the DevTools "Console" or "Sources -> Snippets", but using an extension to execute the scripts automatically based on the page URL.

We can create a lite version user script manager with Manifest V3 now.

But does the Chrome Store policy allow this usage?

EmiliaPaz commented 1 year ago

Hi everyone! My name is Emilia Paz and I am part of the Chrome Extensions team. I'll be working on providing user script support in the Scripting API for MV3, as discussed in this thread.

I started an API proposal doc that takes into account the points discussed here and in offline conversations. The goal is to align on an API design that works better for everyone. Please, feel free to comment on the doc or in this issue. Also, I'll be attending the next WECG meetings for further discussion.

Looking forward to working together

BlackGlory commented 1 year ago

Does the current proposal imply that a user script will only run on a page? What if a user script needs to run in a sandbox unrelated to a page?

tophf commented 1 year ago

I started an API proposal doc

Methods that contain Content are already misleading because they also deal with main-world scripts that aren't content scripts by definition, but page scripts: #313. If you want to separate methods by areas of concern then there should be Page methods e.g. registerPageScripts and so on, thus obviating the need for world parameter in Content and Page methods.

tophf commented 1 year ago

The main problem is that this API proposal is currently unusable in Violentmonkey or Tampermonkey because it's completely insecure due to https://crbug.com/1261964, to circumvent which these extensions currently have to extract the safe methods from a temporary iframe before running the main world scripts. It means that the main world script cannot call any standard method like addEventListener until the extracted methods are passed securely from this iframe in a manner that cannot be intercepted/spoofed by another main-world script.

Unless this bug is fixed or a secure communication method is provided, these extensions will become wide-open backdoors and I will suggest their authors to remove them from the web store. The method must be synchronous, by the way.

tophf commented 1 year ago

Another nasty problem is that the API proposal doesn't support @include and @exclude which are glob-based by default but can be RegExp as well, which is used by tons of userscripts and there's no good alternative for RegExp. With the current API shape any userscript with a RegExp include/exclude that runs at document_start will have to run on all sites and then the extension's content script will perform the check, which is extremely wasteful (due to the need to inject the script into every web page + the lack of code compilation cache for it) and can take a long time depending on the regexp.

brunoais commented 1 year ago

@tophf: I made some suggestion changes to the document. They haven't been approved but I think it's worthwhile for you to take a look at them and check if they suffice.

EmiliaPaz commented 1 year ago

Does the current proposal imply that a user script will only run on a page? What if a user script needs to run in a sandbox unrelated to a page?

User scripts will only be allowed to execute in the MAIN world (page) or a new USER_SCRIPT world.

Sxderp commented 1 year ago

Is there some way to provide additional functionality (GM.xhr and friends) to a user script? For example, the Firefox userScripts.onBeforeScript lets you run a callback to export functions into the execution environment. The current proposal has no such thing, so the best alternative is to concatenate some fixed string of functions to the "code" argument?

brunoais commented 1 year ago

Is there some way to provide additional functionality (GM.xhr and friends) to a user script?

@Sxderp It's there, just without much fanfare:

scripts targeting ISOLATED are executed before USERSCRIPT. This allows the extension to prepare the execution environment before the user script code is executed.

This is in the "Other considerations" subsection.

hanguokai commented 1 year ago

I'll be attending the next WECG meetings for further discussion.

I would suggest holding a separate topic-specific meeting rather than discussing it in a regular meeting, with browser vendors and script manager maintainers invited to participate. (PS: I am not a script manager author.)

Having different types and methods than content scripts is beneficial in the following ways: ……

I recommend using a completely different namespace (e.g. chrome.userscripts.register()) so that they are on completely different documentation pages and developers don't get confused.

My other thoughts

My understanding is that this proposal still allows arbitrary local and remote code to be executed in the main or userscript world of web pages; it just doesn't allow that code to use extension APIs.

Besides extension and theme, can the web store support a third type of user script? All user scripts are submitted to the store and reviewed? Thus, user scripts are trustworthy and deterministic like extensions.

Compared to user scripts, I think content scripts should also allow arbitrary code execution, because when websites change their web pages' structure/code/style, content scripts also need to be updated quickly to adapt to changes. (Usually, extension developers do not own the site.)

WaqasIbrahim commented 1 year ago

Besides extension and theme, can the web store support a third type of user script? All user scripts are submitted to the store and reviewed? Thus, user scripts are trustworthy and deterministic like extensions.

Not a good idea. One of the main benefits of userscripts is less oversight, and you can self-host them so updates or changes don't require long review processes. Also, the most-used userscripts that deal with YouTube, Spotify, etc. will never be allowed on the Chrome store.

Compared to user scripts, I think content scripts should also allow arbitrary code execution, because when websites change their web pages' structure/code/style, content scripts also need to be updated quickly to adapt to changes. (Usually, extension developers do not own the site.)

This is not necessary; there are ways to observe changes on the page, for example mutation observers.

7nik commented 1 year ago

What I understand is this proposal still allows arbitrary local and remote code to be executed in the main or userscript world of web pages, just don't allow to use extension API.

The extension needs an additional permission for it. Thus the market can warn users that it's a potentially dangerous extension. Also, reviewers will check whether it's a valid usage of userscripts; e.g. a weather app doesn't need it, and it will be very suspicious if it asks for it.

Besides extension and theme, can the web store support a third type of user script?

No. Sometimes you need a US only for yourself, and sometimes only for a short period of time. Also, if a US is needed by only a very limited number of people (<10), there is no reason to "publish" it. Also, there are many times more US than extensions, possibly dozens of times more, and Google won't be glad to spend money on reviewing every single US.

Compared to user scripts, I think content scripts should also allow arbitrary code execution.

Enforcing security is one of the main goals in the MV3, and forbidding arbitrary code execution is one of the main ways to achieve it.

Because when websites change their web pages' structure/code/style, content scripts also need to be updated quickly to adapt to changes.

Sites do not roll out changes every single day; most of them even prefer to avoid this (it costs money and usually draws angry feedback that things are no longer what users are used to, unless it's bug fixing). Also, you can empirically find general and fuzzy approaches that are more or less resistant to minor changes in the page layout.

hanguokai commented 1 year ago

No. Sometimes you need a US only for yourself, and sometimes only for a short period of time. Also, if a US is needed by only a very limited number of people (<10), there is no reason to "publish" it.

I understand. In my previous comment, I said "a script manager can support running both trusted and untrusted code, but give different user warnings". In other words, whether or not a script is published to the store, users can run it.

Also, there are many times more US than extensions, possibly dozens of times more, and Google won't be glad to spend money on reviewing every single US.

This question is best answered by the extension stores. I don't know what they think.

Enforcing security is one of the main goals in the MV3, and forbidding arbitrary code execution is one of the main ways to achieve it.

In this way, I think user scripts should be subject to the same restrictions as content scripts.

Sites do not roll out changes every single day; most of them even prefer to avoid this……

I often see extension developers complaining that a website update caused its code to go wrong, and they need to update the extension immediately.

EmiliaPaz commented 1 year ago

@tophf

Methods that contain Content are already misleading because they also deal with main-world scripts that aren't content scripts by definition, but page scripts: #313. If you want to separate methods by areas of concern then there should be Page methods e.g. registerPageScripts and so on, thus obviating the need for world parameter in Content and Page methods.

First off, chrome.scripting.getRegisteredContentScripts should have been chrome.scripting.getRegisteredUserScripts. My bad, fixed it.

As for the separate methods, correct me if I am wrong: you are suggesting either having separate methods for user scripts, content scripts and page scripts (basically dividing content and page scripts), or changing the current content script methods (issue #313). Content and page scripts behave similarly, have the same permissions enforcement and, most importantly, don't support remote host code. Thus they can be grouped under the same methods (as they are currently) with different worlds. User scripts have to be clearly differentiated for better documentation and stronger enforcement against remote host code abuse; hence the introduction of new methods. However, as @hanguokai points out, it may be better to have a new namespace.

As for your rename suggestion, I see the issue is listed as an Agenda discussion for public meeting on 2022-11-10 so better to follow up there.

The main problem is that this API proposal is currently unusable in Violentmonkey or Tampermonkey because it's completely insecure due to https://crbug.com/1261964, to circumvent which these extensions currently have to extract the safe methods from a temporary iframe before running the main world scripts. It means that the main world script cannot call any standard method like addEventListener until the extracted methods are passed securely from this iframe in a manner that cannot be intercepted/spoofed by another main-world script.

Unless this bug is fixed or a secure communication method is provided, these extensions will become wide-open backdoors and I will suggest their authors to remove them from the web store. The method must be synchronous, by the way.

I was not aware of this issue. I'll contact the OWNER and see what the status is. Thanks for pointing it out.

Another nasty problem is that the API proposal doesn't support @include and @exclude which are glob-based by default but can be RegExp as well, which is used by tons of userscripts and there's no good alternative for RegExp. With the current API shape any userscript with a RegExp include/exclude that runs at document_start will have to run on all sites and then the extension's content script will perform the check, which is extremely wasteful (due to the need to inject the script into every web page + the lack of code compilation cache for it) and can take a long time depending on the regexp.

How are RegExp currently supported in MV2? I believe chrome.tabs.executeScript doesn't support @include and @exclude. If they are not currently supported, they may not be included in v1 of the API, since we are trying to achieve functional parity with MV2. However, do you have any suggestions on how this can be achieved?

Sorry for the late response, everyone. I am familiarizing myself with user scripts and user script managers to provide better answers. Will continue shortly.

tophf commented 1 year ago

Content and page scripts behave similarly, have the same permissions enforcement and most importantly they don't support remote host code. Thus they can be grouped under the same methods (as they are currently) with different worlds

As I pointed out in #313 they are so different that you could say they're "worlds apart", pun intended, so conflating them is a very dangerous path that misleads developers into believing their data and code are safe from interception/spoofing/poisoning.

How are RegExp currently supported in MV2?

There's no support at all. The closest thing is RE2 syntax in filtered events of declarativeContent or webNavigation API.

Currently userscript managers explicitly use RegExp matching in JS. It doesn't cost much in ManifestV2 because the regexps are compiled just once in the persistent background script. Doing the same in a ManifestV3 service worker will be very costly due to the need to recompile regexps every time the SW starts for a new URL navigation, and it won't guarantee document_start timing.

I understand that Chrome won't add full RegExp to its API, so the next best thing is implementing RE2-compatible matchesRegex and excludeMatchesRegex and a method to test compatibility similar to chrome.declarativeNetRequest.isRegexSupported, so that userscript managers can simplify the userscript's regex for the API and then narrow it down inside the content script using the full RegExp check, thus reducing the wasteful injections.
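A sketch of the second stage, i.e. the full RegExp check that would run inside the content script after the browser has done a coarser match. The `includeRegex` / `excludeRegex` fields and both function names are assumptions for illustration; the cache only avoids recompiling within one context and does not survive a service worker restart.

```javascript
// Cache of compiled patterns: regex source -> RegExp (per execution context).
const compiled = new Map();

function getRegExp(source) {
  let re = compiled.get(source);
  if (!re) {
    re = new RegExp(source); // no "g" flag, so test() keeps no state between calls
    compiled.set(source, re);
  }
  return re;
}

// Narrow down with full JS RegExp semantics (lookaheads etc. are not available in RE2).
function shouldRun(script, url) {
  const included = script.includeRegex.some((src) => getRegExp(src).test(url));
  const excluded = script.excludeRegex.some((src) => getRegExp(src).test(url));
  return included && !excluded;
}

// Hypothetical script entry: the negative lookahead is something RE2 cannot express.
const script = {
  includeRegex: ['^https://(?!ads\\.)[^/]*example\\.com/'],
  excludeRegex: ['/private/'],
};
```

With a simplified RE2 pattern registered in the browser, only URLs that pass the coarse check would ever reach `shouldRun`.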

Rob--W commented 1 year ago

@EmiliaPaz Thanks for sharing your proposal! I'll share my feedback here, because it's difficult to trace or reference discussions on Google docs, especially on a document that's still undergoing revisions.

Your stated goal of the document is "Achieve functional parity with MV2 for user script managers.". I note that just MV2 parity is a low bar. Due to the lack of supporting APIs from the browser, user script managers are built insecurely. Improving that would be great.

I'll describe the needs of user script managers in more detail, particularly focused on security and feasibility. After that I'll summarize the requirements as a starting point for the API design. I am intentionally focusing on parts that cannot be implemented by user script managers themselves, as these should be the top priority. E.g. regex matching can be implemented by user script manager code themselves, and support for that could be a future enhancement.

Background

There are many user script managers (USM hereafter), with the most popular ones being Greasemonkey, Tampermonkey and Violentmonkey. This section describes how they work, and the core primitives that they rely on today. Other than the "userscript" world, all concepts mentioned here already exist in MV2 today.

User scripts run on top of existing web pages, and need to access the DOM and/or JS variables in the page's context and/or semi-privileged user script APIs:

The following questions are relevant for the choice of world for a user script:

| Need page access | Need GM access | Ideally, the world to run in | World used by MV2 USM |
| --- | --- | --- | --- |
| no | no | User script world | Main world or isolated world (of content script) |
| no | yes | User script world, with prepared environment | Main world or isolated world (of content script) |
| yes | no | Main world | Main world |
| yes | yes | Main world (or user script world + supporting code in main world) | Main world |

When semi-privileged APIs are not needed, a USM only needs the ability to run some given string as JavaScript code. Ideally in a world distinct from the page/content script, i.e. the "userscript" world proposed in https://github.com/w3c/webextensions/issues/279#issuecomment-1255843961. When semi-privileged APIs are needed, but page access is not, likewise (with some extra work from the USM to expose the APIs).

⚠️ When both accesses are needed, the current APIs are insufficient. This is due to the fact that the execution environments of web pages are untrusted. Untrusted web pages can intercept and change the behavior of all global and built-in methods, which makes it near-impossible to securely expose semi-privileged APIs to user scripts in the main world.

⚠️ USMs currently try to work around the restrictions by running code very early in the main world (at document_start) that saves all necessary (DOM/JS) API references in local variables, under the assumption that the web page has not had a chance to modify the execution environment. This has limited success, because the assumption is not always valid (e.g. https://crbug.com/1261964 mentioned by https://github.com/w3c/webextensions/issues/279#issuecomment-1303223390).

This is generally the structure of any script that introduces semi-privileged functionality to the main world. It itself runs in the main world (the concept is also relevant outside user scripts):

(function(secrets) {
  // Step 1: Save a reference to all standard DOM API methods, to avoid tampering by the web page.
  // Step 2: Use |secrets| and DOM APIs to create a semi-secure communication channel, e.g. using `CustomEvent`.
  // Step 3: Via this semi-secure channel, retrieve necessary data such as user script metadata.
  // Step 4: For *each* user script, create the GM API interface and execute the user script:
  somehowExecuteUserScript(localVarWithGMAPI, window, etc);
})("secret communication ID here");
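To make Step 1 concrete, here is a small sketch that captures a built-in and keeps using the saved reference after the global has been replaced. `saved` and `safeStringify` are illustrative names, and the "hostile page" is simulated in the same script. As noted above, this only helps if the capture really runs before any page code.

```javascript
// Step 1 sketch: capture references before any page script can replace them.
const saved = {
  stringify: JSON.stringify,
  apply: Reflect.apply,
};

// Serialize through the saved reference, not through the live JSON global.
function safeStringify(value) {
  return saved.apply(saved.stringify, JSON, [value]);
}

// Simulate a hostile page overriding the built-in afterwards.
const realStringify = JSON.stringify;
JSON.stringify = () => '"hijacked"';
const viaGlobal = JSON.stringify({ a: 1 }); // page-controlled result
const viaSaved = safeStringify({ a: 1 });   // untouched built-in
JSON.stringify = realStringify;             // undo the simulation
```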

⚠️ The snippet above requires a random secret variable, available only to the two ends of the communication channel: the above code in the main world, and the other side in the (privileged) content script.

Ordinarily, web pages and also scripts in the main world can use window.eval to execute code. This can however not be relied upon because it may be blocked by the page's CSP (script-src without 'unsafe-eval'). The actual content script code is therefore executed by inserting an inline <script> element containing the definition of a user script (this is blocked in MV3 due to the CSP for content scripts that blocks remote code):

window.someSecretRandomName = function(GM, unsafeWindow, etc) {
  /* actual user script code here */
};

The user script is wrapped in a function, because it needs a closure with a reference to the semi-privileged API, which it receives when called by the previous script that starts with (function(secrets) {. ⚠️ Because of the reliance on the global, web pages can intercept read/writes to that value, and therefore get a reference to the semi-privileged API exposed to it. Or read the source code of the user script.
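The interception risk can be sketched with a plain object standing in for `window`. This assumes the page has somehow learned the global's name (for instance by observing the inserted script element); `fakeWindow` and `leaked` are illustrative only.

```javascript
// Stand-in for the page's global object so the sketch also runs outside a browser.
const fakeWindow = {};
const leaked = [];

// Hostile page code, running first: trap every write to the "secret" name.
Object.defineProperty(fakeWindow, 'someSecretRandomName', {
  configurable: true,
  set(fn) { leaked.push(fn); },                  // the page now holds the wrapper
  get() { return leaked[leaked.length - 1]; },   // and controls what is read back
});

// User script manager: defines the wrapper on the global, as in the snippet above.
fakeWindow.someSecretRandomName = function (GM, unsafeWindow) {
  return GM.info;
};

// The page can read the user script's source text...
const stolenSource = String(leaked[0]);
// ...and its getter could just as well hand back a function of its own,
// which would then receive the GM object when the manager calls it.
```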

Identified requirements

The following required primitives can be derived from the above overview of how user script managers work:

@EmiliaPaz 's proposal in https://github.com/w3c/webextensions/issues/279#issuecomment-1302762663 covers the first two aspects, but not the other two.

There are many different ways to implement these requirements. I will show two examples, one userscript-specific API (from Firefox) and a more generic API that has broader applicability.

Final note: the above requirements are minimal. If the goal is to move the responsibility of matching and triggering script execution from the extension to the browser, then there should also be a way to register some metadata for a script that is synchronously available. It would also be nice if there is a way for content script to veto the execution of a user script. These can be supported in the two API designs below.

Firefox's userScripts API (MV2)

Firefox's userScripts API was mentioned before in https://github.com/w3c/webextensions/issues/279#issuecomment-1261954577. I'll briefly describe the relevant parts of the API design for inspiration.

Features of the userScripts API:

In Firefox, it is possible to have asymmetric access relationships between code from different execution environments (Xray Vision). This mechanism enables user script managers in Firefox to prepare the environment beyond what's possible with defineGlobals, by assigning values directly to the execution context of the user script. User scripts cannot read/write values in the opposite direction; user script managers would have to use the export API passed to the onBeforeScript method to export the value to the user script.

The api_script from this API is created once per process as an optimization. It is not strictly necessary, so if desired this characteristic can be dropped. The onBeforeScript event can be added to a content script instead.

Note: we have marked the userScripts API as MV2-only in Firefox because the script registration mechanism is incompatible with the lifetime characteristics of background scripts in MV3. Furthermore, we would like to establish a cross-browser-compatible API for user script managers.

Scripting API with cross-world communication

To avoid confusion, this example extends the current scripting API, not the updateUserScripts API sketched in @EmiliaPaz's proposal.

// Somewhere in the extension, e.g. extension page where the user edits the user script.
chrome.scripting.registerUserScripts({
  code: `
    // This is what the user script manager generated:
    let GM = {
      info: GM_info,
    };
    let unsafeWindow = window; // In the main world, unsafeWindow is the window.
    let someRandomNumber = args[0];
    args = undefined;
    // This is the actual user script:
    console.log(hello, GM.info(), unsafeWindow);
    // ^ note: console is a global variable. The web page could have tampered with it...
  `,
  externalArgs: ["hello", "GM_info"],
  args: [4], // <-- someRandomNumber
  id: "my-userscript",
  world: "MAIN",
  matches: ["https://example.com/*"],
});

Internally, the browser could convert the above to the following:

function(args, hello, GM_info) {
/* the code string here */
}  // TODO: browser should find the hello & GM_info args and call this function

Then, before the browser runs the user script, it asks the content script for the API definition. For example:

chrome.scriptingContent.onArgRequest.addListener(arg => {
  if (arg === "hello") {
    // We must at least support primitive values (string, number, etc).
    return "hello world";
  }
  if (arg === "GM_info") {
    // We must also support functions. Potentially async (i.e. returning a Promise).
    return async () => {
      // The functions themselves may return values. TODO: define acceptable return values.
      return { plain: "object", is: "safe", but: "what", about: Function ? "callable properties" : "?" };
    };
  }
});
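A rough, synchronous simulation of the flow described above, written as plain JavaScript so it runs outside a browser. `runRegisteredScript` and the way the listener is passed in are hypothetical; a real implementation would live in the browser, would likely be asynchronous, and would not rely on `new Function`.

```javascript
// Simulation only: `registration` mirrors the shape sketched above, and
// `onArgRequest` stands in for the content script's listener.
function runRegisteredScript(registration, onArgRequest) {
  // Parameter list: `args` first, then one parameter per external argument name.
  const names = ['args', ...registration.externalArgs];
  const wrapper = new Function(...names, registration.code);
  // Ask the "content script" for the value behind each external argument.
  const resolved = registration.externalArgs.map((name) => onArgRequest(name));
  return wrapper(registration.args, ...resolved);
}

const output = runRegisteredScript(
  {
    code: 'return hello + " " + args[0] + " " + typeof GM_info;',
    externalArgs: ['hello', 'GM_info'],
    args: [4],
  },
  (name) => (name === 'hello' ? 'hello world' : () => ({ plain: 'object' }))
);
```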

To maximize utility, I suggest to support at least the following return types:

The above description consists of standard types that are well-supported across browsers. Objects with getters, setters or function properties are not supported.

Note: the above is intentionally generic to cover broader use cases such as #284.

EmiliaPaz commented 1 year ago

@Sxderp

Is there some way to provide additional functionality (GM.xhr and friends) to a user script? For example, the Firefox userScripts.onBeforeScript lets you run a callback to export functions into the execution environment. The current proposal has no such thing, so the best alternative is to concatenate some fixed string of functions to the "code" argument?

Is XHR necessary or can it be shimmed using fetch? Key constraint is XHR supports blocking requests.

EmiliaPaz commented 1 year ago

Another nasty problem is that the API proposal doesn't support @include and @exclude which are glob-based by default but can be RegExp as well, which is used by tons of userscripts and there's no good alternative for RegExp. With the current API shape any userscript with a RegExp include/exclude that runs at document_start will have to run on all sites and then the extension's content script will perform the check, which is extremely wasteful (due to the need to inject the script into every web page + the lack of code compilation cache for it) and can take a long time depending on the regexp.

Do you, or someone else, know how close we get with:

EDIT: I just realized @Rob--W commented, and included regex. Will go over it and comment back

tophf commented 1 year ago

Is XHR necessary or can it be shimmed using fetch? Key constraint is XHR supports blocking requests.

Clarification: it's GMXHR i.e. GM_xmlhttpRequest (GM.xmlHttpRequest), not XMLHttpRequest.

GMXHR cannot be polyfilled (inside the web page process) because it's cross-origin, has access to the Forbidden HTTP headers, all cookies, all response headers including Set-Cookie, and runs in extension background context.

GMXHR does not support synchronous operation in modern USM, so it's not a concern.

In Tampermonkey/Violentmonkey GMXHR uses XMLHttpRequest as the backend, but it's possible to switch to fetch mode (already implemented in Tampermonkey) at the cost of losing onprogress events or significantly complicating its implementation (this is a known problem of the Fetch API design not unique to USM).

Just matches (excludeMatches)

Covers 50% or less of userscripts, I guess. This is the current shape.

matches + globs

Covers 95% of userscripts, I guess.

matches + regex

Covers 55% or less of userscripts (50% from 1 + 5% estimate for regex from 2)

matches, globs, and regex

Covers 100%, assuming it's the full JS RegExp. For RE2 see my comment above.

derjanb commented 1 year ago

These numbers are in the correct order of magnitude.

From the userscript.zone database, which lists the most popular publicly available scripts:

Just matches

59,7%

matches + regex

64,2%

matches + globs

95,1%

only regex (for reference)

4,9%

Note: userscript.zone uses RE2 because I had strange out of memory issues with RegExp quite often and IIRC only a few scripts didn't work with RE2 syntax. I can find out the real numbers if necessary.

Rob--W commented 1 year ago

Important implementation note given the above discussion on matches + globs. Just declaring matches and globs together in the userscript definition and then re-using the same internals that the browsers already have for content scripts does NOT result in a usable API for user script managers.

brunoais commented 1 year ago

In my opinion, globs can easily be translated to RegEx. So, just having a single RegEx input for includes and a single RegEx input for excludes (exceptions to the include) should be enough to implement all cases.

Ofc, if there's more options, even better!
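A minimal sketch of that translation, assuming `*` is the only wildcard in a glob and everything else is literal; the function names are hypothetical.

```javascript
// Convert one @include-style glob into an anchored RegExp.
// Assumption: "*" matches any run of characters (including none).
function globToRegExp(glob) {
  const escaped = glob
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Fold several globs into the single RegExp input suggested above.
function globsToSingleRegExp(globs) {
  if (globs.length === 0) return /(?!)/; // an empty list matches nothing
  return new RegExp(
    globs.map((glob) => `(?:${globToRegExp(glob).source})`).join("|")
  );
}
```

For example, `globToRegExp("https://*.example.com/*")` accepts `https://www.example.com/a?b=1` but not `https://example.com/`, because the glob requires a literal dot before `example`.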

tophf commented 1 year ago

globs can easily be translated to RegEx

Yes, but that consumes a lot more memory and CPU, so it's not a good thing for ManifestV3 which is also about performance.

brunoais commented 1 year ago

The largest use of CPU with "decently made" Regex is compilation. I think the only exception is explosive Regex.

erosman commented 1 year ago

Important implementation note given the above discussion on matches + globs. Just declaring matches and globs together in the userscript definition and then re-using the same internals that the browsers already have for content scripts does NOT result in a usable API for user script managers.

* User script managers expect matches (`@matches`) and globs (`@include`) to work as a logical disjunction (OR) - the script runs if either is matched.
* Chrome (and Firefox)'s [content script APIs](https://developer.chrome.com/docs/extensions/mv3/content_scripts/#matchAndGlob) currently treat `matches` and globs as a logical conjunction (AND) - the script runs only if both are matched.
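The difference between the two behaviours can be shown with a small sketch. Here `matches` and `globs` are arrays of hypothetical predicate functions (URL in, boolean out), both assumed to be non-empty; how the predicates are compiled is out of scope.

```javascript
// Content script semantics as described above: both lists must match.
function runsWithConjunction(url, matches, globs) {
  return matches.some((m) => m(url)) && globs.some((g) => g(url));
}

// What user script managers expect: either list matching is enough.
function runsWithDisjunction(url, matches, globs) {
  return matches.some((m) => m(url)) || globs.some((g) => g(url));
}

// Illustrative predicates only.
const matches = [(url) => url.startsWith("https://example.com/")];
const globs = [(url) => url.includes("/watch")];

console.log(runsWithConjunction("https://example.com/home", matches, globs)); // false
console.log(runsWithDisjunction("https://example.com/home", matches, globs)); // true
```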

Indeed, that is an important consideration when using a register type API. This issue has been experienced in FireMonkey which AFAIK ~is~ had been the only USM using the dedicated userScripts API.

In order to support Glob @include, FM has to manually add a catch-all matches (i.e. matches: ['*://*/*', 'file:///*']) which can result in unexpected outcomes when there are mixed @matches/@exclude-matches & @include/@exclude in a userscript registration.

derjanb commented 1 year ago

[...] which AFAIK is the only USM using the dedicated userScripts API

FWIW: Tampermonkey lately uses the userScripts API as a content script replacement to allow userscripts to overcome (all remaining) CSP issues without executing them in the content script world and to allow userscripts to run in a clean execution environment if wanted. TM also uses <all_urls> to support @include.

In my opinion, globs can easily be translated to RegEx.

As well as matches. :-)
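A simplified sketch of translating a match pattern into a RegExp. It only handles the `*`, `http`, `https` and `file` schemes, ignores ports, and treats `<all_urls>` as "any of those schemes"; real browsers accept more schemes and differ in the details, so this is an illustration rather than a faithful implementation.

```javascript
// Hypothetical helper: convert a match pattern such as "*://*.example.com/*"
// into an anchored RegExp. Simplified, see the assumptions above.
function matchPatternToRegExp(pattern) {
  if (pattern === "<all_urls>") {
    return /^(?:https?|file):\/\//; // simplified scheme list
  }
  const parts = /^(\*|https?|file):\/\/(\*|\*\.[^/*]+|[^/*]*)(\/.*)$/.exec(pattern);
  if (!parts) {
    throw new TypeError(`Invalid or unsupported match pattern: ${pattern}`);
  }
  const [, scheme, host, path] = parts;
  const escape = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

  // "*" as a scheme means http or https.
  const schemeRe = scheme === "*" ? "https?" : scheme;

  let hostRe;
  if (host === "*") {
    hostRe = "[^/]*"; // any host
  } else if (host.startsWith("*.")) {
    hostRe = `(?:[^/]+\\.)?${escape(host.slice(2))}`; // the domain and its subdomains
  } else {
    hostRe = escape(host); // exact host (empty for file:///)
  }

  // In the path, "*" matches any run of characters.
  const pathRe = path.split("*").map(escape).join(".*");
  return new RegExp(`^${schemeRe}://${hostRe}${pathRe}$`);
}
```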

erosman commented 1 year ago

FWIW: Tampermonkey lately uses the userScripts API as a content script replacement to allow userscripts to overcome (all remaining)

I stand corrected... I was not aware :)

tophf commented 1 year ago

Another userscript feature that's going to be broken by ManifestV3 is GM_addElement, which allows userscripts to add a script, style, link elements bypassing the CSP of the page to add their own external stylesheets or additional code that needs to run in the MAIN world (to monkeypatch some web platform API) when the userscript itself runs in an isolated world.

It works in Chrome (not in Firefox) because this is how Chrome's ManifestV2 CSP for content scripts works. It doesn't work like this in ManifestV3, so the users will have to disable CSP of the page entirely, thus reducing the security, because there's no way for a ManifestV3 extension to modify an existing CSP and just add a nonce exception for itself - neither via the HTTP header in declarativeNetRequest nor the HTML-embedded <meta http-equiv="Content-Security-Policy" content="..."> equivalent.

Cerberus-tm commented 1 year ago

Another userscript feature that's going to be broken by ManifestV3 is GM_addElement, which allows userscripts to add a script, style, link elements bypassing the CSP of the page to add their own external stylesheets or additional code that needs to run in the MAIN world (to monkeypatch some web platform API) when the userscript itself runs in an isolated world.

It works in Chrome (not in Firefox) because this is how Chrome's ManifestV2 CSP for content scripts works. It doesn't work like this in ManifestV3, so the users will have to disable CSP of the page entirely, thus reducing the security, because there's no way for a ManifestV3 extension to modify an existing CSP and just add a nonce exception for itself - neither via the HTTP header in declarativeNetRequest nor the HTML-embedded <meta http-equiv="Content-Security-Policy" content="..."> equivalent.

Wouldn't it be better if userscripts were able to bypass the page's CSP anyway (possibly requiring a special "GM_BypassCSP")?

erosman commented 1 year ago

Here is a simple way of looking at the userscript environment.

Normally, there are 3 distinct JavaScript contexts in an extension:

There are already safeguards in place to govern above.

Userscripts require an additional context e.g. userscript

brunoais commented 1 year ago

Wouldn't it be better if userscripts were able to bypass the page's CSP anyway (possibly requiring a special "GM_BypassCSP")?

Userscripts should just be able to transparently ignore CSP in order for them to run

erosman commented 1 year ago

Security: Userscript and CSP/CORS

There are a number of considerations for userscripts when dealing with page CSP & CORS.

CORS

CSP

Sites setting strict CSP rules:

Injection by Userscript


Footnote

Traditionally, GM XHR/fetch is performed from the background script where the request is sent without any origin. Bearing in mind the removal of XMLHttpRequest in background service worker in Chrome in MV3 (ref: Cross-origin XMLHttpRequest), if the apiScript (not the userscript) could send XHR/fetch without origin, the dependency on the background script could be reduced and most GM functions could be processed in the apiScript.

dotproto commented 1 year ago

In an effort to help focus the discussion, I'd ask that members share feedback in GitHub issues rather than in linked docs. Generally speaking, docs are a point-in-time reference, not canonical resources. As such, comments in Google Docs should be limited to minor clarifying questions, and suggestions should be limited to typos and other cleanup changes.

Moving forward, I will be closing out doc comment threads and suggestions that are too large to be handled in the doc. Where appropriate, I will ask the contributor to move the comment over to GitHub issues. In a few moments I will start closing out most comments in Emilia's API proposal.

dotproto commented 1 year ago

@brunoais, you have a number of comments and suggestions in Emilia's proposal that may be better suited for longer discussion here or a new doc to iterate on this proposal. As I mentioned, I'm going to close out comments to keep Emilia's proposal more focused. If you wish to reference these comments/suggestions, I created a copy of the doc with all comments/suggestions before I started closing them out.

brunoais commented 1 year ago

@dotproto I can't find my change suggestions to move them here. I also can't find them in the clone document you did. Where can I find the suggestions I made so I can place them here for discussion?

hanguokai commented 1 year ago

Friendly reminder: Today, there is a special WECG meeting for User Scripts. 15 November 2022, 08:00 - 08:55 America/Los Angeles.

hanguokai commented 1 year ago

Whether it is useful or not, I would like to describe the positions of all parties:

brunoais commented 1 year ago

[removed offtopic]

  • Website Owners: I don't like third-party injected code, is there any way to limit them?

Interesting... As a website owner, I actually welcome them. They mean more features and less work for me, and users can choose the extra features they want, at their own risk, by installing the userscripts they want. Are website owner complaints common enough to cause that?

erosman commented 1 year ago

People often mistake extension code with user code.

A USM simply enables users to run scripts chosen by the users.

An installed USM does nothing, runs no remote code, even if left forever.

It is the user that instructs the USM to run the script selected by the user. It is purely users' choice which must be honoured.


Footnote

While most of the discussion has been about userScripts, userStyles/userCSS are also part of this discussion.

hanguokai commented 1 year ago

@erosman Your comment doesn't conflict with what I said. I didn't say a USM is not an extension.

hanguokai commented 1 year ago

@erosman I clearly distinguished between user script and user script manager. In that statement, I said "user script", not "user script manager". User scripts are written by user script developers (usually not the users themselves).

erosman commented 1 year ago

@hanguokai You are right. I misread. I thought you were referring to the userscript managers. Sorry about that. :man_facepalming: _(I removed the post that was made in error to tidy up the topic.)_

I have heard the "USM vs other extensions" argument so many times that my mind went straight to it.

Back to the subject in the comment..... user scripts

Therefore, any such comparison between an extension & a userscript is irrelevant.

Please note:

hanguokai commented 1 year ago

Therefore, any such comparison between an extension & a userscript is irrelevant.

When the extension meets the following conditions, I think user script and extension are the same: