Proposal: Add `ancestorMatches`/`excludeAncestorMatches` in scripting API

gorhill commented 3 months ago

Proposal: Add ancestorMatches/excludeAncestorMatches as conditions in RegisteredContentScript of scripting API

Background

For similar reasons as to the existence of allowAllRequests in the DNR API, there is a need to exclude content script injection in subdocuments when a specific ancestor document is being excluded from filtering.

Example of the issue, let's posit:

a.com is a main document loaded in a browser tab
b.com is a main document loaded in another browser tab
m.com is a frame embedded in a webpage from a.com and a webpage from b.com
n.com is a frame embedded in a webpage from m.com

Content script m_and_n.js is registered through the scripting API to be injected in documents and subdocuments from m.com and n.com:

{
    ...
    allFrames: true,
    js: [ "m_and_n.js" ],
    matches: [ "*://*.m.com/*", "*://*.n.com/*" ],
    ...
}

User chooses to exclude a.com from being filtered.

A DNR allowAllRequests rule for a.com allows to exclude DNR fitlering for all resources pulled by a.com, including those from embedded documents from different origin.

However, this does not work for injected content scripts: we can't unregister content script m_and_n.js to accommodate exclusion of a.com from filtering because this would also cause exclusion of m_and_n.js for b.com (and all other origins), which is not meant to be excluded from filtering.

Not being able to exclude a registered content script from a specific ancestor domain leads to unsolvable issues in uBO Lite extension, see https://github.com/uBlockOrigin/uBOL-home/issues/20

Rob--W commented 3 months ago

Would you still need this feature if Firefox were to support location.ancestorOrigins? Chrome already supports this, Firefox support may be added in the future at https://bugzilla.mozilla.org/show_bug.cgi?id=1085214

MDN: https://developer.mozilla.org/en-US/docs/Web/API/Location/ancestorOrigins

gorhill commented 3 months ago

Would you still need this feature if Firefox were to support location.ancestorOrigins?

I wasn't aware of this property.

However I don't see that it is usable in the current issue in uBOL because the injected content scripts are static and declaratively injected, and as such the content script code has no knowledge of which origins are currently being excluded from filtering by the user.

Rob--W commented 3 months ago

There are ongoing discussions here about a way to dynamically declare values to become available synchronously to content scripts, e.g.:

Even without new APIs, it is already possible to fetch state dynamically with synchronous XHR and configure the response with declarativeNetRequest.updateDynamicRules/updateSessionRules (with defaults provided via static rules). I admit that this is a hack, but potentially viable for infrequently used configuration. E.g. the NoScript extension uses this technique (well synchronous XHR+webRequest, not specifically DNR). Also you may need a way to have a sufficiently secret URL that cannot be guessed by web pages to avoid information leakage.

In any case, back to the original question: would a way to synchronously retrieve values in content scripts + location.ancestorOrigins be a satisfactory solution to your use case?

gorhill commented 3 months ago

would a way to synchronously retrieve values in content scripts + location.ancestorOrigins be a satisfactory solution to your use case?

This specific comment in the proposal aligns more with solving the specific issue I am facing in uBOL:

Need to register one single list of excluded origins, accessible to all injected content scripts, whether in MAIN or ISOLATED world without leaking information
Regarding the code example provided, chrome.scripting.setParams({ tabId, data: { message: "Hello World" } }), uBOL wouldn't want to use tabId since the content of data needs to be available to content scripts everywhere, so tabId being optional would work

In such case, then yes, I wouldn't mind uBOL's content scripts do the exclusion work, there is already such work done (for example).

hackademix commented 3 months ago

Even without new APIs, it is already possible to fetch state dynamically with synchronous XHR and configure the response with declarativeNetRequest.updateDynamicRules/updateSessionRules (with defaults provided via static rules). I admit that this is a hack

A hack now causing some pages to break badly. Just reported for the first time, seems to affect recent Firefox. Hell of a coincidence :(

Rob--W commented 2 months ago

Discussed in today's meeting, meeting notes pending review at #671.

One question is on use cases - what is the extent of the anticipated use of this requested API? If it is specialized and use cases relatively minor, location.ancestorOrigins sounds sufficient.

gorhill commented 2 months ago

what is the extent of the anticipated use of this requested API?

I would say the rationale for supporting ancestorMatches/excludeAncestorMatches is essentially the same as the one to support DNR's allowAllRequests? This is to support disabling content scripts in embedded contexts according to whether content scripts in an ancestor are meant to be disabled.

carlosjeurissen commented 1 month ago

A big benefit would be present in the case of style injections declarative in frames with specific ancestor matches. It would be beneficial to have a way to specify the extensions origin as allowed ancestor origin.

Currently in some extensions I would need to set a global css flag (like a class on the html tag) using JavaScript to make sure it is only applied when intended. It would also save a lot of unnecessary injection of content scripts in top level documents. This would address https://github.com/w3c/webextensions/issues/117

w3c / webextensions

Proposal: Add `ancestorMatches`/`excludeAncestorMatches` in scripting API #668

Background