Thanks for writing this up! It definitely seems like a use case that we haven't fully solved yet, and I'm looking forward to continuing to discuss it.
I came across this page which helped me understand the motivation for CSS rules beyond element hiding: https://adguard.com/kb/general/ad-filtering/create-own-filters/#cosmetic-css-rules
Per-site CSS rules were once implemented but later deprecated.
The @document CSS at-rule restricts the style rules contained within it based on the URL of the document. It is designed primarily for user-defined style sheets, though it can be used on author-defined style sheets, too.
Rules could be applied with url(), url-prefix(), domain(), media-document(), and regexp().
Firefox initially supported the above under @-moz-document.
See also: Per-site user style sheet rules
Let me please address a few comments from the meeting minutes.
I am much more concerned about scriptlets than about CSS rules, and the reason is simple: scriptlets are the only way to get rid of ads on many websites, the most prominent one being YouTube.
[rob] A CSS selector can easily match everything; how would that reduce the required permissions? Effectively the proposal with JS would execute JS everywhere.
@Rob--W Regarding JS, please see the explanation below, we do not suggest allowing arbitrary JS.
[simeon] In the proposal as written, there are placeholders for scriptlets, but nothing in the API to register scriptlets. But as Tomislav mentioned, it's probably best to defer the scriptlets to the future.
@dotproto good catch, the proposal indeed does not mention one of the main points. We do not propose allowing developers to register scriptlets. On the contrary, scriptlets should only be provided by the browsers themselves; this is the only way to make them safe to use.
Kind of like what Mozilla does with shims used by tracking protection: https://searchfox.org/mozilla-central/source/browser/extensions/webcompat/shims
We once opened a similar feature request for WebKit, it explains why they're required and I still hope WebKit devs will get back to this and consider it: https://bugs.webkit.org/show_bug.cgi?id=225861
Note that a scriptlet can come with a set of limitations. For instance, set-constant does not allow setting arbitrary values, only numbers/booleans.
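To make the idea of a restricted scriptlet more concrete, here is a minimal sketch of how a browser-provided set-constant could validate its value argument before applying it. The function name parseConstant and the exact allow-list are assumptions for illustration; undefined and null are included only because the set-constant example further down this thread passes undefined.

```javascript
// Hypothetical value check for a "set-constant"-style scriptlet.
// The allow-list below is an assumption, not the actual AdGuard/uBO list.
const ALLOWED_KEYWORDS = new Map([
  ["true", true],
  ["false", false],
  ["undefined", undefined],
  ["null", null],
]);

function parseConstant(raw) {
  if (ALLOWED_KEYWORDS.has(raw)) {
    return { ok: true, value: ALLOWED_KEYWORDS.get(raw) };
  }
  // Accept plain decimal numbers only; anything else (strings, code) is rejected.
  if (/^-?\d+(\.\d+)?$/.test(raw)) {
    return { ok: true, value: Number(raw) };
  }
  return { ok: false };
}
```

Because the argument is mapped onto a closed set of values, a rule author cannot smuggle code or arbitrary strings into the page through this scriptlet.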
[rob] A CSS selector can easily match everything; how would that reduce the required permissions? Effectively the proposal with JS would execute JS everywhere.
[timothy] It is part of our Content Blocking API, a display:none CSS rule can be applied if the domain, etc. matches. We restrict it to display:none for privacy reasons, anything more, even color changes is not possible. The implementation is optimized using the same mechanism that we also use to block network requests (and backs Safari's declarativeNetRequest API).
[timothy] I wouldn't want to support arbitrary CSS without additional permissions. If visibility:hidden is common we can consider that, but anything more than that or display:none.
@Rob--W @xeenon @dotproto
Those are all valid points, arbitrary CSS can indeed be dangerous.
Our own use case is rather limited and does not require CSS to be arbitrary, a subset of allowed CSS properties would suffice.
Here are some examples:
overflow is very often required to solve issues with blocked popups that at the same time disable scrolling on the page. Surprisingly, it seems to be the most popular property, with about 1900 rules in AdGuard filters.
display, visibility - we sometimes need to "unhide" something and not just hide it.
background, background-image - we need an option to remove the background, as it is often used for ad placements. There is no need to set a custom background, just to remove it.
[simeon] It's a bit trickier than that. This was a consideration for CSP in content scripts in Chrome. One of the concerns with remote CSS is data exfiltration through selectors matching input fields for example. Just worth noting that arbitrary CSS can be more dangerous than it seems.
@dotproto
Could it be that you're talking about using the content property in addition to these selectors, or maybe background, etc.? The point is that there is only a limited number of properties that can be used to exfiltrate arbitrary data, and they can be restricted in the API spec.
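As a rough illustration of what "restricted in the API spec" could mean, the sketch below validates a single CSS declaration against an allow-list of properties and keyword values. The property names come from the examples above; the value lists and the function name isDeclarationAllowed are assumptions, not part of any proposal.

```javascript
// Hypothetical allow-list: property -> permitted keyword values.
// Only keywords are accepted, so url(), content strings, etc. can never pass.
const ALLOWED_VALUES = new Map([
  ["overflow", new Set(["visible", "auto", "scroll", "hidden"])],
  ["display", new Set(["none", "block", "inline", "inline-block", "flex"])],
  ["visibility", new Set(["visible", "hidden"])],
  ["background", new Set(["none"])],
  ["background-image", new Set(["none"])],
]);

function isDeclarationAllowed(property, value) {
  const allowed = ALLOWED_VALUES.get(property.trim().toLowerCase());
  if (!allowed) return false;
  // Tolerate a trailing "!important", which filter rules commonly use.
  const keyword = value.trim().toLowerCase().replace(/\s*!\s*important$/, "");
  return allowed.has(keyword);
}
```

An allow-list of values (rather than a deny-list of dangerous functions) keeps the check simple: anything that could trigger a network request is rejected by construction.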
[simeon] Curious about browser vendors' perspective. Some DNR actions (block, upgradeScheme) do not require host permissions, but others (modifyHeaders) require host permissions. Should this pattern be followed?
@dotproto the problem with this point is that when an extension has host permissions, it can achieve the same result with a content script. With DNR the situation is different, we don't have any alternative way to implement the required functionality in an MV3 Chrome extension.
[rob] Chrome's DNR API automatically hides some elements (e.g. images) when a request is blocked. How does that work in Safari, and how would that play with this API? [timothy] Not aware of that. Safari does not do that. [rob] Firefox's DNR implementation does not do that either.
@Rob--W @xeenon This behavior was one of the first things requested from the Chrome team when DNR was introduced. Please consider doing that.
We discussed this during the previous meeting and I was asked to provide some scriptlet examples.
First of all, regarding scriptlets, we propose for browsers to provide a small library of declarative "shims" that will be injected into the page. The pages where they will be injected should be defined in a declarative way with an API similar to DNR or maybe the DNR itself.
In AdGuard a scriptlet rule looks like this:
domain1.com,domain2.com#%#//scriptlet("scriptlet name", "argument1", "argument2")
uBlock Origin uses a similar concept but with a slightly different syntax:
domain1.com,domain2.com##+js(scriptletName, argument1, argument2)
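To show that both syntaxes carry the same information (a domain list, a scriptlet name, and arguments), here is a small parsing sketch. It is a simplification, not the parser either project uses: parseScriptletRule is a hypothetical name, and escaped commas or quotes inside uBlock Origin arguments are not handled.

```javascript
// Parse an AdGuard or uBlock Origin scriptlet rule into a common shape.
function parseScriptletRule(rule) {
  const adguard = /^(.*?)#%#\/\/scriptlet\((.*)\)$/.exec(rule);
  const ubo = /^(.*?)##\+js\((.*)\)$/.exec(rule);
  const match = adguard || ubo;
  if (!match) return null;

  const domains = match[1].split(",").map((d) => d.trim()).filter(Boolean);
  let parts;
  if (adguard) {
    // AdGuard arguments are quoted strings separated by commas.
    const quoted = /"((?:[^"\\]|\\.)*)"|'((?:[^'\\]|\\.)*)'/g;
    parts = [...match[2].matchAll(quoted)].map((m) => m[1] ?? m[2]);
  } else {
    // uBO arguments are unquoted; a naive comma split is enough for a sketch.
    parts = match[2].split(",").map((a) => a.trim());
  }
  if (parts.length === 0 || !parts[0]) return null;

  const [name, ...args] = parts;
  return { syntax: adguard ? "adguard" : "ubo", domains, name, args };
}
```

A declarative API would presumably accept the already-parsed form (domains, name, arguments) rather than either text syntax.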
Here's a list of scriptlets which cover ~80% of existing rules (each linked to its description):
Note that there are thousands of scriptlet rules in AdGuard and uBlock Origin filters; here are just a few examples. Please let me know if you need more.
json-prune
youtube.com,youtube-nocookie.com##+js(json-prune, [].playerResponse.adPlacements [].playerResponse.playerAds playerResponse.adPlacements playerResponse.playerAds adPlacements playerAds)
YouTube loads video metadata JSON alongside ads metadata in a single request. This rule removes the parts of the JSON that contain ads metadata. The json-prune scriptlet overrides two functions in order to intercept those JSON payloads:
JSON.parse
Response.prototype.json
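A much simplified sketch of the json-prune idea follows: wrap a parse function and delete the listed property paths from every parsed result. The names installJsonPrune and prunePath are hypothetical, the path handling ("." separators and "[]" for array elements) only approximates the real syntax, and the jsonImpl parameter exists so the sketch can be tried without patching the global JSON object. Response.prototype.json could be wrapped in the same way and is omitted here.

```javascript
// Remove one property path (already split into segments) from a parsed value.
function prunePath(node, segments) {
  if (segments.length === 0 || node === null || typeof node !== "object") return;
  const [head, ...rest] = segments;
  if (head === "[]") {
    // "[]" means "every element of this array".
    if (Array.isArray(node)) node.forEach((item) => prunePath(item, rest));
    return;
  }
  if (!Object.prototype.hasOwnProperty.call(node, head)) return;
  if (rest.length === 0) {
    delete node[head];
  } else {
    prunePath(node[head], rest);
  }
}

// Wrap jsonImpl.parse so that every parsed result is pruned.
// Returns a function that restores the original parse.
function installJsonPrune(paths, jsonImpl = JSON) {
  const parsedPaths = paths.map((p) => p.split("."));
  const originalParse = jsonImpl.parse;
  jsonImpl.parse = function (...args) {
    const result = originalParse.apply(this, args);
    for (const segments of parsedPaths) prunePath(result, segments);
    return result;
  };
  return () => {
    jsonImpl.parse = originalParse;
  };
}
```

For example, calling installJsonPrune(["playerResponse.adPlacements", "[].playerResponse.playerAds"]) before the page scripts run would strip those keys from anything the page later parses.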
set-constant
youtube.com,youtube-nocookie.com##+js(set-constant, ytInitialPlayerResponse.adPlacements, undefined)
When you load a YouTube page with a video for the first time, there's a JSON object ytInitialPlayerResponse initialized inside an inline script. This object contains ads metadata which this rule removes.
abort-on-property-write
[many domains...]#%#//scriptlet("abort-on-property-write", "_pop")
Aborts a popular popup script. Such scripts are served from random domains, and this scriptlet takes care of them for good, even when a domain is not blocked yet.
Example: gledajcrtace.xyz
abort-on-property-read
[many domains...]#%#//scriptlet("abort-on-property-read", "BetterJsPop")
Aborts another very popular script that shows popup ads. It is usually used as an inline script.
Example: https://upvideo.to/v/jfiqnfdkwqpd
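The two abort scriptlets above can be sketched with property traps. This is an illustration under assumptions, not the shipped implementation: the function names are hypothetical, and the real scriptlets additionally handle nested property chains and suppress the resulting error in the console.

```javascript
// Throw as soon as a page script tries to assign target[prop],
// which stops the rest of that script from running.
function abortOnPropertyWrite(target, prop) {
  Object.defineProperty(target, prop, {
    get() {
      return undefined;
    },
    set() {
      throw new ReferenceError(`Aborted write to ${prop}`);
    },
  });
}

// Throw as soon as a page script tries to read target[prop].
function abortOnPropertyRead(target, prop) {
  Object.defineProperty(target, prop, {
    get() {
      throw new ReferenceError(`Aborted read of ${prop}`);
    },
    set() {
      // Writes are silently ignored so the trap stays in place.
    },
  });
}
```

In a page, target would be window, e.g. abortOnPropertyWrite(window, "_pop"). Since the trap has to be installed before the ad script runs, injection timing matters a lot for these scriptlets.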
Mozilla is generally in favor of pursuing this, while understanding that there are a lot of details here that need to be worked out. At least the simpler/safer CSS part, with the script part split into a separate issue.
We're definitely interested in this from the Chrome side as well - although it may not be something we work on short term. At the moment it feels like this would make more sense as a separate API vs. an addition to DNR, since this does not operate at the network level and likely has some different requirements. That's something we can figure out though as we build up some use cases and desired functionality.
The issue was discussed during the WECG in-person meeting.
Apple folks would like to write a formal proposal.
Forgot to add one more thing that was also discussed.
Chrome's stance on this issue is basically: "we like it, but we don't have resources to implement it short term".
Once the formal proposal is there, we (AdGuard) want to write a cross-browser polyfill of this new API so that developers can already familiarize themselves with it. Of course, unlike the proposed API, the polyfill would require extensive permissions.
Whether this is implemented or not, we need a way to quickly and dynamically update cosmetic and scriptlet filters when problems happen or ads slip through on very popular sites like YouTube, Twitter, Facebook, etc. For example, Twitter has changed its domain to x.com, and while that is no problem at all for users of an MV2 blocker, users of uBOL are seeing ads.
https://github.com/uBlockOrigin/uAssets/issues/23732#issuecomment-2117977871
@Yuki2718, thanks for flagging that. I definitely think we would want to support dynamic cosmetic rules in line with the dynamic ruleset support we have in the Declarative Net Request API.
Short term, do you know what options uBOL has tried in Manifest V3? For example, an option mentioned at the start of this issue is using messaging from the content script to the service worker to get additional rules. This can't be done synchronously, which is why I still think a new API would be helpful long term - however the injection in MV2 also wasn't fully synchronous and I suspect that at least for additional rules added dynamically (and in particular for modals like the x.com one that aren't present on page load) it may be sufficient.
Sorry IDK, @gorhill will know better. Sure, an MV2 blocker also requires manual refreshing to apply updated filters, but in any case users don't need to wait for an update of the extension itself, which is my main point.
DeclarativeCosmeticRules API proposal
Background
Cosmetic rules in content blockers
Cosmetic rules can be divided into three groups: element hiding rules, CSS rules, and scriptlets.
Element hiding rules can be used to hide various elements on web pages, such as advertisements, pop-ups, banners, and other unwanted content. By defining the CSS selectors of these elements, users can hide them from view for a more pleasant browsing experience. Technically, element hiding rules inject a CSS display:none style into the page for a given element.
CSS rules can be used to add different styles to DOM elements. Technically, CSS rules inject a custom CSS style into the page. There are some restrictions on what styles can be injected, e.g. you cannot use a style that loads additional resources. For example, url(), etc.
Scriptlets can be used to modify JS behavior, abort retrieval of some props, speed up timers, abort inline scripts, remove DOM element attributes or classes, etc. Technically, scriptlets change the behaviour of the page by executing small named JS functions that come with the extension. Example: abort-property-read(propName).
Main issues
Permissions. Content blocking extensions require wide permissions, mostly to apply cosmetic rules.
Timing. Content blocking extensions would like to apply cosmetic rules as quickly as possible, that is, before the page loads and page scripts start executing. With the current approach, there is a slight delay. It would be ideal if the new API applied the rules after the CSSOM and DOM trees are built and merged, and before the layout step.
How are cosmetic rules applied in MV2 and MV3?
MV2
The extension needs to inject scripts and styles as early as possible for a smoother user experience (e.g. to avoid blinking DOM elements). It also needs to patch scripts before websites can copy DOM API methods. This forced the extension to use a rather sophisticated way of injecting scripts and styles based on events fired by the webRequest and webNavigation APIs. In short, at webRequest.onHeadersReceived, when the first information about the request is received, the extension asks the engine for the rules related to the current request and prepares the styles and scripts to inject. As the engine is already running, this information can be obtained very quickly. At webRequest.onResponseStarted, the extension tries to inject the scripts prepared in the previous step using tabs.executeScript. This event is not reliable, so at webNavigation.onCommitted the extension will inject the scripts again if they weren't injected before. Along with the scripts, the extension will also inject CSS styles using tabs.insertCSS.
So to inject cosmetic rules we have to ask for the following permissions:
tabs - tabs.insertCSS to insert styles and tabs.executeScript to inject scripts
webRequest - to listen for events
webNavigation - to listen for events
<all_urls> - because we need to inject scripts and styles into all pages
And these permissions are pretty powerful.
MV3
Extensions built on top of MV3 inject scripts using the scripting API and use a content script for styles. To inject scripts, the extension subscribes to the webNavigation.onCommitted event and injects scripts when this event fires. To inject styles, the extension uses a content script. The content script is injected into every page and requests the styles from the background page via messaging.
So to inject cosmetic rules we have to ask for the following permissions:
scripting - scripting.executeScript to inject scripts and scriptlets
webNavigation - to listen for events and inject scripts in time
<all_urls> - because we need to inject scripts and styles into all pages
content_script - not a permission, but a way to inject styles into the page
Why not use a content script to inject the cosmetic rules?
In order to insert styles and scripts selectively, we need to launch the engine to search for the rules suitable for this website only. Launching the engine takes some time; if the engine were used in the content script, it would be launched for each website separately. This would lead to significant performance degradation, because a large script containing a lot of rules would have to be compiled every time. Alternatively, the engine could be launched in the background page or service worker, but this would still require time for messaging between the background page and the content script.
How many cosmetic rules are there?
Element hiding rules are one of the most popular rule types - for example, AdGuard's Base filter contains 98500 rules, 24800 of which are element hiding rules.
CSS rules and scriptlets are less common. However, they are still very popular among filter developers, especially in some difficult cases. Scriptlet rules make up 3000 rules and cosmetic CSS rules make up 1500 rules in the AdGuard Base filter.
Goal
MV3
One of the goals of MV3 is to make extensions have fewer permissions by default, and to make maximum permissions optional.
Proposal goal
The goal of this proposal is to make cosmetic rules declarative. This will allow us to remove the tabs and webRequest permissions from the extension manifest. This will also allow us to remove the <all_urls> permission from the extension manifest. Finally, it would allow us not to inject a content script into every page.
To avoid reinventing the wheel, we took the Declarative Net Request API as an example and tried to model the logic on it, to take advantage of pre-built declarative CSS rules.
And, as with the DNR API, we need the ability to dynamically change these rules (https://github.com/w3c/webextensions/issues/162) - for CSS rules it's doubly important.
API
This section needs to be improved and expanded, but first we want to get feedback on the general idea.
API schema
Declarative element hiding rules
Here and below you will find some examples of its use.
See - https://adguard.com/kb/general/ad-filtering/create-own-filters/#cosmetic-elemhide-rules
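To illustrate what a browser could do with declarative element hiding rules, here is a sketch that selects the rules matching a hostname and turns them into display:none styles. The rule shape ({ selector, domains, excludedDomains }) and the function names are purely hypothetical and are not the schema of this proposal.

```javascript
// A domain condition matches the hostname itself and all of its subdomains.
function domainMatches(hostname, domain) {
  return hostname === domain || hostname.endsWith("." + domain);
}

// Build the CSS a browser might inject for the given hostname.
// Hypothetical rule shape: { selector, domains?, excludedDomains? }.
function buildHidingCss(rules, hostname) {
  return rules
    .filter((rule) => {
      const included =
        !rule.domains || rule.domains.some((d) => domainMatches(hostname, d));
      const excluded = (rule.excludedDomains || []).some((d) =>
        domainMatches(hostname, d)
      );
      return included && !excluded;
    })
    // One CSS rule per selector, so a single invalid selector cannot
    // invalidate the others (as it would in a comma-separated list).
    .map((rule) => `${rule.selector} { display: none !important; }`)
    .join("\n");
}
```

The extension only supplies data here; the matching and injection are done by the browser, which is what would let the extension drop its host permissions for this use case.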
Declarative CSS rules
See - https://adguard.com/kb/general/ad-filtering/create-own-filters/#cosmetic-css-rules
Declarative scriptlet rules
See - https://adguard.com/kb/general/ad-filtering/create-own-filters/#scriptlets
API to manage rules dynamically
// TODO