Closed peace2000 closed 3 weeks ago
@felix-22 thanks for links!
Yeah adding :not(body):not(html)
after each generic entry systemically, would pre-emptively help and would completely solve this issue, I even suggested that same thing recently: https://github.com/easylist/easylist/issues/8367 though it would bloat filterlist a bit. I don't know about performance impact. But encountering this same issue over and over again (hundreds of :not(html)
and :not(body)
exceptions) feels unnecessary as it would be possible to do pre-emptive measures.
The best thing to do is to adjust blockers not to block body
and html
elements. The question is, could all major blockers work together to achieve that?
The best thing to do is to adjust blockers not to block
body
andhtml
elements. The question is, could all major blockers work together to achieve that?
I hope so:
https://github.com/AdguardTeam/AdguardBrowserExtension/issues/1845 https://gitlab.com/eyeo/adblockplus/abc/adblockpluscore/-/issues/361
The best thing to do is to adjust blockers not to block
body
andhtml
elements.
Majority of CSS selector-based cosmetic filters are enforced through user styles, this means internally each of those filters would need to be suffixed with :not(html):not(body)
, while ensuring they are still properly reported in the logger. It's not trivial.
What if this is taken care of at the DOM surveyor level?
- pendingNodes.add(document.querySelectorAll('[id],[class]'));
+ pendingNodes.add(document.querySelectorAll(':not(html):not(body)[id],:not(html):not(body)[class]'));
I'm guessing that this would cause issues for selectors containing descendant and child combinators.
If this were to be implemented, it still should be possible to be able e.g. to target html
or body
via different styling rules or via scriplets, the only thing that should be prevented is using display: none !important
to them.
it still should be possible to be able e.g. to target
html
orbody
via different styling rules or via scriplets
The way I see it the automatic exclusion of html
/body
should apply only to generic cosmetic filters which are made of a single class or id identifier, so this means you would still be able to target html
or body
by using these explicitly.
I'm guessing that this would cause issues for selectors containing descendant and child combinators.
This is an interesting idea. The issue of combinators maybe could be solved by having a separate reporting for ids/classes taken directly from html
/body
elements.
I'm guessing that this would cause issues for selectors containing descendant and child combinators.
To clarify, let's say there's a document like this:
<body>
<article></article>
</body>
And a filter like ##.page-body > article
.
The document uses JS to add the page-body
class to the body
element.
If the DOM surveyor ignores the body
element, the filter will never work?
If the DOM surveyor ignores the
body
element, the filter will never work?
When I say "separate reporting", I mean the ids/classes of html
/body
elements would be reported in special properties (not through these), and treated differently in the background process -- for these uBO would only look-up the complex set, and thus would be able to find and apply something like .page-body > article
.
which are made of a single class or id identifier
There are some generics that have multiple values (samples from Fanboy's Annoyances):
##.js-stickyFooter.u-bottom0.u-fixed
##.u-zIndexMetabar.u-fixed
##.article-section > .ui-button-close
###coiOverlay[role="banner"][style*="flex"]
Though I agree that they are not very common.
Installed 1.37.3b15 and tested with the site (webshop.elektroskandia.no
) and with the sample rule I gave (##.consent-summary-shown
), works fine, thanks!
works fine, thanks
Thanks to @mjethani, this was a good idea to look at the content script angle -- until then I was looking at the background process angle which required a whole lot of changes -- it turns out the changes ended up being relatively trivial.
Fanboy's Annoyances
If I enable that list with the generic filter you posted, the issue can be repro'd again..
Yep, strange... I can reproduce too.
If both of these:
##.consent-summary-shown
##.consent-summary-shown:not(body)
Are present, the site goes blank.
Investigating.
Ok, I figured why this happens, investigating best fix.
Tested v.16, works without an issue!
Wait...
This form still blanks the site:
##[class*="consent-summary-shown"]
Imo it would be safer to ignore also rules that match to the html
or body
and have either of these selectors: *
, ^
, $
, ~
, |
.
Sample rules from Fanboy's Annoyances:
##div[class*="NewsletterSubscribe_"]
##div[class*="newsletter-signup_"]
##div[class^="BackToTop_"]
##div[class^="backToTop_"]
Only simple #id
and .class
are ignored.
It would make sense to also prevent highly generic simple cosmetic filters from affecting body
/html
. These filters are not a result of surveying though, so another approach to fix this is required.
Would there be any perf penalty for not applying to body/html?
Would there be any perf penalty for not applying to body/html?
Only benchmarking/measuring can tell, and various blockers may be affected differently, i.e. uBO unconditionally injects only a small subset of all generic filters (the "highly" generic ones), while ABP unconditionally injects all of them.
Would there be any perf penalty for not applying to body/html?
Regarding the current solution, there are at least three ways to do it:
:not(html):not(body)
suffix (current)body
prefixhtml
and body
nodes in JS Use
body
prefix
@mjethani Some websites inject stuff between head
and body
, so #2 would not be a perfect solution.
… ABP unconditionally injects all of them.
Aside from higher CPU usage, this also causes a very high amount of memory usage. The last time I checked, using ABP causes your system to use more memory than if you didn't use an ad blocker. :not(html):not(body)
might not be practical over there.
Use
body
prefix
Filter list author would disagree with this, there are cases of sites injecting elements as child of html
.
Filter out
html
andbody
nodes in JS
Not sure what you mean, the rules are injected as user styles, JS is of no use in such case. I did consider having user styles to unhide body
/html
, but this does not really work as the blocker can't know which display
style to assign the body
element, a site could well use flex
as value.
As far as uBO is concerned, :not(html):not(body)
is the only valid solution for highly generic cosmetic filters, and would be quite trivial if there was no logger or DOM inspector.
I know roughly the path to implement this, but for the second part of dealing with highly generic cosmetic filters, this will have to wait for the next dev cycle. At least for now the most likely to cause false positives are being mitigated.
The last time I checked, using ABP causes your system to use more memory than if you didn't use an ad blocker.
Right, I tweeted about this a few months ago: https://twitter.com/gorhill/status/1345042527171325953.
I do not worry using :not(html):not(body)
in uBO since it does not unconditionally inject all generic cosmetic filters. In any case, only benchmarking/measuring once the changes are made will tell whether this is an issue.
Filter out
html
andbody
nodes in JSNot sure what you mean, the rules are injected as user styles, JS is of no use in such case.
I was referring to the current solution with querySelectorAll()
. It's possible to get all the nodes and then filter the result in JS. But it's unlikely that this would perform better than :not(html):not(body)
.
In any case, only benchmarking/measuring once the changes are made will tell whether this is an issue.
+1
Using querySelectorAll()
with highly generic cosmetic filters has caused performance issue in the past, see https://github.com/uBlockOrigin/uBlock-issues/issues/756, this is why the hidden element badge count in the popup panel executes only on-demand.
If the DOM surveyor ignores the
body
element, the filter will never work?
This is what I meant:
<html>
<body>
<div class="bar">
Hello.
</div>
<script>
setTimeout(() => {
document.body.className = 'foo';
},
3000);
</script>
</body>
</html>
Now I've added the filter ##.foo .bar
and it never gets applied. This is because body
is being ignored entirely, even though the target of the filter is the div
element.
The filter ##.foo .bar
should work, right?
I did consider having user styles to unhide
body
/html
, but this does not really work as the blocker can't know whichdisplay
style to assign thebody
element, a site could well useflex
as value.
I thought about this too. Aside from not knowing what value to use for the display
property, it's also that a site might want to hide the body
element for whatever reason, and this would force it to be shown at all times. Because a user rule with !important
would always override any author rule.
@mjethani Some websites inject stuff between
head
andbody
, so #2 would not be a perfect solution.
@kiboke if it's really outside body
it wouldn't be visible, as far as I can tell.
<html>
<body>
Body.
</body>
<script>
setTimeout(() => {
let e = document.createElement('p');
e.innerText = 'Hi!';
document.documentElement.appendChild(e);
},
1000);
</script>
</html>
@kiboke if it's really outside
body
it wouldn't be visible, as far as I can tell.
@mjethani I just tried the code in Firefox and I do see Hi!
The filter
##.foo .bar
should work, right?
Yes, but this is an edge case for uBO because its mutation observer does not listen to attribute changes. This is why :watch-attr()
exists, to solve such case:
...##body.foo:watch-attr(class) .bar
I looked at revert
too.
Suppose this is the user style sheet:
.consent-summary-shown {
display: none !important;
}
body.consent-summary-shown {
display: revert !important;
}
I suspect that revert !important
still overrides any value in the author style sheet.
It's also not clear how this would work with multiple user style sheets.
would always override any author rule.
Then webmasters maybe start promote legacy userContent.css
- win with addon API.
still overrides any value in the author style sheet.
IMO this is better implementation that oldie auto
/ initial
. If I try reset cursor: revert
I see useage CSS embeded into "browser" after hover on many elements instead photo or webaster idea for cursor.
So can break flexbox/grid pages too.
Default CSS Settings
Most browsers will display the
<body>
element with the following default values:body { display: block; margin: 8px; } body:focus { outline: none; }
This is why
:watch-attr()
exists, to solve such case
this explains part of the situations when on my PC the filter didn't hide the <body>
and on gwaser PC it always did and then when on his it never hid and on mine it almost always in 2018-2020.
Maybe * aka all URL-s
also need similar disable breakage.
Now possible use *
to avoid create specific filter with :watch-attr()
if someone must check ID/Classes/attribs injected very late into dom tree/main nodes.
E.g.:
https://czyodebrac.pl/co-to-za-numer-dzwonil/2147483647/
on page no longer works these generic filter:
##.cli-modal-open #cookie-law-info-bar
#cookie-law-info-bar
is not body/html tag but "<div>
".
or have very Race Conditon in uBO 1.37.3rc0, but 1.37.2 injected filter almost immediately.
High profile case involving highly generic cosmetic filter: https://github.com/uBlockOrigin/uAssets/issues/11244.
Highly generic cosmetic filters should not longer affect html
/body
elements. To confirm, I used the original steps-to-reproduce, except with the following filter:
##[class*="consent-summary-shown"]
I would propose to add article tag to html and body. Should I open a new issue?
Should I open a new issue?
No, doing what you suggest would just allow advertisers to use article
to bypass cosmetic filters -- there is no restriction on the number of article
tags in a page.
Prerequisites
I tried to reproduce the issue when...
Description
There have been numerous occasions where Annoyance lists have blanked out websites because either
html
orbody
have contained values that have been matched by a generic cosmetic filter. This is a pretty common issue for Annoyance lists, though there have been a few occasions in normal adblocking as well.Usually these issues have been fixed by adding
:not(html)
or:not(body)
exception to the problematic generic filter.I was wondering if it would be reasonable to add a safeguard measure to uBO: to prevent it from applying cosmetic filters, that have a match in
html
orbody
in websites where an user is visiting. I think that neitherhtml
orbody
should ever be blocked as that will result in a blank website.One recent issue: https://github.com/easylist/easylist/pull/8431 -
https://webshop.elektroskandia.no/
was blanked out becausebody
in that website had a value of.consent-summary-shown
. It was matched by this generic GDPR banner filter:##.consent-summary-shown
. (It was later fixed by adding an exception:##.consent-summary-shown:not(body)
).But that wasn't the only case. In Fanboy's Annoyance list, there are currently:
:not(html)
exceptions:not(body)
exceptionsAdguard Annoyance:
:not(html)
exceptions:not(body)
exceptionsEasylist:
:not(html)
exceptions:not(body)
exceptionsI know these website blanking issues are mainly related to Annoyance lists that are not turned on by default in uBO, but they are still available and people use them. Not all issues get reported to filter list maintainers and there could be many unreported issues relating to these lists. Each
:not(html)
or:not(body)
exception that currently exists, are related to fixing blank websites.A specific URL where the issue occurs
https://webshop.elektroskandia.no/
(fixed now but this one is a recent case so I'll use it as a sample)Steps to Reproduce
##.consent-summary-shown
to custom filtershttps://webshop.elektroskandia.no/
Expected behavior
uBO would ignore this generic filter for this website, because it matches to the
body
element.Actual behavior
Website is blanked, due to a match to the
body
element.uBlock Origin version
1.37.3b13
Browser name and version
Firefox 91.0.1
Operating System and version
Windows 10