DandelionSprout / adfilt

The place where I, DandelionSprout, store my web filter lists for countless topics, including my Nordic adblock list. As simple as that, really.

General filter chit-chat №2 #63

Closed DandelionSprout closed 1 year ago

DandelionSprout commented 4 years ago

Note, 13th of February 2023: Next de facto discussion place until further notice is at #779.

————————————

So today I learned that GitHub threads max out at 2,500 comments, despite nothing and no one ever telling me about that previously, let alone GitHub's help pages. So here's thread number 2!

This thread is a megathread about adblock discussions in general. Here one can request syntax help, reproduction confirmations, info about differences between adblockers, assistance with making new lists, and so on. They'll be answered or considered by the biggest Adfilt contributors, and occasionally by members of the uBlock Origin development team (although in an unofficial fashion). (This header section was last updated on the 24th of April 2019 by DandelionSprout.)

Same non-obligatory rules apply as in the previous thread (Rules can be amended by Dandelion Sprout on very short notice, but this happens pretty rarely):

THEtomaso commented 4 years ago

today I learned that GitHub threads max out at 2,500 comments

How are we expected to resolve any issues, with such a low limit!? :)

krystian3w commented 4 years ago

GitHub is crashing. I'm getting many 500 errors when updating files in the online editor or sending comments:

"You can't comment at this time."

[image]
[image]
THEtomaso commented 4 years ago

@DandelionSprout:

Ad labels between feed entries: https://sol.no/studio/sol-livestudio/388/

Possible fix: sol.no##[class^="AdPrefix"]

Yuki2718 commented 4 years ago

@THEtomaso Just a personal opinion about the matter you noted in the previous thread. Using a social filter does not necessarily mean you want to block everything social. I don't use Facebook and I subscribe to AdGuard Social, but I hope none of my filters ever interfere with news articles about Facebook, which may or may not include links to FB. Unfortunately for me, that was the case with Fanboy Social. I don't report such one-time false positives, but they're still annoying. I rather wonder why so many filters, not limited to social ones, prefer generic rules and don't make specific rules one by one. That way you can avoid most false positives, and I believe it's more efficient in terms of performance unless the number of rules grows too large. I don't expect any filter to be perfect at blocking and kinda accept some false negatives, but a single false positive is annoying enough, even though I can fix them. I maintain my own private filters which currently include about 2,000 rules each for PC & mobile. Most of them are specific rules, and I make a rule generic only after I've confirmed that the pattern is found on many sites (usually 3-5+ sites). AdGuard also tends to use more specific rules compared to EasyList/Fanboy and has even replaced some generic rules in EL with specific rules, which I think makes sense given that most AG users are noobs. Well, one thing I'm not so happy about is that AG filters tend to rely too much on cosmetic filters, in which I don't see much value.

THEtomaso commented 4 years ago

I maintain my own private filters which currently include about 2,000 rules each

Getting close to 19,000 rules in my own filter, and particularly the social media entries (site-specific ones) have gotten way out of hand!

THEtomaso commented 4 years ago

@DandelionSprout:

chilimobil.no and prisjakt.no ads + empty ad boxes with labels:

https://www.tek.no/
https://www.tek.no/nyheter/nyhet/i/OpJ7nw/

Possible fix:

tek.no#?#.lp_article_content > div:-abp-contains(annonse)
tek.no#?#section > div > div:-abp-contains(annonse)
tek.no#?##application > div > main > div > div:-abp-contains(annonse)
THEtomaso commented 4 years ago

https://github.com/DandelionSprout/adfilt/commit/894b91522be99383146e5b3e43d125a9cfba0df9

The crap remains!:

filterissue-tek-1

filterissue-tek-2

filterissue-tek-3

DandelionSprout commented 4 years ago

I missed out on the Prisjakt ad the first time around. I don't consider the phone subscription comparison to be an ad at the time of writing, although I feel they could've shared more info about the comparison. The bottom ad placeholder will be fixed in Pale Moon once the :nth-ancestor fix goes live.

THEtomaso commented 4 years ago

Also, you missed the ad label in my last screenshot!

EDIT: ..and there's more Prisjakt stuff here!: https://www.tek.no/produkter/19266/

krystian3w commented 4 years ago

||i.imgur.com/eQ15Dp8.png$domain=bindingofisaac.fandom.com

DandelionSprout commented 4 years ago

(To THEtomaso) The ad label is removed by tek.no##div[id$=_body_ad]:nth-ancestor(1). I have a personal policy of not adding legacy-version-only fixes to my lists, so if you're unable to wait for uBO1.16.4.19, you can add tek.no##div[id$=_body_ad]:xpath(..) in the meantime.

I'll add an entry for the Prisjakt carousel in that link. The price tag of the page's main keyboard won't be removed, as it's convenient to know how much a product costs in Norway.

(To krystian3w) Presuming this was meant for the Anti-'Custom cursors' List, I thank you greatly for that tip. I'll add an entry for it pretty soon.

THEtomaso commented 4 years ago

if you're unable to wait for uBO1.16.4.19, you can add tek.no##div[id$=_body_ad]:xpath(..) in the meantime.

No need. The rules in my initial report take care of everything.

DandelionSprout commented 4 years ago

I suppose it does, now that I tested them out... Give me some seconds.

Yuki2718 commented 4 years ago

I maintain my own private filters which currently include about 2,000 rules each

Getting close to 19,000 rules in my own filter, and particularly the social media entries (site-specific ones) have gotten way out of hand!

It may be time for you to build your empire ;) I'm speaking from the users' side: particularly when it comes to social stuff, false negatives are not a big deal. I basically ignore those social buttons unless either I can block them all with 1 or 2 blocking rules alone (no cosmetic rule), or they're annoying enough (e.g. floating buttons). But one FP is worse than 20 FNs, at least for me. Maybe it's only me. Having looked at various filters, I see a tendency for newer filter maintainers to use more specific rules. I guess that's reasonable from the perspective of the current user base. Also, if one generic rule replaced 5 specific rules but required 4 exception rules to be added, it wouldn't make much sense.

DandelionSprout commented 4 years ago

I for one felt I had to create my own social list that solely removed sharing buttons, as I felt that both AdGuard's and Fanboy's social lists were far too broadly blocking.

And sadly, social media stuff is very difficult to handle with adblock lists. Even though I use very broad hiding rules in that list, around ¼ of all websites I come across still slip through it.

THEtomaso commented 4 years ago

@DandelionSprout:

Deblurrer for elbil24.no: elbil24.no##.CTA-body-faded

Example: https://www.elbil24.no/71205244/

DandelionSprout commented 4 years ago

PSA (although not a critical one): If you ever see any filterlists use $reload, it's probably a pretty fraudulent list, especially considering it's only believed to be supported by extremely questionable PR-Chinese browser extensions.

From what I could understand from https://translate.google.com/translate?sl=auto&tl=en&u=https%3A%2F%2Fwww.huorong.cn%2Finfo%2F1582284212427.html, the ChinaList 2.0 list was caught red-handed trying to earn affiliation rewards from extension users, and has now been emptied as a result of sheer shame.

Yuki2718 commented 4 years ago

PSA (although not a critical one): If you ever see any filterlists use $reload, it's probably a pretty fraudulent list, especially considering it's only believed to be supported by extremely questionable PR-Chinese browser extensions.

From what I could understand from https://translate.google.com/translate?sl=auto&tl=en&u=https%3A%2F%2Fwww.huorong.cn%2Finfo%2F1582284212427.html, the ChinaList 2.0 list was caught red-handed trying to earn affiliation rewards from extension users, and has now been emptied as a result of sheer shame.

Are you aware of any list that actually uses that strange modifier?

DandelionSprout commented 4 years ago

My personal but unclear understanding is that ChinaList 2.0 (Here's a Wayback Machine entry that includes the entries) was the only list to ever have used it, thankfully.

But although I look through many lists frequently as a hobby, who knows if there's something else out there that has or will manage to slide by behind our backs.

krystian3w commented 4 years ago

Only modified uBO / AG / AdBlock (or ABP)?

DandelionSprout commented 4 years ago

Seems like it.

Apparently it was an extension by the name of 广告净化器 (Its official English name, if any, is hard to figure out) that had already managed to get blacklisted by QQ Browser's add-on store as early as 2015, but which still kept on existing and such.

Yuki2718 commented 4 years ago

Seems like it.

Apparently it was an extension by the name of 广告净化器 (Its official English name, if any, is hard to figure out) that had already managed to get blacklisted by QQ Browser's add-on store as early as 2015, but which still kept on existing and such.

Yeah, "Ad purifier" is obviously a mistranslation. It actually means, in my translation, "Ad cleaner" or "Ad clearer". EDIT: whoops, "Ad Clearner" was already in the article, as was "Ad purifier".

DandelionSprout commented 4 years ago

I can't seem to find any info on $redirect-rule, as seen in various recently added uBlock Filters entries. How exactly does it differ from $redirect?

gwarser commented 4 years ago

redirect-rule transforms classic blocking filters into redirect filters - https://github.com/uBlockOrigin/uBlock-issues/wiki/Static-filter-syntax#redirect-rule, https://github.com/uBlockOrigin/uBlock-issues/issues/310

krystian3w commented 4 years ago

The filter can complement the blocking done by EasyList / EasyPrivacy with a redirect, without tracking, since EasyList / EasyPrivacy already cuts out the advertisement / script.
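A rough way to picture the difference, as far as the linked uBO wiki section describes it: `$redirect` blocks and redirects on its own, while `$redirect-rule` only adds the redirect when some other filter (for example from EasyList) already blocks the request. The Python sketch below is only a toy model of that decision. The `resolve` function and its parameters are made up for illustration and are not part of uBO.

```python
def resolve(option: str, blocked_by_other_filter: bool) -> str:
    """Toy model of what happens to one request that matches the filter.

    `option` is either "redirect" or "redirect-rule" (hypothetical inputs).
    Returns "redirected" or "allowed".
    """
    if option == "redirect":
        # Acts as a blocking filter and a redirect at the same time.
        return "redirected"
    if option == "redirect-rule":
        # Redirect directive only: it needs another filter to do the blocking.
        return "redirected" if blocked_by_other_filter else "allowed"
    raise ValueError(f"unknown option: {option}")
```

So in this model, a `redirect-rule` entry does nothing by itself for users who don't subscribe to a list that blocks the request.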

Yuki2718 commented 4 years ago

Sorry for a silly question, but this explanation of the new remove() syntax, "it must only be used as a trailing operator", is beyond my English skills (I looked it up in dictionaries and searched the Internet) and confuses me. Does it just mean it should not be used unless necessary, or?

DandelionSprout commented 4 years ago

The uBO wiki tends to be worded in pretty difficult terms, and the AdGuard syntax guide is only slightly easier to understand.

Given the context of https://github.com/gorhill/uBlock/wiki/Procedural-cosmetic-filters#subjectremove, my understanding is that it must be the last : value to be used in an entry. So the plain meaning of it, would be that you can do example.org##a:after:remove(.element), but you can not do example.org##a:remove(.element):after.

krystian3w commented 4 years ago

bad...

In the example, it is clear that we do not paste anything into parentheses (brackets).

AG syntax pretending to be CSS:

gorhill.github.io#$?##pcf #a18 .fail { remove: true; }

there's no HTML "selector" in the braces.

DandelionSprout commented 4 years ago

Seems like I'll have to test out :remove() in actual use later tonight, then. So we get to figure out how it actually works.

Yuki2718 commented 4 years ago

Given the context of https://github.com/gorhill/uBlock/wiki/Procedural-cosmetic-filters#subjectremove, my understanding is that it must be the last : value to be used in an entry. So the plain meaning of it, would be that you can do example.org##a:after:remove(.element), but you can not do example.org##a:remove(.element):after.

Hmmm, that makes sense, ty! Waiting for your test ; )

bad...

In the example, it is clear that we do not paste anything into parentheses (brackets).

AG syntax pretending to be CSS:

gorhill.github.io#$?##pcf #a18 .fail { remove: true; }

there's no HTML "selector" in the braces.

Are those space-separated selectors just the usual CSS selectors for descendants (i.e. similar to ">" but not limited to direct children)? And too bad, the AdGuard KB doesn't state anything about the syntax, even though it has already been used in their filters.

gwarser commented 4 years ago

my understanding is that it must be the last : value to be used in an entry. So the plain meaning of it, would be that you can do example.org##a:after:remove(), but you can not do example.org##a:remove():after.

I slightly corrected it, but right (no selectors in parentheses)

In the example, it is clear that we do not paste anything into parentheses (brackets).

Right.

Are those space-separated selectors just the usual CSS selectors for descendants (i.e. similar to ">" but not limited to direct children)?

All "combinators" explained: https://developer.mozilla.org/en-US/docs/Learn/CSS/Building_blocks/Selectors/Combinators

> : direct child
(space) : child, grandchild, great-grandchild, etc.
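To make the child vs. descendant difference concrete, here is a minimal Python sketch on a toy tree. It is not how any adblocker or browser matches selectors. The `Node` class and both helper functions are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a DOM element (hypothetical, illustration only)."""
    tag: str
    children: list["Node"] = field(default_factory=list)


def child_matches(parent: Node, tag: str) -> list[Node]:
    """Emulates `parent > tag`: direct children only."""
    return [c for c in parent.children if c.tag == tag]


def descendant_matches(parent: Node, tag: str) -> list[Node]:
    """Emulates `parent tag` (space): children, grandchildren, and so on."""
    found = []
    for c in parent.children:
        if c.tag == tag:
            found.append(c)
        found.extend(descendant_matches(c, tag))
    return found


# Toy document: <main><div><div></div></div><p></p></main>
inner = Node("div")
outer = Node("div", [inner])
main = Node("main", [outer, Node("p")])
```

Here `child_matches(main, "div")` finds only the outer div, while `descendant_matches(main, "div")` finds both the outer and the nested one.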

gwarser commented 4 years ago

:remove() is like an instruction: it says what to do with the elements matched by the filter.

example.com##.foo:remove() - will remove all elements with foo class.

example.com##.foo, .bar, #baz:remove() - will remove elements with foo class, bar class and baz ID.
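The two points settled above (`:remove()` goes last, and nothing goes inside its parentheses) can be sketched as a rough string check in Python. This is only an approximation for illustration, not uBO's actual parser, and the function name is made up.

```python
import re


def remove_is_well_placed(filter_line: str) -> bool:
    """Rough check for `##` cosmetic filters that use `:remove()`.

    If `:remove(` appears, it must appear exactly once, have empty
    parentheses, and be the very last thing in the filter.
    String-level approximation only; hypothetical helper.
    """
    _, sep, selector = filter_line.partition("##")
    if not sep:
        return False  # not a plain `##` cosmetic filter
    if ":remove(" not in selector:
        return True  # nothing to check
    ends_with_empty_remove = re.search(r":remove\(\)$", selector) is not None
    return ends_with_empty_remove and selector.count(":remove(") == 1
```

With this check, `example.com##.foo:remove()` passes, while `example.org##a:remove():after` and `example.org##a:after:remove(.element)` both fail.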

gwarser commented 4 years ago

The uBO wiki tends to be worded in pretty difficult terms

If someone more technical had said this, I would have responded "go fix it" :smile:

(I have a mechanical education, I'm not very good at writing even in my native language, and I've never had a proper English lesson :smiley: )

Slightly better description in draft: https://github.com/uBlockOrigin/uBlock-issues/wiki/Procedural-cosmetic-filters#subjectremove

Yuki2718 commented 4 years ago

@gwarser thx for clearing things up. I somehow thought a completely different syntax might apply to remove() since it's called an operator, but your example of :hide() cleared up my confusion. Coincidentally, I've found that foreignaffairs.com actually uses remove() to self-delete ads if it detects an ad blocker: https://www.foreignaffairs.com/sites/default/files/js/js_WuaCj4HWa4O0f3e96aYGeMFCKjVyIWzlWk6k4cI4fIE.js

I can't believe you're not a native English speaker; the wiki is well-written overall. It's just that non-techies like me also rely on the wiki and sometimes get confused, but there are places like here and r/uBlockOrigin, and I'm very thankful to all you guys for the help.

gwarser commented 4 years ago

To be honest, most of the text is copied from commit messages, and I've only been doing updates for a few months. Most of the text was written by gorhill.


https://github.com/gwarser/uBlock-wiki-draft-/graphs/contributors

DandelionSprout commented 4 years ago

I've become aware of some good news in these trying times: MVPS HOSTS finally supports HTTPS after more than 20 years!

It's also time for me to admit that I didn't exactly get a good impression of its maintainer when I first contacted him about HTTPS in what I think was late 2018-ish, and in his replies he'd end almost every single sentence with ellipses. 😬

Though I learned today that he's recently been struggling with complete kidney failure and was (seemingly) successfully operated on for it, so I have to give him a lot of slack due to that. And of course also applaud the successful operation and such.

THEtomaso commented 4 years ago

@DandelionSprout:

Issue:

Paywall @ budstikka.no.

How to reproduce:

Try to read a couple of articles, and the paywall appears!

Possible fix (for the 'Browse websites without logging in' filter):

budstikka.no##.paywall
budstikka.no##body:style(overflow: auto !important;)
krystian3w commented 4 years ago

Or remove the node (legacy versions need the Nano Defender scriptlet):

budstikka.no##+js(nano-remove-elements-onready.js, .paywall)
budstikka.no##.paywall:remove()
budstikka.no#$?#.paywall { remove: true; }
DandelionSprout commented 4 years ago

I think Budstikka must've changed their article payment system dramatically in the 3 hours since you guys posted your suggestions; as they're now dividing their articles into perma-free and perma-paid articles as far as I can see. Even visiting 10 free articles doesn't prompt a paywall on them.

budstikka.no##.paywall works to remove the overlays on the perma-paid articles marked with +, but budstikka.no##body:style(overflow: auto !important;) has no effect on anything that I'm aware of.

krystian3w commented 4 years ago

budstikka.no##body:style(overflow: auto !important;) has no effect on anything that I'm aware of.

Don't flip monitor:

[image]

Or this is due to the lack of userCSS in Nano Adblocker for Chromium: https://github.com/NanoAdblocker/NanoCore/issues/243

DandelionSprout commented 4 years ago

Even when considering the exceptionally ridiculous image, and that I do indeed use my monitor in portrait mode, I simply don't think there's any hidden text in the perma-paid articles that could've been revealed with budstikka.no##body:style(overflow: auto !important;).

There could have been such text in limited-free articles, but I can't see that they're using limited-free articles on my end.

THEtomaso commented 4 years ago

I think Budstikka must've changed their article payment system dramatically in the 3 hours since you guys posted your suggestions

Yeah, earlier today, I was only able to read one free article, before their paywall popped up! Looks like they're back to normal now, allowing users to read an unlimited amount of free articles.

--

budstikka.no##body:style(overflow: auto !important;) has no effect on anything that I'm aware of.

It enables the scroll bar, after killing the paywall. It still works for + articles, although it's completely pointless, of course.

THEtomaso commented 4 years ago

Hope this isn't an example of webmasters testing things out, and thereby accidentally giving us a taste of things to come. Those types of subscription systems are something that I really don't want to see on Norwegian news sites!

krystian3w commented 4 years ago

Maybe it's a little better to use the CSS class instead of the unmodified body "node":

budstikka.no##.no-scroll:style(overflow: auto !important;)


budstikka.no##head:not(:has(> meta[property="lp:paywall"][content="hard"])) ~ body .paywall
budstikka.no##.no-scroll:style(overflow: auto !important;)
liamengland1 commented 4 years ago

Example: https://www.budstikka.no/e-18/stortinget-krever-e-18-losning-innen-paske-hele-planen-er-i-fare/574401!/ is perma-paid as distinguished by the <meta property="lp:paywall" content="hard"> tag in the source code.
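For anyone who wants to check a saved page for that tag outside the browser, a minimal Python sketch using only the standard library could look like this. The class and function names are made up for illustration, and it assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser


class PaywallMetaFinder(HTMLParser):
    """Records the `content` of <meta property="lp:paywall"> if present."""

    def __init__(self):
        super().__init__()
        self.paywall_type = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if attr_map.get("property") == "lp:paywall":
            self.paywall_type = attr_map.get("content")


def paywall_type(html: str):
    """Return e.g. "hard" for perma-paid pages, or None if the tag is absent."""
    finder = PaywallMetaFinder()
    finder.feed(html)
    return finder.paywall_type
```

A page whose head contains the tag quoted above would give `"hard"`, which matches what the `head:not(:has(> meta[property="lp:paywall"][content="hard"]))` rule is testing for.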

krystian3w commented 4 years ago

But the page has 15K pixels in "height".

liamengland1 commented 4 years ago

Bizarre: https://github.com/DandelionSprout/adfilt/issues/7#issuecomment-600838086

DandelionSprout commented 4 years ago

I think https://github.com/easylist/easylist/issues/5112 may be of interest as a possible ASAP hotfix addition to the uAssets Unbreak list, considering the problem entry in that issue report seems to be breaking all YT embeds whatsoever.

THEtomaso commented 4 years ago

@DandelionSprout:

Site: ny1.no

Issue: Ads

Fixes: ny1.no##.header-ainfo + ny1.no##[id^="supermag_ad-"] and/or ny1.no##.widget_supermag_ad

gwarser commented 4 years ago

Proposition from ryanbr:

"List author slack channel?"

Hey,

(not sure if this the correct way to bring this up)

Just a thought whether a Slack channel could be an option for list authors / friends of uBlock/ABP/other Adblock extensions. For easier interaction/collaboration between authors, this isn't replacing GitHub but helping collab between list authors.

Won't be a public slack channel, open only on invite.

It's just a thought.

https://github.com/uBlockOrigin/uBlock-issues/issues/956