StevenBlack / hosts

🔒 Consolidating and extending hosts files from several well-curated sources. Optionally pick extensions for porn, social media, and other categories.
MIT License

New source: Anti popads.net #283

Closed betterwebleon closed 2 years ago

betterwebleon commented 7 years ago

Hello, I've found a new data source on the website FilterLists, where it scores highly.

It is called "Anti popads.net". Raw file is here. Currently it has more than 970 entries and is updated frequently. Description: "Blocks shady, annoying pop-under ads from the infamous PopAds ad network."

Could you add it among sources for your unified hosts file?

StevenBlack commented 7 years ago

@betterwebleon sorry for the delay. Hey thanks for pointing me to FilterLists; that's cool! I like how they present the sources and their contact facets.

I'm currently traveling. I plan to revisit this very soon. Thanks for the suggestion.

betterwebleon commented 7 years ago

No problem, there's no hurry. Thank you for considering it!

FilterLists is indeed a comprehensive and well-structured website. I found it just recently, thanks to @gorhill, via his wiki page about custom filter lists for uBlock Origin (his content blocker).

Veticia commented 7 years ago

Hi, Yhonay here. I've added a list in the form of a hosts file to my git repo. I don't know how useful it will be, seeing how often popads.net adds new domains to their repertoire, but if you find a use for it then here it is.

StevenBlack commented 7 years ago

Hi @Yhonay! How are you sourcing the domains in antipopads? What's your process to find these?

Veticia commented 7 years ago

Someone linked me to a list of a few thousand sites using popads.net. I pick a few dozen at random, unobfuscate everything between and try to download any .js file I find. Then I check whether it matches a few keywords found in a standard popads.net .js file. If it matches, the domain that the .js was downloaded from is added to the list.

I know it's a little naive but so far it works.
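
The keyword check described above could be sketched roughly as follows. This is only an illustration, not the script actually used to build the list: the keyword tuple, the `min_hits` threshold, and the function names `looks_like_popads` and `collect_domains` are all hypothetical, and the download and unobfuscation steps are left out.

```python
from typing import Dict, List, Sequence

# Hypothetical keywords; the real ones are not stated in this thread.
KEYWORDS = ("popunder", "popads", "adblock")


def looks_like_popads(js_source: str,
                      keywords: Sequence[str] = KEYWORDS,
                      min_hits: int = 2) -> bool:
    """Return True when at least min_hits keywords occur in the script text."""
    text = js_source.lower()
    hits = sum(1 for kw in keywords if kw in text)
    return hits >= min_hits


def collect_domains(scripts: Dict[str, str]) -> List[str]:
    """Given {domain: downloaded script text}, keep the domains that match."""
    return sorted(d for d, src in scripts.items() if looks_like_popads(src))
```

A plain substring match like this is cheap but can produce false positives, which is one reason to require more than one keyword hit.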

FadeMind commented 6 years ago

@StevenBlack @betterwebleon https://github.com/FadeMind/hosts.extras/commit/6ad79e7e2de999bea0eb26bf9bf69115e074c1d2

dnmTX commented 6 years ago

Finally something for them popunders. BIG THANKS FROM ME TOO !!!!

XhmikosR commented 4 years ago

Has there been any progress on this? I see the original source is still active and going 🙂

MrRawes commented 2 years ago

it seems like "Anti popads.net" is a dead list

bigdargon commented 2 years ago

New repo anti-popads https://github.com/AdroitAdorKhan/antipopads-re

FadeMind commented 2 years ago

@StevenBlack PR awaits.

CC @AdroitAdorKhan

StevenBlack commented 2 years ago

Using ghosts to evaluate: this would add (27,306 - 794) = 26,512 domains to our base list, a relative increase of 26,512 / 110,297 = 24%.

ghosts -c https://raw.githubusercontent.com/AdroitAdorKhan/antipopads-re/master/formats/hosts.txt
----------------------------------------
Base hosts file summary:
----------------------------------------
Location: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
Domains: 110,297
Bytes: 3.5 MB
----------------------------------------
Compared hosts file summary:
----------------------------------------
Location: https://raw.githubusercontent.com/AdroitAdorKhan/antipopads-re/master/formats/hosts.txt
Domains: 27,306
Bytes: 690 kB
Intersection: 794 domains
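
For readers who want to reproduce the comparison by hand, here is a minimal sketch of the same arithmetic. It is not how ghosts is implemented; `parse_hosts` and `compare` are hypothetical helpers, and the parser assumes the usual `<ip> <domain>` hosts layout without filtering entries such as `localhost`.

```python
from typing import Dict, Set


def parse_hosts(text: str) -> Set[str]:
    """Extract domain names from hosts-file text, skipping comments and blanks."""
    domains: Set[str] = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        parts = line.split()
        # A hosts entry is "<ip> <name> [more names...]"; keep every name after the IP.
        domains.update(p.lower() for p in parts[1:])
    return domains


def compare(base: Set[str], other: Set[str]) -> Dict[str, float]:
    """Summarize how much of `other` is new relative to `base`."""
    overlap = base & other
    new = other - base
    return {
        "intersection": len(overlap),
        "new_domains": len(new),
        "relative_increase": len(new) / len(base),
        "overlap_share": len(overlap) / len(other),
    }


# The figures reported in this thread, plugged in directly:
base_count, other_count, intersection = 110_297, 27_306, 794
added = other_count - intersection
print(f"added: {added:,}")                               # added: 26,512
print(f"relative increase: {added / base_count:.0%}")    # relative increase: 24%
print(f"overlap share: {intersection / other_count:.0%}")  # overlap share: 3%
```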

Looking at TLD:

ghosts -tld -m https://raw.githubusercontent.com/AdroitAdorKhan/antipopads-re/master/formats/hosts.txt
----------------------------------------
Base hosts file summary:
----------------------------------------
Location: https://raw.githubusercontent.com/AdroitAdorKhan/antipopads-re/master/formats/hosts.txt
Domains: 27,306
Bytes: 690 kB
TLD tally:  (130 unique TLD)
   com: 18,160
   top: 2,937
   xyz: 2,528
   net: 1,072
   club: 303
   info: 271
   pro: 261
   biz: 161
   site: 140
   cam: 133
   fun: 116
   me: 107
   casa: 101
   co: 96
   online: 86
   ru: 83
   life: 76
   org: 72
   pw: 36
   am: 34
   icu: 33
   one: 31
   io: 30
   space: 30
   website: 26
   name: 22
   work: 21
   gq: 16
   de: 16
   uk: 15
   link: 14
   click: 12
   tech: 11
   media: 11
   mobi: 11
   bid: 11
   cc: 11
   in: 9
   live: 9
   eu: 9
   network: 8
   tk: 7
   ml: 7
   us: 6
   nl: 6
   ga: 5
   rocks: 5
   cf: 5
   tv: 5
   best: 4
   app: 4
   ai: 4
   su: 3
   fr: 3
   to: 3
   win: 3
   cyou: 3
   sh: 3
   world: 3
   today: 3
   pl: 3
   digital: 2
   lv: 2
   ad: 2
   dev: 2
   bar: 2
   delivery: 2
   vip: 2
   house: 2
   vn: 2
   ph: 2
   sbs: 2
   au: 2
   dk: 2
   cloud: 2
   codes: 2
   by: 2
   trade: 2
   pk: 2
   za: 2
   jp: 2
   webcam: 2
   ca: 2
   wtf: 2
   my: 1
   nz: 1
   news: 1
   center: 1
   ag: 1
   agency: 1
   loan: 1
   ooo: 1
   st: 1
   ae: 1
   fm: 1
   gg: 1
   review: 1
   cfd: 1
   ws: 1
   zone: 1
   stream: 1
   zw: 1
   buzz: 1
   gt: 1
   supply: 1
   kr: 1
   se: 1
   services: 1
   download: 1
   support: 1
   il: 1
   systems: 1
   email: 1
   gs: 1
   ovh: 1
   land: 1
   global: 1
   at: 1
   uno: 1
   red: 1
   cn: 1
   glass: 1
   it: 1
   racing: 1
   is: 1
   men: 1
   marketing: 1
   re: 1
   ie: 1
   mx: 1
----------------------------------------
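
A tally like the one above can be approximated by counting the last dot-separated label of each domain. This is a rough sketch rather than the ghosts implementation, and `tld_tally` is a hypothetical name.

```python
from collections import Counter
from typing import Iterable


def tld_tally(domains: Iterable[str]) -> Counter:
    """Count domains by their last dot-separated label (for example 'com')."""
    return Counter(
        d.rstrip(".").rsplit(".", 1)[-1].lower()
        for d in domains
        if d
    )


# Made-up example domains, printed most frequent first.
tally = tld_tally(["ads.example.com", "x.example.top", "y.example.com"])
for tld, count in tally.most_common():
    print(f"{tld}: {count:,}")
```

Counting only the last label means a name under `co.uk` is tallied as `uk`, which matches how the listing above reports `uk` as its own entry.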

StevenBlack commented 2 years ago

Here is the main reason I'm tending towards declining this.

Who can explain this? 🤷🏻

We have several curated sources that watch this stuff. Many eyes watching. Over a long period of time. Some of our source lists are extremely diligent.

Now we see this proposed source, where only 794 of the 27,306 domains in its collection overlap with our existing collection. That is, only (794 / 27,306) = 3% of its domains overlap with all our other curated sources combined.

I would expect, what, 20% to 100% overlap? Something in that range.

Very suspicious.

Can someone make the case why bloating our list by 24% makes it a better list, never mind a 24% better list?

bigdargon commented 2 years ago

I don't know how the repo manages to collect popup domains so well. My own approach is manual: I go to a website, a popup ad appears, I review the domain log and block it.

That means I have only collected a few domains, whereas this repo collects many, with automatic updates. I often use it in Pi-hole and in browser filters. Wonderful!

StevenBlack commented 2 years ago

Hi @bigdargon I understand.

I suppose I should clarify: I'm not hearing about popup issues from people using our base list.

Now you understand, if there were a lot of popup spam happening, I would hear about it. I'm not hearing about it.

So this 24% jack in our base list size feels like a solution in search of a problem? Does that make sense?

I'm not saying there's no popup problem, I'm saying I'm not hearing about popup problems. I'm not hearing demand from userland, nor from anywhere else.

StevenBlack commented 2 years ago

I've decided to decline adding this list.

Closing.