StevenBlack / hosts

🔒 Consolidating and extending hosts files from several well-curated sources. Optionally pick extensions for porn, social media, and other categories.
MIT License
26.45k stars 2.19k forks

Add following community edited porn blocking HOSTS file as a source #1807

Closed robertvargic closed 2 years ago

robertvargic commented 2 years ago

Here's the additional hosts file related to porn site blocking:

https://github.com/4skinSkywalker/Anti-Porn-HOSTS-File

welcome[bot] commented 2 years ago

Hello! Thank you for opening your first issue in this repo. It's people like you who make these host files better!

StevenBlack commented 2 years ago

Thank you for this Robert @robertvargic.

Using ghosts as a preliminary check.

Compared to our main list, which doesn't include porn, the 41,806 domains of this list intersect at a rate of 13%, which seems high.

$ ghosts --clip
----------------------------------------
Base hosts file summary:
----------------------------------------
Location: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
Domains: 93,179
Bytes: 2.8 MB
----------------------------------------
Compared hosts from clipboard summary:
----------------------------------------
Location: clipboard
Domains: 41,806
Bytes: 1.3 MB
Intersection: 5,530 domains

Comparing the suggested list with our porn list, its 41,806 domains intersect at a rate of 34%, which seems crazy low.

$ ghosts -m p --clip
----------------------------------------
Base hosts file summary:
----------------------------------------
Location: https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/porn/hosts
Domains: 133,610
Bytes: 4.0 MB
----------------------------------------
Compared hosts from clipboard summary:
----------------------------------------
Location: clipboard
Domains: 41,806
Bytes: 1.3 MB
Intersection: 14,428 domains
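
For reference, both rates are the reported intersection divided by the 41,806 domains of the suggested list: 5,530 / 41,806 is about 13%, and 14,428 / 41,806 is about 34.5%. Below is a minimal Python sketch of that kind of comparison. It is not how ghosts is implemented; `parse_hosts` and `intersection_rate` are hypothetical helper names, and the parsing assumes the common `0.0.0.0 domain` or `127.0.0.1 domain` line layout.

```python
def parse_hosts(text):
    """Return the set of host names listed in hosts-file text."""
    domains = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        # Assumption: the first field is the IP address, the rest are names.
        for name in fields[1:]:
            domains.add(name.lower())
    return domains


def intersection_rate(base, candidate):
    """Return (shared count, shared count / size of candidate)."""
    shared = base & candidate
    rate = len(shared) / len(candidate) if candidate else 0.0
    return len(shared), rate


# Tiny made-up inputs, only to show the mechanics.
base = parse_hosts("0.0.0.0 a.example.com\n0.0.0.0 b.example.com\n")
candidate = parse_hosts(
    "# comment\n127.0.0.1 b.example.com\n0.0.0.0 c.example.net\n"
)
count, rate = intersection_rate(base, candidate)
print(count, f"{rate:.0%}")
```

The rate here is taken relative to the candidate list, which matches the way the percentages above are quoted against the 41,806 suggested domains.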
StevenBlack commented 2 years ago

Robert @robertvargic here's the TLD breakdown of the suggested list. I often find glaring mistakes at the bottom of this list. Only a few errors are evident in this case.

$ ghosts -m https://raw.githubusercontent.com/4skinSkywalker/Anti-Porn-HOSTS-File/master/HOSTS.txt --tld
----------------------------------------
Base hosts file summary:
----------------------------------------
Location: https://raw.githubusercontent.com/4skinSkywalker/Anti-Porn-HOSTS-File/master/HOSTS.txt
Domains: 41,806
Bytes: 1.3 MB
TLD tally:  (256 unique TLD)
   com: 23,965
   net: 3,381
   pl: 1,731
   ru: 1,650
   de: 1,443
   org: 898
   info: 548
   me: 309
   tv: 308
   nl: 299
   uk: 287
   it: 284
   pro: 284
   biz: 273
   co: 271
   xxx: 270
   fr: 238
   cz: 228
   eu: 228
   br: 207
   us: 204

   ...

   pf: 2
   lc: 2
   store: 2
   pa: 2
   rodeo: 2
   bm: 2
   lgbt: 2
   center: 2
   tz: 2
   watch: 2
   services: 2
   hosting: 2
   atabout: 2
   monster: 2
   land: 2
   ms: 2
   faith: 2
   technology: 2
   kg: 2
   men: 2
   global: 2
   cr: 2
   pink: 2
   love: 2
   travel: 2
   ae: 2
   cf: 2
   ist: 2
   siteen: 2
   ag: 2
   media: 2
   ma: 2
   gallery: 2
   film: 2
   al: 2
   wine: 2
   vin: 2
   sx: 2
   lu: 2
   irish: 2
   work: 2
   racing: 2
   green: 2
   je: 2
   tt: 2
   gov: 2
   skceske-porno: 2
   cloud: 2
   ac: 2
   team: 2
   ke: 2
   tips: 2
   loan: 2
   az: 2
   trade: 2
   ooo: 1
   vet: 1
   fund: 1
   click: 1
   dev: 1
   pet: 1
   buzz: 1
   wales: 1
   credit: 1
   sale: 1
   cool: 1
   shop: 1
   adult: 1
   supply: 1
   bingo: 1
   fyi: 1
   press: 1
   bar: 1
   care: 1
   mm: 1
   gratis: 1
   pictures: 1
   gd: 1
   surf: 1
----------------------------------------
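
A rough Python sketch of a TLD tally like the one above, assuming the TLD is simply the last dot-separated label of each entry. `tld_tally` is a hypothetical name and this is not ghosts' own code. Sorting by ascending count surfaces the rare entries first, which is where malformed lines tend to show up: `atabout`, `siteen`, and `skceske-porno` in the output above do not look like real TLDs.

```python
from collections import Counter


def tld_tally(domains):
    """Count entries by their last dot-separated label."""
    tally = Counter()
    for domain in domains:
        label = domain.rstrip(".").rsplit(".", 1)[-1].lower()
        tally[label] += 1
    return tally


# Made-up sample; "broken.siteen" stands in for a malformed entry.
sample = ["a.example.com", "b.example.com", "c.example.pl", "broken.siteen"]
tally = tld_tally(sample)

# Rarest labels first, since those are the ones worth eyeballing.
for tld, count in sorted(tally.items(), key=lambda kv: kv[1]):
    print(f"{tld}: {count}")
```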
StevenBlack commented 2 years ago

Robert, now checking the root domain tally. This is a beta feature of ghosts.

Looks generally OK.

$ ghosts -m https://raw.githubusercontent.com/4skinSkywalker/Anti-Porn-HOSTS-File/master/HOSTS.txt --root

----------------------------------------
Base hosts file summary:
----------------------------------------
Location: https://raw.githubusercontent.com/4skinSkywalker/Anti-Porn-HOSTS-File/master/HOSTS.txt
Domains: 41,806
Bytes: 1.3 MB
Root domain tally:  (18,138 unique root domais)
   tumblr.com: 1,675
   2o7.net: 832
   blogspot.com: 324
   gemius.pl: 292
   sms13.de: 272
   co.uk: 267
   cnzz.com: 178
   com.br: 177
   openx.net: 152
   omtrdc.net: 126
   intellitxt.com: 122
   adocean.pl: 102
   adtech.us: 100
   com.au: 90
   com.ua: 86
   popunder.ru: 80
   247realmedia.com: 76
   hotlog.ru: 76
   msn.com: 72
   hemnes.win: 68
   co.il: 68
   xhamster.com: 63
   com.pl: 62
   datasecu.download: 60
   hut1.ru: 60
   adultsites.co: 50
   bravenet.com: 46
   com.ar: 45
   co.za: 44
   bongacams.com: 42
   spylog.com: 40
   crypto-webminer.com: 40
   ajplugins.com: 40
   csgocpu.com: 38
   chaturbate.com: 36
   spankbang.com: 35
   wordpress.com: 32
   cpufan.club: 32
   co.id: 28
   youporn.com: 28
   crypto-loot.com: 28
   be.tc: 28
   pornhub.com: 27
   qii.pl: 26
   com.tr: 26
   5v.pl: 26
   hostami.me: 24
   freebitco.in: 24
   porn.com: 24
   skokka.com: 24
   txxx.com: 22
   com.es: 21
   blueseek.com: 20
   tourstogo.us: 20
   imrworldwide.com: 20
   host.sk: 20

....
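
Entries such as `co.uk`, `com.br`, and `com.au` appear above as root domains, which suggests the tally groups on the last two labels instead of consulting the Public Suffix List. That is an inference from the output, not a statement about ghosts' code. A minimal Python sketch under that assumption, with `root_tally` as a hypothetical name:

```python
from collections import Counter


def root_tally(domains):
    """Count entries by their last two labels (a naive root domain)."""
    tally = Counter()
    for domain in domains:
        labels = domain.rstrip(".").lower().split(".")
        tally[".".join(labels[-2:])] += 1
    return tally


# Made-up sample; note that the co.uk entry collapses to "co.uk".
sample = ["a.tumblr.com", "b.tumblr.com", "shop.example.co.uk", "tumblr.com"]
roots = root_tally(sample)
for root, count in roots.most_common():
    print(f"{root}: {count}")
```

With this grouping, every domain under a two-level public suffix lands in a single bucket, so counts for `co.uk` and similar entries say little about any one site.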
StevenBlack commented 2 years ago

Robert @robertvargic same thing, compared to the top of our porn file (which includes our main list).

Seems to concord.

$ ghosts -m p --root

----------------------------------------
Base hosts file summary:
----------------------------------------
Location: https://raw.githubusercontent.com/StevenBlack/hosts/master/alternates/porn/hosts
Domains: 133,610
Bytes: 4.0 MB
Root domain tally:  (58,468 unique root domais)
   blogspot.com: 9,187
   cloudfront.net: 3,434
   tumblr.com: 1,834
   2o7.net: 1,383
   ahcdn.com: 1,330
   intellitxt.com: 794
   co.uk: 698
   actonservice.com: 586
   co.jp: 433
   ero-advertising.com: 404
   omtrdc.net: 397
   com.pl: 376
   hitbox.com: 362
   com.au: 350
   openx.net: 285
   com.br: 274
   waw.pl: 262
   nazwa.pl: 219
   2mdn.net: 215
   p2l.info: 198
   doubleclick.net: 179
   dotomi.com: 170
   daraz.com: 158
   edgekey.net: 149
   booru.org: 147
   mtree.com: 143
   marketo.com: 132
   oewabox.at: 125
   preview-domain.com: 122

...
dnmTX commented 2 years ago

Check this out 😄 HOSTS txt

That's porn alright 😉

StevenBlack commented 2 years ago

Thank you for the suggestion Robert @robertvargic, but I decline.

Closing.