hagezi / dns-blocklists

DNS-Blocklists: For a better internet - keep the internet clean!
GNU General Public License v3.0

TIF Medium and Light? #2267

Closed by RicoHeat 4 months ago

RicoHeat commented 4 months ago

Is it gone forever? The full version of TIF is way too big. Are there any plans to make available the TIF versions used for Multi Light and Multi Normal?

hagezi commented 4 months ago

They were actually only used for the build. But because some AdBlockers have problems with the TIF because it's too big, I'm trying to put together a version in an acceptable size. What is an acceptable size for you?

hagezi commented 4 months ago

@RicoHeat I have uploaded the updated TIF medium. It is compiled from the most important sources and contains ~150000 rules. At the moment I see no reason to provide a smaller (light) list.

https://github.com/hagezi/dns-blocklists/commit/1e551551fb4b4a588f5c0acd8adbd79f459fe249

RicoHeat commented 4 months ago

Sounds good. Thanks, my friend. Now, according to your ReadMe file, the Multi Light version also has TIF partially integrated. How do you determine which domains to include in the compressed version, which currently stands at around 68967?

iam-py-test commented 4 months ago

Out of curiosity, what sources are included in the "medium" version? Thanks

hagezi commented 4 months ago

Base Sources:

 Nr |   Count | Format  | Source | Status  | File      | URL/File
   1 |     230 | domains | local  | online  | changed   | deny_tif.txt
   2 |   28988 | hosts   | http   | online  | unchanged | https://malware-filter.gitlab.io/malware-filter/phishing-filter-hosts.txt
   3 |  166591 | domains | http   | online  | changed   | https://phishing.army/download/phishing_army_blocklist.txt
   4 |    4997 | hosts   | http   | online  | unchanged | https://raw.githubusercontent.com/durablenapkin/scamblocklist/master/hosts.txt
   5 |    3920 | adblock | http   | online  | unchanged | https://raw.githubusercontent.com/jarelllama/Scam-Blocklist/main/lists/adblock/scams.txt
   6 |   42968 | domains | http   | online  | changed   | https://raw.githubusercontent.com/elliotwutingfeng/GlobalAntiScamOrg-blocklist/main/global-anti-scam-org-scam-urls-pihole.txt
   7 |     409 | hosts   | http   | online  | unchanged | https://raw.githubusercontent.com/hoshsadiq/adblock-nocoin-list/master/hosts.txt
   8 |    3771 | domains | local  | online  | unchanged | malware.base.domains.txt
   9 |    3394 | hosts   | http   | online  | changed   | https://malware-filter.gitlab.io/malware-filter/urlhaus-filter-hosts.txt
  10 |   16842 | hosts   | http   | online  | unchanged | https://malware-filter.gitlab.io/malware-filter/vn-badsite-filter-hosts.txt
  11 |   28495 | hosts   | http   | online  | changed   | https://threatfox.abuse.ch/downloads/hostfile
  12 |     246 | hosts   | http   | online  | changed   | https://urlhaus.abuse.ch/downloads/hostfile
  13 |   21394 | adblock | http   | online  | unchanged | https://raw.githubusercontent.com/DandelionSprout/adfilt/master/Alternate%20versions%20Anti-Malware%20List/AntiMalwareAdGuardHome.txt
  14 |   21240 | domains | http   | online  | unchanged | https://dl.red.flag.domains/red.flag.domains.txt
  15 |    6468 | domains | http   | online  | changed   | https://raw.githubusercontent.com/elliotwutingfeng/Inversion-DNSBL-Blocklists/main/Google_hostnames_light.txt
  16 |   18040 | domains | http   | online  | unchanged | https://zonefiles.io/f/compromised/domains/live/
  17 |    2683 | adblock | http   | online  | unchanged | https://raw.githubusercontent.com/uBlockOrigin/uAssets/master/filters/badware.txt
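
The sources above arrive in three formats (hosts, domains, adblock), so any build has to bring them into one form before merging. The following is a minimal sketch of that step, not the actual build script used for these lists: the names `parse_line` and `merge_sources` are hypothetical, the hostname check is deliberately loose, and only plain `||domain^` adblock rules are handled.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Loose hostname check; an assumption for this sketch, not a full validator.
_DOMAIN_RE = re.compile(
    r"^(?:[a-z0-9_](?:[a-z0-9_-]{0,61}[a-z0-9_])?\.)+[a-z0-9-]{2,63}$"
)
# Plain adblock block rule such as "||example.com^" (no modifiers).
_ADBLOCK_RE = re.compile(r"\|\|([^\^/$*]+)\^")


def parse_line(line: str, fmt: str) -> Optional[str]:
    """Return the domain on one source line, or None if the line has no usable rule."""
    line = line.strip()
    # Skip blanks, hosts/domains comments (#), adblock comments (!) and headers ([).
    if not line or line.startswith(("#", "!", "[")):
        return None
    if fmt == "hosts":
        # "0.0.0.0 example.com  # optional comment"
        parts = line.split("#", 1)[0].split()
        domain = parts[1] if len(parts) >= 2 else ""
    elif fmt == "domains":
        domain = line.split("#", 1)[0].strip()
    elif fmt == "adblock":
        match = _ADBLOCK_RE.fullmatch(line)
        domain = match.group(1) if match else ""
    else:
        raise ValueError(f"unknown format: {fmt}")
    domain = domain.lower().rstrip(".")
    return domain if _DOMAIN_RE.match(domain) else None


def merge_sources(sources: Iterable[Tuple[str, str]]) -> List[str]:
    """Merge (format, text) pairs into a sorted, de-duplicated domain list."""
    domains = set()
    for fmt, text in sources:
        for line in text.splitlines():
            domain = parse_line(line, fmt)
            if domain:
                domains.add(domain)
    return sorted(domains)
```

Calling `merge_sources([("hosts", hosts_text), ("domains", domains_text)])` on downloaded list contents would give one combined list. Exception rules (`@@...`) and rules with modifiers are skipped here rather than interpreted.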

iam-py-test commented 4 months ago

Thanks

hagezi commented 4 months ago

@iam-py-test Any other recommendations? My goal was that the sum of Ultimate and TIF medium does not exceed ~500000 rules, so that this combination can also be used with AdGuard for iOS. Due to iOS RAM limitations, the limit there for the sum of all rules is ~535000.
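
The budget described here can be checked with a few lines of Python. This is only a sketch: `count_rules` and `budget_report` are hypothetical helpers, the two constants are the approximate figures quoted in this comment, and the rule count used for Ultimate in the usage note is a made-up placeholder because the thread does not state it.

```python
from typing import Dict

TARGET_TOTAL = 500_000  # goal for Ultimate + TIF medium, as stated in the comment
IOS_LIMIT = 535_000     # approximate AdGuard iOS ceiling, as stated in the comment


def count_rules(list_text: str) -> int:
    """Count lines that look like rules: non-empty, not a '#'/'!' comment or '[' header."""
    return sum(
        1
        for line in list_text.splitlines()
        if line.strip() and not line.lstrip().startswith(("#", "!", "["))
    )


def budget_report(rule_counts: Dict[str, int]) -> Dict[str, object]:
    """Sum the per-list rule counts and compare the total with both thresholds."""
    total = sum(rule_counts.values())
    return {
        "total": total,
        "within_target": total <= TARGET_TOTAL,
        "within_ios_limit": total <= IOS_LIMIT,
        "headroom": IOS_LIMIT - total,
    }
```

For example, `budget_report({"ultimate": 340_000, "tif_medium": 150_000})` reports a total of 490000 with 45000 rules of headroom; the 340000 is purely illustrative. The sketch assumes the app simply adds up the rules of all enabled lists.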

hagezi commented 4 months ago

Now, according to your ReadMe file, the Multi Light version also has TIF partially integrated. How do you determine which domains to include in the compressed version, which currently stands at around 68967?

It is relatively simple: the smaller the list (the lower the level), the fewer TIF domains are included. To keep the normal lists small, I have separated out the TIF, especially as some TIF sources show incredible jumps in size between updates.

Furthermore, malware and phishing domains are usually only active for a few days, so they would just clutter up the normal lists. "Partially" means that you will still find some TIF domains in the normal lists.

hagezi commented 4 months ago

@RicoHeat

Version  | TIF domains | TIF medium domains
Light    |        9277 |               6862
Normal   |       15831 |              11832
Pro      |       34164 |              14441
Pro++    |       37933 |              17563
Ultimate |       71081 |              47100
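
As a rough worked example, the figures above can be combined with the ~150000 rules quoted earlier for TIF medium to estimate how many entries TIF medium would add that a given list does not already contain. This rests on two assumptions that the thread does not confirm: that each column counts the domains of that TIF variant already present in the list, and that one domain corresponds to about one rule. The name `overlap_summary` is hypothetical.

```python
from typing import Dict, Tuple

TIF_MEDIUM_RULES = 150_000  # "~150000 rules" from the thread, approximate

# version -> (TIF domains, TIF medium domains), copied from the table above
OVERLAP: Dict[str, Tuple[int, int]] = {
    "Light": (9277, 6862),
    "Normal": (15831, 11832),
    "Pro": (34164, 14441),
    "Pro++": (37933, 17563),
    "Ultimate": (71081, 47100),
}


def overlap_summary(version: str) -> Dict[str, float]:
    """Estimate what TIF medium adds on top of one list version."""
    tif, medium = OVERLAP[version]
    return {
        # Share of the list's TIF domains that also belong to TIF medium.
        "medium_share": round(medium / tif, 3),
        # Approximate TIF medium entries not yet present in the list.
        "new_from_medium": TIF_MEDIUM_RULES - medium,
    }
```

Under these assumptions, TIF medium would contribute roughly 143138 new entries on top of Light and roughly 102900 on top of Ultimate. Whether a blocker de-duplicates across lists is a separate question that this sketch does not answer.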

iam-py-test commented 4 months ago

Any other recommendations?

No, I think it's a good selection (not that my opinion means anything). Thanks.