A question

Dynasty-Dev commented 1 year ago

Hello, thank you for your lists. I am using them in AdGuard Home and they have been good.

I have a few questions:

1. How come IP addresses are stripped from the threat intelligence feed list? Some of the source blocklists contain IP addresses. Just wondering, because I have another malware blocklist for AdGuard Home that contains IPs, and AdGuard Home has this capability.
2. Can you add the NextDNS threat intelligence feeds to yours? They worked great when I was using NextDNS, so I’m wondering if you can add more sources from there too. https://raw.githubusercontent.com/nextdns/metadata/master/security/threat-intelligence-feeds.json
3. Can you create a commonly abused TLDs blocklist for Pi-hole, AdGuard Home, etc.? There are many sources, like the one from yokoffing's NextDNS config, and you already have other abuse blocklists like DDNS and badware.
4. How come you don’t add native device tracking to all of your blocklists?
5. How come there are other blocklists called “pro.extension”, for example? What are these, and can you share their sources/domains?
6. Can you add the full URLhaus malware filter instead of the online version?
7. How do you whitelist domains? Can you provide a source or reason? I see a bunch of weird domains being whitelisted that are included in a bunch of other lists, for example the zooplus.es domains.

thank you

hagezi commented 1 year ago

Hi

Thanks for your feedback. On your questions:

1. Since these are pure DNS domain lists, the IPs are not included. An extension to include the IPs, for formats that support this, is planned, but I have not started on it yet.
2. My TIF already includes all of the lists from the NextDNS threat intelligence feeds that are still maintained. The NextDNS TIF also includes many outdated lists that I have not taken over. They also still use other unlisted sources.
3. It already exists, but has not yet been linked in the readme. See:
   AdGuard: https://raw.githubusercontent.com/hagezi/dns-blocklists/main/adblock/spam-tlds.txt
   Pihole (Regex): https://raw.githubusercontent.com/hagezi/dns-data-collection/main/regex/most-abused-tld.txt
   I will include the missing ones from @yokoffing.
4. The native tracking lists are included in my personal list, and this list is included in all other lists.
5. The extension lists contain popular domains from the known sources that were compiled individually for each list version. To check the popularity I use the Cisco Umbrella 1M Toplist.
6. It is already included in the TIF. Because I also have to watch the size of the other lists, I only use the online version there.
7. My Whitelist / Whitelist referral, see also Referral Domains. To optimize the size of the lists, additional garbage is sorted out in the build process, such as the zooplus domains, see for example ignore zooplus. For the TIF list I use an extended list of top domains to sort out false positives and domains that have no place on a TIF list.
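To make the filtering described in these answers more concrete, here is a minimal Python sketch of the idea, not the actual build process. It assumes the raw feed is a plain list of entries, drops anything that parses as an IP address (since the lists are domain-only), and drops domains covered by an allowlist of top domains. The function names, the sample entries, and the rule that an allowlisted parent domain also covers its subdomains are all assumptions made for illustration.

```python
import ipaddress


def is_ip(entry: str) -> bool:
    """Return True if the entry parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(entry)
        return True
    except ValueError:
        return False


def is_allowlisted(domain: str, allowlist: set) -> bool:
    """Return True if the domain or one of its parent domains is allowlisted.

    Assumption: an allowlisted parent also covers its subdomains.
    """
    labels = domain.split(".")
    return any(".".join(labels[i:]) in allowlist for i in range(len(labels)))


def build_domain_list(raw_entries, allowlist):
    """Keep only domain entries that are not covered by the allowlist."""
    kept = set()
    for line in raw_entries:
        entry = line.strip().lower().rstrip(".")
        if not entry or entry.startswith("#"):
            continue  # skip blank lines and comments
        if is_ip(entry) or is_allowlisted(entry, allowlist):
            continue  # IPs do not belong in a domain list; allowlisted domains are dropped
        kept.add(entry)
    return sorted(kept)


# Hypothetical sample data using reserved example names and documentation IP ranges.
raw_feed = ["malware.example", "192.0.2.10", "2001:db8::1", "cdn.example.org"]
print(build_domain_list(raw_feed, {"example.org"}))
```

A real build would also have to handle other input formats (hosts files, adblock rules), which this sketch leaves out.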

All the raw data for generating my blocklists that I share can be found in this repository. This repository also contains data that is currently not used.

Kind regards, Gerd

Dynasty-Dev commented 1 year ago

Thank you so much for all of your detailed responses! Yesterday I forgot to ask one more question, though. How come all of your blocklists whitelist the same domains? For example, even your maximum protection list (Pro++) unblocks Google search ads, meaning DoubleClick. Another example is that it also allows Meta Graph domains. Perhaps it just means it contains more sources, meaning a higher chance of false positives. What I think would be great is if you made another, even stricter list (full coverage) and put a warning on it for users.

pictosun commented 1 year ago

As it looks like this thread is turning into a discussion thread, I have a question:

In your NextDNS section (https://github.com/hagezi/dns-blocklists#nextdns) you don't name the 'native' NextDNS lists.

Does that mean you do not recommend them, or were they just forgotten?

Dynasty-Dev commented 1 year ago

> As it looks like this thread is turning into a discussion thread, I have a question:
>
> In your NextDNS section (https://github.com/hagezi/dns-blocklists#nextdns) you don't name the 'native' NextDNS lists.
>
> Does that mean you do not recommend them, or were they just forgotten?

Honestly, it looks unmaintained (not updated in 3 years) and the entries are questionable. I’ve gotten a few false positives while using the native device tracking in NextDNS last year. https://github.com/nextdns/metadata/tree/master/privacy/native

Dynasty-Dev commented 1 year ago

Maybe @hagezi could make a pull request on the nextdns metadata repository and add his native tracking filters?

yokoffing commented 1 year ago

> Within your NextDNS you don't name the 'native' NextDNS List. Does it mean, that you do not recommend them? Or just forgotten?

Their list sources are covered by other filterlists he includes.
Their allowlisting is garbage and the devs are not responsive to allowlist requests.

yokoffing commented 1 year ago

> even your maximum protection list (pro++) unblock google search ads meaning double click. Another example is it also allows Meta graph domains

@Dynasty-Dev, depending on your platform, you could:

use OISD extra https://oisd.nl/downloadsXtra
use 1Hosts Pro or Xtra alongside hagezi lists

Also note that you will need an adblocker / content blocker for ads at some point. Cosmetic filters for ads especially. DNS can't do everything.

FWIW, because of breakage, I recommend users have stricter blocking at the browser-level and lighter blocking at the DNS level. So, for instance, you could run 1Hosts Pro or OISD Extra at browser-level and hagezi's Pro++ list at the DNS-level. Just depends on what platforms you're using and your threat model.

hagezi commented 1 year ago

> Thank you so much for all of your detailed responses! Though yesterday, I forgot to ask one more question. How come all of your blocklists whitelist the same domains, like for example even your maximum protection list (pro++) unblock google search ads meaning double click. Another example is it also allows Meta graph domains. Perhaps it just means it contains more sources meaning higher chance of a false positive. What I think would be great is if you made another list even more strict, (full coverage) and put a warning for users.

The build process allows me to intervene in the whitelisting process for each list, which I do: in Pro++, for example, googletagmanager.com is not whitelisted, while in the other lists it is.
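As a rough illustration of per-list whitelisting, the following Python sketch applies a shared allowlist and lets individual lists opt out of specific entries. It is only a guess at the general shape, not the real build process: the data structures, the function names, and the list key "pro" are hypothetical. Only the googletagmanager.com behavior for Pro++ and the whitelisting of www.googleadservices.com come from this thread.

```python
# Shared allowlist applied to every list by default (entries taken from this thread).
BASE_ALLOWLIST = {"googletagmanager.com", "www.googleadservices.com"}

# Hypothetical per-list exceptions: domains that are NOT allowlisted for that list.
ALLOWLIST_EXCEPTIONS = {
    "pro++": {"googletagmanager.com"},
}


def allowlist_for(list_name):
    """Return the effective allowlist for one list version."""
    return BASE_ALLOWLIST - ALLOWLIST_EXCEPTIONS.get(list_name, set())


def build_list(list_name, candidates):
    """Drop allowlisted domains from the candidate domains of one list."""
    allow = allowlist_for(list_name)
    return sorted(d for d in candidates if d not in allow)


# "ads.example.net" is a made-up placeholder domain.
candidates = ["ads.example.net", "googletagmanager.com", "www.googleadservices.com"]
print(build_list("pro", candidates))    # googletagmanager.com stays allowlisted
print(build_list("pro++", candidates))  # googletagmanager.com is blocked again
```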

Blocking referral domains makes no sense from my point of view as long as they do not distribute ads on pages. You only hinder the user: the first links in a Google search do not work, links from various emails no longer work, newsletter unsubscriptions are sometimes no longer possible, and more. By whitelisting www.googleadservices.com I allow the user to click the ad search results, nothing more. It is not used for displaying advertisements. The same applies to ad.doubleclick.net. These domains are usually only called when people click on such links. I cannot block the display of such ad links in the search via DNS, so you should always use a content blocker in addition to a DNS blocker to hide these ads.

You can make any of my lists more aggressive and do the whitelisting yourself by using the removed domains as a blocklist again, as mentioned here. If you want it a bit more aggressive, I would do it as recommended by @yokoffing: 1Hosts Pro in the original version or, if it has to be even stricter, 1Hosts Xtra in addition to Pro++. At the moment I don't plan to provide an "ultimate" version. There are enough alternatives that you can use in addition.
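For anyone who wants to try re-blocking the removed domains, a small Python sketch of the conversion step might look like this. It assumes the removed domains are available as plain text with one domain per line and `#` comments, and it emits adblock-style `||domain^` rules of the kind AdGuard Home accepts. The input format and the function name are assumptions, not a description of the actual files.

```python
def to_adblock_rules(removed_domains):
    """Turn plain domain lines into adblock-style blocking rules.

    Comments (anything after '#'), blank lines, and duplicates are skipped.
    """
    rules = []
    seen = set()
    for line in removed_domains:
        domain = line.split("#", 1)[0].strip().lower()
        if not domain or domain in seen:
            continue
        seen.add(domain)
        rules.append(f"||{domain}^")
    return rules


# Sample input built from domains mentioned in this thread.
removed = ["ad.doubleclick.net", "# referral domains", "www.googleadservices.com"]
for rule in to_adblock_rules(removed):
    print(rule)
```

The resulting rules can then be saved as a custom blocklist. Expect the breakage described above, such as ad links in search results no longer opening.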

hagezi commented 1 year ago

> As it looks like this thread is turning into a discussion thread, I have a question:
>
> In your NextDNS section (https://github.com/hagezi/dns-blocklists#nextdns) you don't name the 'native' NextDNS lists.
>
> Does that mean you do not recommend them, or were they just forgotten?

Welcome, I think @yokoffing has answered the question sufficiently.

hagezi commented 1 year ago

I am converting this issue into a discussion.

hagezi / dns-blocklists

A question #81