hl2guide / DNS-Sinkhole-Lists-A2

A DNS Sinkhole List for testing purposes. (not for use in production systems) - UUID: 0f90ca2c-4b0a-4fbe-b659-449ab30c4284
MIT License

(not an issue): Testing List - Shard #4 #1

Closed hl2guide closed 5 years ago

hl2guide commented 5 years ago

Discussion about testing Shard #2 list.

EstherMoellman commented 5 years ago

Hi Dean @hl2guide ! I have the results of the first test:

INTRODUCTION: Approximately 30 users, from 8 different countries (all over the world), participated in this test.

Unfortunately, a full real-life test wasn't possible, because for real-life use the testers needed common websites (social media etc.), and these were blocked by your list. This also cost many days of the test, because some users were waiting for an updated list with these websites unblocked before starting.

Another issue, in my opinion, was running the test without steps, without separating it into first an "ads" test, second a "trackers" test, etc. I blame myself for that, because this was my original intention from the very beginning, but I wasn't clear enough and I couldn't explain my plan to you in the right way. I mention this in case you are interested in future tests: in my opinion, it will be better to have separate lists (ads, trackers, social media, mining, whatever) and to test list by list. This will also be very useful for users (like me) who like to choose lists (sometimes they need just a tracker-blocking list, sometimes just an ad-blocking list, etc.; not all users want all lists at the same time). It will be a plus if users can choose the blocking lists they want.

PS: Personally I don't think you need to build all the lists at the same time. In my very personal opinion, the tracker-blocking list is the most important one. The second is the ad-blocking list. All the other lists will be a plus, but I don't consider them a priority.

The test was focused on ads and trackers (not malware, miners etc).

TEST RESULT - YOUR LIST: Considering the small size of your list, it is a powerful list. It succeeded in blocking a lot of garbage. However:

ABOUT "TECHNITIUM SOFTWARE / DNSSERVER": The browser speed was good. RAM consumption was about 500MB. Very easy to install and use.

ABOUT "DNSCrypt": We achieved the same browser speed with DNSCrypt, using just about 25MB of RAM... unbeatable! Yeah, I know you didn't have a good experience using DNSCrypt. But I must insist: in my opinion this is the tool you are looking for. It requires you to read the Wiki first. It is a long Wiki (boooring LOL), but believe me, it is worth reading. We did that; it took time, but it wasn't difficult to understand, and we consider ourselves average users.

In brief, DNSCrypt has almost everything DnsServer has, but DNSCrypt is much more efficient in terms of performance, and also has some functions DnsServer doesn't have. DNSCrypt is also fully customizable. But I grant you, DNSCrypt doesn't have the friendly UI DnsServer has. This is the only point where DnsServer wins. However, DNSCrypt has a kind of third-party UI named "SimpleDNSCrypt", which covers 80% of the options DnsServer offers. The other 20% can be achieved in DNSCrypt with a quick manual customization (a list of very simple commands explained in the Wiki, which can be written in the "ini" file of DNSCrypt).

Here in my bad English it sounds more complex than it really is. When you read the Wiki and play with DNSCrypt for 2 or 3 days, it becomes a piece of cake, super easy to use. Average users are not going to read the Wiki, and they are not going to be interested in DNSCrypt. In that case they can use "SimpleDNSCrypt", which works like "plug and play": it takes 1 minute to install, you select a very few options, and that's all, ready to go.

More important, and you will love this: DNSCrypt has an extra piece of software named "GENERATOR". There you can write all the blocking-list urls you want. When you execute "GENERATOR", the software first downloads all the lists you chose, then compiles them, deleting repeated domains, and finally builds the unified blocking-list for DNSCrypt. It is fantastic! Not to mention that you can also add your own white-list of unblocked domains, wildcards are totally compatible etc., and "GENERATOR" merges everything into one single list. Amazing!
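The merge described above can be sketched in a few lines of Python. This is only an illustration of the idea (merge several lists, drop duplicates, honour a white-list with wildcards). It is not the actual DNSCrypt generator, and the function name `merge_blocklists` and the `example` domains are made up for this sketch.

```python
from fnmatch import fnmatch
from typing import Iterable, List


def merge_blocklists(blocklists: Iterable[Iterable[str]],
                     allowlist: Iterable[str]) -> List[str]:
    """Merge several blocklists into one sorted list without duplicates.

    Entries matching any allowlist pattern are dropped. Patterns may use
    shell-style wildcards such as "*.bild.de" (an assumption of this sketch).
    """
    allowed = [p.strip().lower() for p in allowlist if p.strip()]
    merged = set()
    for blocklist in blocklists:
        for entry in blocklist:
            domain = entry.strip().lower()
            if not domain:
                continue
            # Skip anything the user has explicitly white-listed.
            if any(fnmatch(domain, pattern) for pattern in allowed):
                continue
            merged.add(domain)
    return sorted(merged)
```

Note that a pattern like "*.bild.de" covers subdomains only, so the bare domain has to be white-listed separately, for example with `["bild.de", "*.bild.de"]`.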

More details:

YOUR REQUEST: You asked for:

It is more than 1MB. But you can decide which domains to delete. Some of the domains were added by us. Others were added from logs showing unblocked garbage. Others were added from updated blocking-lists (we only used updated lists).

I hope some of our collaboration will be useful to your project. Please share your thoughts with me, and let me know what your next steps are going to be. And keep counting on our collaboration.

Thank you and big hug! : )

hl2guide commented 5 years ago

:D Thanks for all the nice, detailed results. It'll take me some time (about 5 - 7 days) to parse it out, fix and then create a social media variant. Real-life work commitments are delaying this work so please bear with me.

hl2guide commented 5 years ago

I need some more time to resolve the results. Real-life work is getting in the way big time. Maybe by the end of September?

EstherMoellman commented 5 years ago

@hl2guide, take all the time you need. Just one question, please: how do you build your blocking-list? If you use existing lists (EasyList, EasyPrivacy, Fanboy's etc.), would you mind sharing the url links you use?

hl2guide commented 5 years ago

I generate the combined list using a custom-written PowerShell script that:
1) downloads the lists as text files
2) removes comments and junk lines
3) combines into one list
4) removes duplicates
5) sorts the final list
6) creates variants
7) commits & pushes to GitHub
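For readers who want to try something similar, steps 2 to 5 can be sketched in Python. This is not the PowerShell script mentioned above (which is not shared). It assumes the lists are already downloaded as text, that entries are either plain domains or hosts-format lines, and the names `clean_line` and `combine_lists` are invented for the sketch.

```python
from typing import Iterable, List


def clean_line(line: str) -> str:
    """Return the domain on a blocklist line, or "" for comments and junk."""
    # Drop trailing comments and surrounding whitespace.
    line = line.split("#", 1)[0].strip().lower()
    if not line:
        return ""
    parts = line.split()
    # Hosts-format lines look like "0.0.0.0 example.com";
    # plain lists carry the domain in the first field.
    if len(parts) >= 2 and parts[0] in ("0.0.0.0", "127.0.0.1"):
        domain = parts[1]
    else:
        domain = parts[0]
    # Skip entries such as "localhost" or a bare sink address.
    if "." not in domain or domain in ("0.0.0.0", "127.0.0.1"):
        return ""
    return domain


def combine_lists(lists: Iterable[str]) -> List[str]:
    """Combine the text of several lists, remove duplicates, and sort."""
    domains = set()
    for text in lists:
        for line in text.splitlines():
            domain = clean_line(line)
            if domain:
                domains.add(domain)
    return sorted(domains)
```

Downloading, creating variants, and pushing to GitHub are left out on purpose, since they depend on the list sources and the repository layout.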

The reason I don't share the PowerShell generator script is that I don't want the source lists to be bombarded with far too much traffic, which could make them unavailable.

Included lists are listed here: https://github.com/hl2guide/DNS-Sinkhole-Lists-A2#included-lists

Please Note: Steven Black's list already includes plenty of domain lists. It is the basis of the list I create.

hl2guide commented 5 years ago

To give a bit of an update on the status of the project:

hl2guide commented 5 years ago

bild.de has been added to whitelist.

Porn webpages are not blocked (and this is good), but porn videos are blocked (this is bad). As you know, most of the users like porn pages, and if they have problems with your list, they will abandon your list.

Response: I've tested and the following have passed with no ads shown (with adblocker off):

Could you please provide a list of examples for this?

Sometimes when an ad is blocked in a webpage, a kind of banner appears (occupying the blocked ad's place) with the following text: "Hmm. We're having trouble finding that site". In a webpage with a lot of blocked ads, a lot of these banners appear.

Response: This is because, within a browser, there's a concept of "cosmetic filtering": hiding page elements is something only the browser can do. A DNS block just stops the domain from resolving, so if an iframe or sub-element refers to a successfully blocked domain, that message is normal and expected. To hide the element you'd need an in-browser adblocker like uBlock.

hl2guide commented 5 years ago

N.B. Currently my personal Block List URLs I'm testing are:

https://raw.githubusercontent.com/hl2guide/DNS-Sinkhole-Lists-A2/master/0_combined_blocklist.txt
https://raw.githubusercontent.com/hl2guide/DNS-Sinkhole-Lists-A2/master/Extras/extra-domains-to-block.txt
https://raw.githubusercontent.com/anudeepND/blacklist/master/adservers.txt
https://raw.githubusercontent.com/FleuryK/pihole-ytadblock/master/ytadblock.txt

EstherMoellman commented 5 years ago

Hi Dean @hl2guide !

I am testing combining more rules using wildcards, for better performance

From my ignorance I would like to say: at least with DNSCrypt, a 1MB list has similar system performance to a 10MB list. Everything is loaded into RAM, so blocking is very fast (regardless of the size of the blocking-list). However, of course, you're right, a small list will always be preferred! What I'm trying to say is that I personally believe the real challenge is not the "size of the list" but the stuff inside the list. The real difficulty is to make an efficient global blocking-list... very hard to build and maintain. For example, an efficient list for Aussies may not be efficient for other continents, and vice versa. Not to mention that even if you build the best global list, it'll be very difficult to maintain. Here is the big problem! That's the reason I prefer "to think small":

a) I believe it is better to build separate lists per category.
b) In my opinion, trackers is the most important category.
c) Ads is the second list in priority.
d) I wouldn't go into a third category unless I'm sure that my "trackers" and "ads" lists are good enough for global use.
e) Blocking IPs (trackers and ads) is the most efficient way to block, even more efficient than using wildcards or RegExps.

If by chance you can build an IP list for trackers, and another one for ads, with global coverage... believe me, you will have the winning blocking-list. If I can help you... count on me!
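The point about lists held in RAM can be illustrated with a small Python sketch. It assumes the blocking-list is stored in a hash set and that an entry also covers its subdomains (a wildcard-like rule). Real DNS software may store and match rules differently, and the `example` domains are placeholders.

```python
from typing import Set


def is_blocked(hostname: str, blocked: Set[str]) -> bool:
    """Check a hostname and its parent domains against an in-memory set.

    The number of set lookups depends on how many labels the hostname has,
    not on how many entries the blocking-list contains.
    """
    labels = hostname.strip(".").lower().split(".")
    # Try "a.b.example.com", then "b.example.com", then "example.com".
    # The bare top-level label (e.g. "com") is never checked.
    for i in range(len(labels) - 1):
        if ".".join(labels[i:]) in blocked:
            return True
    return False
```

With this kind of matching, one entry such as "adnetwork.example" replaces every subdomain entry below it, which is where wildcard-style rules shrink a list.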

Could you please provide a list of examples for this?

As a nice lady (LOL), I know nothing about porn websites (LOL)... but I love porn! (LOL), nothing against it!... I just have no time for porn websites (LOL)... I'm a university student, and I also have a job besides my studies. This, with other activities, leaves me without time for porn websites. I tried to make a test for you by "googling" the words "porn, sex etc.", but I have no idea which of the thousands of porn websites that appeared are really important to test. Also, I sent an email asking my well-known-porn-user friends (LOL), and they answered me with a general reply: "You'll see these porn websites at google, they are everywhere"... personally I believe my friends got embarrassed by my query (LOL). I'll try to identify a list of the 20 most used porn webpages globally (please, feel free to collaborate), and after having them, I'll test for you.

Response: This is because, within a browser, there's a concept of "cosmetic filtering": hiding page elements is something only the browser can do. A DNS block just stops the domain from resolving, so if an iframe or sub-element refers to a successfully blocked domain, that message is normal and expected. To hide the element you'd need an in-browser adblocker like uBlock.

OK, but when I block with DNSCrypt, these iframes/sub-elements etc. don't appear. So I guess we can ask for help, trying to discover how to block this message. By the way, I did a quick test (without blocking-lists) on this webpage (https://gfycat.com/alivejuicyarcticduck): it has ads on top and on the right-hand side. In FF Nightly I played with "shift + ctrl + I => inspector", and there I saw the url of the ad and a different url for the iframe/sub-element. When I blocked this last url, everything ended up blocked, including the message. So I wonder if, by discovering a list of these urls, we can block this message? In fact, if you use DnsServer with your list, the message is also blocked. I believe this may be an indication that the message can be blocked by blocking some urls.

PS1: UBlock is a dinosaur. It was the best solution years ago, but today UBlock is a performance killer. I prefer your project! Blocking at system level, using proxy-hole etc., is the best solution! What you're doing is the right way to deal with web garbage.

PS2: Thanks for sharing your personal lists... I will test them next week.

Dean, thanks again and congratulations on your project... I love it! Take all the time you need, and count on me if I can help you.

Big hug! : )

hl2guide commented 5 years ago

Over the next 2 weeks I'll be:

hl2guide commented 5 years ago

Shard 3 is out with the above changes and also typosquatting domains added 💃

hl2guide commented 5 years ago

With regard to blocking YouTube ads, I've tested some lists and they lack effectiveness.

The issue is that YouTube intentionally rotates ads to be served from different domains and also serves both videos and ads from the same domain. So if you block that domain you block some videos too. I think it'll require the use of a regular expression and some user reporting effort.
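A small Python sketch of why this is hard at DNS level: the resolver only ever sees a hostname, never what the request is for. The pattern and hostnames below are hypothetical placeholders, not real YouTube hosts.

```python
import re

# Hypothetical pattern for hosts that only serve ads (an assumption for
# illustration; real ad hosts rotate and do not follow a fixed scheme).
AD_HOST_PATTERN = re.compile(r"^ads?[0-9]*\.video\.example$")


def dns_decision(hostname: str) -> str:
    """Return "block" or "allow" using nothing but the hostname."""
    return "block" if AD_HOST_PATTERN.match(hostname) else "allow"


# Requests as a DNS blocker would see them: (hostname, what it really serves).
requests = [
    ("ads3.video.example", "ad"),       # dedicated ad host: can be blocked
    ("media7.video.example", "video"),  # shared host serving a video
    ("media7.video.example", "ad"),     # same shared host serving an ad
]

decisions = [(host, kind, dns_decision(host)) for host, kind in requests]
```

The last two requests get the same decision because they share a hostname, so either the ad slips through or the video breaks. Telling them apart needs information a DNS-level list doesn't have, which is why user reports of what breaks matter.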

I personally use MPV for viewing and youtube-dl for downloading instead.

The lists I'm currently testing:

https://raw.githubusercontent.com/hl2guide/DNS-Sinkhole-Lists-A2/master/COMBINED_LISTS_GENERATED/0_combined_blocklist.txt
https://raw.githubusercontent.com/hl2guide/DNS-Sinkhole-Lists-A2/master/Extras/extra-domains-to-block.txt
https://raw.githubusercontent.com/stamparm/blackbook/master/blackbook.txt
https://raw.githubusercontent.com/hemiipatu/Blocklists/master/advertisement.txt
https://raw.githubusercontent.com/hemiipatu/Blocklists/master/phishing.txt
https://raw.githubusercontent.com/hemiipatu/Blocklists/master/ransomware.txt
https://raw.githubusercontent.com/hemiipatu/Blocklists/master/scam.txt
https://raw.githubusercontent.com/hemiipatu/Blocklists/master/spam.txt

Any chance you have more domains to add to: https://raw.githubusercontent.com/hl2guide/DNS-Sinkhole-Lists-A2/master/Extras/extra-domains-to-block.txt ?

EstherMoellman commented 5 years ago

... yeap, agree with you... YouTube ads are better blocked with RegExps, CSS etc. Also, I use a JS script that may be useful to you (https://greasyfork.org/en/scripts/370461-invidious-redirect/code): it automatically redirects youtube to invidious... I took the script from Invidious' GitHub, so it is safe, and it is very tiny/lightweight, works like a charm. My only ads problem with youtube is the YouTube videos embedded in webpages or video players... those are difficult to block. And I don't really want to hurt browser performance with add-ons or extensions.

With regard to adding domains: if I am not wrong, the link you attached contains the same domains I sent you days ago (domains not included in your old blocking-list). If that is the case, may I suggest that (from now on) we apply a new process for testing/adding domains from my side:

1) Please send me the links of the lists you want me to test/add domains to, separated by "ads" and "trackers".

2) Please always send me global lists.

3) Please, if by chance you can build IP blocking-lists... this will be "gold dust"... the best of the best for testing.

If you agree and you can follow the 3 points above (or part of them), then I believe I can be more useful for your project, because I can use my small club of 150 FF members as testers. The experience of 150 testers will always be much more powerful than my sole, isolated tester experience. But I can't use these 150 testers all the time... perhaps once a month will be OK. Also, any test is always going to be global, because the 150 testers live in at least 8 different countries. Last but not least, the 150 testers are average users, far from being advanced users. So I suggest going by steps: testing only tracker blocking-lists first, ad blocking-lists second, and in a third stage we can include other blocking-lists (social media, malware, phishing, scam etc.).

This is just my personal suggestion. You have the last word. Please feel totally free to deal and manage the process as you want.

If I can help, count on me! : )

hl2guide commented 5 years ago

From what I can tell, separating domains into trackers, ads etc. is impossible because of the structure of the lists I've included. Put simply, it's already all mixed up.

IP-blocking-lists may be useful, I'll look into it over the next 2 months.

A useful outcome from your testers would be the reporting of conflicted domains for broken video players and popular sites. Essentially the reporting of false-positives and mistakes.

Using this Google Form: https://docs.google.com/forms/d/e/1FAIpQLSelgN7d68f4nbqU6SyqluIpwSqE5g-mDvAI6O84IIyHHB2YOA/viewform?usp=pp_url

I've noticed that video players often serve videos from multiple domains that have uncommon (blocked) suffixes.
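As a complement to manual reports, likely false positives can also be searched for automatically. The Python sketch below flags blocklist entries that are equal to, or subdomains of, a site testers say must keep working. The function name `find_conflicts` and all domains apart from bild.de are placeholders invented for this example.

```python
from typing import Dict, Iterable, List


def find_conflicts(blocklist: Iterable[str],
                   must_work: Iterable[str]) -> Dict[str, List[str]]:
    """Map each must-work site to the blocklist entries that may break it."""
    sites = sorted({s.strip().lower() for s in must_work if s.strip()})
    conflicts: Dict[str, List[str]] = {}
    for entry in blocklist:
        domain = entry.strip().lower()
        for site in sites:
            # An exact match or any subdomain of the site is a candidate.
            if domain == site or domain.endswith("." + site):
                conflicts.setdefault(site, []).append(domain)
    return conflicts
```

A hit is only a candidate, not proof of a mistake: a blocked subdomain may be a genuine ad server, so each result still needs a human check before it goes on a whitelist.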

Many thanks to you and the testers 💃 🎉

FYI: over the next 30 or so days I'll be busy with real-life work.

hl2guide commented 5 years ago

FYI, I'm currently using:

https://raw.githubusercontent.com/hl2guide/DNS-Sinkhole-Lists-A2/master/COMBINED_LISTS_GENERATED/0_combined_blocklist.txt
https://raw.githubusercontent.com/hl2guide/DNS-Sinkhole-Lists-A2/master/Extras/extra_domains_to_block.txt
https://raw.githubusercontent.com/hl2guide/DNS-Sinkhole-Lists-A2/master/Extras/privacybadger_blocklist.txt
https://raw.githubusercontent.com/deathbybandaid/piholeparser/master/Subscribable-Lists/ParsedBlacklists/WindowsSpyBlocker7.txt
https://raw.githubusercontent.com/deathbybandaid/piholeparser/master/Subscribable-Lists/ParsedBlacklists/WindowsSpyBlocker81.txt
https://getadhell.com/standard-package.txt

EstherMoellman commented 5 years ago

Hi @hl2guide Dean!

A new add-on appeared at Firefox: https://addons.mozilla.org/en-US/firefox/addon/javascript-firewall/ It is a kind of UMatrix "Mini-Me" version, capable of blocking webgarbage emulating the fancy matrix way (extremely friendly GUI for users). It is far from being just another ad-blocker-add-on. It is a firewall! It will block stuff before it loads. It will prevent stuff from being loaded. And it will do that in a very efficient way, with a low negative browser impact. It will block most of the ads/trackers without breaking webpages... blocking-lists are not needed.

Also, there is another add-on: https://addons.mozilla.org/en-US/firefox/addon/rule-adblocker/ It is an ad-blocker, but it is an unconventional one. This is because this add-on works only with RegExps. So, it is very efficient and lightweight. The add-on already comes with 28 RegExps, but users can delete them, and can add their own RegExps.

All this explanation above is to say that I am testing a new approach, using these two add-ons. No, I am not abandoning blocking-lists, pi-hole proxy solutions, your project etc... I am not abandoning anything. I am just testing a new approach for blocking webgarbage.

Question: What is the "gain" (competitive differential) of these two add-ons?

Answer: They are "plug-and-play": a one-time configuration job and that's all. And compared to other add-ons, these two are extremely lightweight.

PS: Add-ons don't work at OS level (for example, against Microsoft telemetry). But a small blocking-list in the HOSTS file will solve this.
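A small HOSTS-file list like the one mentioned in the PS could be produced with a few lines of Python. This is a sketch only: the function name and the `example` domains are invented, and "0.0.0.0" as the sink address is an assumption (some setups use 127.0.0.1).

```python
from typing import Iterable


def to_hosts_entries(domains: Iterable[str], sink_ip: str = "0.0.0.0") -> str:
    """Render domains as hosts-file lines, sorted and without duplicates."""
    unique = sorted({d.strip().lower() for d in domains if d.strip()})
    if not unique:
        return ""
    # One "<sink address> <domain>" pair per line, as the hosts file expects.
    return "\n".join(f"{sink_ip} {d}" for d in unique) + "\n"


entries = to_hosts_entries(["Telemetry.example", "stats.example", "telemetry.example"])
```

Hosts files match exact names only and have no wildcard support, so every subdomain needs its own line. That is fine for a microscopic list but gets unwieldy for large ones.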

I escaped from add-ons because they were browser performance killers, privacy invaders etc. This is the main reason I ended up working with blocking-lists and pi-hole proxies. However, blocking-lists are extremely difficult to maintain. So, if we have an alternative based on add-ons/extensions that doesn't hurt browser performance, privacy and security... well, IMHO it is at least worth trying this alternative.

hl2guide commented 5 years ago

You may like: https://www.ghacks.net/2019/10/09/opera-64-launches-with-built-in-tracker-blocker/

EstherMoellman commented 5 years ago

... yeah, I saw the news days ago, but thank you anyway for the article.

Firefox is my choice. I have millions of complaints against Firefox. But I strongly believe that the existence of Firefox is critical in order to avoid a Chrome monopoly (Opera is Chromium-based).

I don't choose Firefox because of privacy (despite it being great at privacy/security). I just prefer Firefox as a way to fight against browser monopolies. Not to mention that months ago Edge decided to switch to Chromium, worsening the situation. In brief, today around 90% of the market is Chrome-based... and this seems extremely dangerous to me!

Also, Firefox keeps a customization capability that no other browser has. For example, by using simple CSS scripts, I have heavy customizations of the appearance and functions of my Firefox (impossible to do in any Chrome). The same goes for JS scripts: I have plenty of them making my Firefox browsing a spectacular experience (in Chrome, this is possible only through extensions, hurting browser performance, privacy, security etc.). In parallel, also by using simple CSS scripts, I have tons of web content customizations without add-ons (almost zero impact on browser performance). And Firefox's internal flags (preferences) allow you to change almost everything. Again, no other browser can compete with this.

Firefox has several internal (built-in) blockers for trackers, fingerprinting, cookies, mining etc. But I don't use them because their blocking capabilities are very limited. They're great for average users, but not for my profile. Chrome is even worse in this respect, because in Chrome the internal blockers are super-mega-limited. And recently, Google Chrome decided to limit the blocking capability of extensions, under the stupid argument that "it hurts performance". The truth is that Chrome needs profits coming mainly from advertising/tracking.

Chrome might be translated into two words: Advertisement & Tracking. I'm totally in favor of having profits... there is no free lunch! But I'm against monopolies, and aggressive strategies of tracking and invasive ads.

hl2guide commented 5 years ago

Just so you know development has been really slow for October. I've been too busy with work commitments. November should speed up activity 😸 I'm going to be optimizing the list more.

EstherMoellman commented 5 years ago

... take all the time you need!

I still believe that your pi-hole proxy approach is the best solution against web-garbage, but (in my ignorant opinion) dealing with blocking-lists is almost an impossible task.

The only alternative should be to work with IPs. I wonder why no one is working with IP blocking-lists or similar. One may argue that IP addresses change a lot. But I'm sure that DNS names change faster than IP addresses.

Until I find a better alternative, I'm using the pi-hole-proxy approach only to block Microsoft garbage (at OS level). So my blocking-list is microscopic, with no system performance impact. In the browser, I have a firewall extension, very tiny/lightweight, blocking third-party garbage (JS, XHR and FRAMES). This blocks 70% of webgarbage, with almost no webpage breakage and no negative browser performance impact. For pesky ads (like youtube), I refurbished an abandoned extension, also very tiny/lightweight, that works only with RegExps, so the blocking is very surgical and doesn't impact browser performance. So far this is the best combo I have found. It is far from perfect, but it blocks 90% of webgarbage with the least negative system performance impact. Also, this combo is plug-and-play: it works automatically and needs no maintenance.
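The third-party rule such a firewall applies can be sketched in Python. This is a rough illustration and not the code of any particular extension: the request type labels are chosen for the example, and taking the last two labels as the "site" is a simplification (real tools use the Public Suffix List to handle domains like example.co.uk).

```python
# Request kinds the firewall cares about: JS, XHR and frames
# (the label strings are assumptions for this sketch).
BLOCKED_TYPES = {"script", "xmlhttprequest", "sub_frame"}


def base_domain(hostname: str) -> str:
    """Naive site name: the last two labels of the hostname."""
    return ".".join(hostname.lower().strip(".").split(".")[-2:])


def should_block(page_host: str, request_host: str, request_type: str) -> bool:
    """Block JS, XHR and frame requests that go to a different site."""
    third_party = base_domain(page_host) != base_domain(request_host)
    return third_party and request_type in BLOCKED_TYPES
```

Because the rule looks only at where a request goes and what kind it is, there is no list to maintain, which matches the plug-and-play behaviour described above. The trade-off is that useful third-party scripts get blocked too until the user allows them.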

hl2guide commented 5 years ago

Nice to see you're having some success 👍

I hope that DNS Server software matures to a point that it has:

I've looked into IP blocking but it seems like a wasp hive I don't want to disturb.

System-wide blocking is my preferred outcome.

hl2guide commented 4 years ago

I'm currently in very early stages testing out Acrylic DNS Proxy: https://mayakron.altervista.org/wikibase/show.php?id=AcrylicHome

It may interest you as it supports: