sefinek / Sefinek-Blocklist-Collection

🌍 A comprehensive repository of blocklists for various DNS servers, featuring over 100 links and more than 6 million domains. Additionally, you can use our personalized Blocklist Generator to tailor content filtering according to your specific needs, giving you full control over what gets blocked on your network.
https://blocklist.sefinek.net

Source suggestion #24

Closed: jarelllama closed this 6 months ago

jarelllama commented 6 months ago

Hi, I'm the maintainer of https://github.com/jarelllama/Scam-Blocklist , a blocklist for newly created scam and phishing domains automatically retrieved using Google Search API, automated NRD detection, and other public sources.

It seems to fall under your categories of:

Malicious
Phishing
Fraud
Scam 

As part of my filtering process, a list of parked domains is generated as a byproduct: https://raw.githubusercontent.com/jarelllama/Scam-Blocklist/main/data/parked_domains.txt

This list seems to fit your categories of:

Hate & junk
Useless websites

This parked domains list is capped at 7000 entries and is updated daily (unparked sites are automatically removed too).

sefinek commented 6 months ago

Hello, I checked https://raw.githubusercontent.com/jarelllama/Scam-Blocklist/main/data/parked_domains.txt and noticed that most of these sites display a white screen, a file explorer, or messages such as "Welcome to our website", "Coming Soon", and "Is this your domain? Get it online with cloud-based Shared Hosting, complete with high-performance servers, scalable plans, and free SSL", along with other details about domain sales by GoDaddy. I found no content related to hate or junk, so I plan to categorize your list under Useless websites. Let me know if you consent to the addition, or share any other suggestions you have.

For other block lists, please provide a specific GitHub link and describe the appropriate categories for the list. I want to mention that I am generally disinclined to add lists that are forks of existing lists.

Thanks (:

jarelllama commented 6 months ago

Thanks for the swift reply! My bad, I misunderstood the Hate & junk category. I agree that the parked domains list fits the Useless websites category better.

Regarding my main Scam Blocklist, here are the formats it is available in:

Format             Syntax
Adblock Plus       ||scam.com^
Dnsmasq            local=/scam.com/
Unbound            local-zone: "scam.com." always_nxdomain
Wildcard Asterisk  *.scam.com
Wildcard Domains   scam.com
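
For illustration, the five syntaxes listed above can be produced from a single plain domain. The following is a minimal sketch, not the repository's actual build script; the names `FORMATS` and `render` are hypothetical, and the templates are copied from the examples above.

```python
# Hypothetical helper: render one domain in each syntax shown above.
# The templates mirror the scam.com examples; nothing here is taken
# from the Scam-Blocklist scripts themselves.
FORMATS = {
    "Adblock Plus": "||{d}^",
    "Dnsmasq": "local=/{d}/",
    "Unbound": 'local-zone: "{d}." always_nxdomain',
    "Wildcard Asterisk": "*.{d}",
    "Wildcard Domains": "{d}",
}


def render(domain: str) -> dict:
    """Return a mapping of format name to the rendered blocklist entry."""
    # Normalize: trim whitespace, lowercase, drop a trailing root dot.
    domain = domain.strip().lower().rstrip(".")
    return {name: template.format(d=domain) for name, template in FORMATS.items()}


if __name__ == "__main__":
    for name, entry in render("scam.com").items():
        print(f"{name}: {entry}")
```

Keeping the templates in one table means a new output format would only need one extra entry.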

Note that the blocklist does not source from any other existing GitHub blocklist. Instead, I implemented my own sourcing, such as the Google Search API, scam reporting sites like scamadviser.com, and a malicious NRD detector.

Here is the current list of sources implemented:

Google Search
Regex matching for malicious NRDs
aa419.org
dnstwist matching for malicious NRDs
guntab.com
petscams.com
scam.directory
scamadviser.com
stopgunscams.com

These sources were chosen because I have yet to see any other blocklist implement them.

The domains are retrieved from these sources automatically and daily using the open-source scripts in my repository.

Regarding categories, these few seem to fit the intentions of the blocklist:

Malicious
Phishing
Fraud
Scam

sefinek commented 6 months ago

Thanks. I will get to it soon. I am currently working on another project of mine. I will get back to you soon (:

jarelllama commented 6 months ago

Thanks for the update!

sefinek commented 6 months ago

Hello again,

Added in this commit: https://github.com/sefinek24/Sefinek-Blocklist-Collection/commit/fc8370067ba0eac8bc5e463f23d2722923fbe747

I will soon add your lists to the lists in the Markdown files and on the sefinek.net website. I want to check them more thoroughly first.

You really did a great job, I admire it. It would be great if you could expand the repository, for example by adding domains dedicated to software cracking or piracy in general. There are many possibilities (:

jarelllama commented 6 months ago

Thanks for the kind words! It means a lot. Do note that the dead domains and parked domains files are capped at the 8000 and 7000 newest domains, respectively.
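
As a rough sketch of what such a cap could look like, the function below appends newly found domains and keeps only the newest `cap` entries. It assumes the list is ordered oldest to newest, which is not confirmed in this thread, and `append_capped` is a hypothetical name rather than a function from the repository.

```python
def append_capped(existing, new, cap):
    """Append unseen domains to the list and keep only the `cap` newest.

    Assumption: `existing` is ordered oldest first, so the newest
    entries are at the end of the list.
    """
    seen = set(existing)
    merged = list(existing)
    for domain in new:
        if domain not in seen:  # skip duplicates already in the list
            seen.add(domain)
            merged.append(domain)
    # Guard cap <= 0 explicitly, because merged[-0:] would return everything.
    return merged[-cap:] if cap > 0 else []


if __name__ == "__main__":
    parked = ["a.com", "b.com"]
    print(append_capped(parked, ["b.com", "c.com", "d.com"], cap=3))
```

With a cap of 7000, the parked domains list would drop its oldest entries as new ones arrive each day.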

I also noticed you're using NoTracking as a source. It was recently archived; more info here: https://github.com/notracking/hosts-blocklists/issues/900

Regarding other areas of blocking, I did entertain the idea of blocking NSFW sites using the Google Search API. It was rather easy to retrieve hundreds of sites using just a few common search terms. However, I'm currently on the free tier of the API, which limits me to 100 queries a day (I even created a second Google account so I can use two API keys to get past the rate limit).

I would certainly not have enough API queries or the personal time to maintain more blocklists sadly.

Thanks again for the support!

sefinek commented 6 months ago

Regarding the notracking/hosts-blocklists, indeed. Thanks for bringing that up. I will soon remove their lists.

It would be really beneficial to implement larger block lists. I understand that API limitations are a major issue, as well as time, of course... Therefore, if you're interested, I think it would be a good idea to merge your repository with mine. I have no objections to this. I would add you as a collaborator here, and together we could manage the lists. There is strength in collaboration :) I believe this approach would significantly enhance user security for those utilizing the lists. My lists (blocklist.sefinek.net) receive a substantial number of server queries daily. Please check the statistics at the bottom of this page if you're interested.

I look forward to your reply and your decision.

jarelllama commented 6 months ago

Thanks for the offer! But I doubt I could be of much help. I see most of your repository is in JavaScript, which I can't contribute to. I've already briefly reviewed your Bash scripts and they're all well done.

Having my scam blocklist used as a source seems to be the biggest contribution I can offer right now.

sefinek commented 6 months ago

Alright, if you need anything, feel free to write.

jarelllama commented 6 months ago

Would it be possible to also add my list under phishing? Although most of my sources focus on scam sites, the sources retrieving from the NRD feed tend to be phishing domains (googgle.com, whattsapp.com, and such).
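
To illustrate why names like googgle.com and whattsapp.com read as phishing lookalikes, here is a minimal sketch that compares a domain's first label against a small brand list using string similarity from Python's standard `difflib`. This is not the regex or dnstwist matching used in the repository; the `BRANDS` list, the `lookalike` function, and the 0.85 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical watch list; a real one would be much longer.
BRANDS = ["google", "whatsapp"]


def lookalike(domain, threshold=0.85):
    """Return the brand that `domain` imitates, or None.

    Simplification: only the first label is compared, so subdomains
    and multi-part TLDs are not handled.
    """
    label = domain.lower().split(".")[0]
    for brand in BRANDS:
        if label == brand:
            return None  # exact match is the brand itself, not a lookalike
        if SequenceMatcher(None, label, brand).ratio() >= threshold:
            return brand
    return None


if __name__ == "__main__":
    for d in ["googgle.com", "whattsapp.com", "google.com", "example.com"]:
        print(d, "->", lookalike(d))
```

A similarity threshold is cruder than generating permutations the way dnstwist does, but it shows the idea of flagging newly registered domains that sit one or two characters away from a known brand.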

sefinek commented 6 months ago

I think I might just leave it as it is for now, because I'm going to sleep now <:

jarelllama commented 6 months ago

No pressure!

sefinek commented 6 months ago

Hi again, I just added your lists to the generator (; https://sefinek.net/blocklist-generator/pihole

And here: https://github.com/sefinek24/Sefinek-Blocklist-Collection/commit/e98f475cb3fff8bfcebecb47ed41250c1c656ff3

jarelllama commented 6 months ago

Thanks again!