jarelllama closed 6 months ago
Hello,
I checked https://raw.githubusercontent.com/jarelllama/Scam-Blocklist/main/data/parked_domains.txt and noticed that most of these sites display either a white screen, a file explorer, or messages such as "Welcome to our website Coming Soon" and "Is this your domain? Get it online with cloud-based Shared Hosting, complete with high-performance servers, scalable plans, and free SSL", along with other details about domain sales by GoDaddy. I found no content related to hate or junk, so I definitely plan to categorize your list under `Useless websites`. Let me know if you consent to the addition, or share any other suggestions you have.
For other block lists, please provide a specific GitHub link and describe the appropriate categories for the list. I want to mention that I am generally disinclined to add lists that are forks of existing lists.
Thanks (:
Thanks for the swift reply! My bad, I misunderstood the `Hate & junk` category. I agree that the parked domains list fits the `Useless websites` category better.
Regarding my main Scam Blocklist, here are the formats it is available in:

| Format | Syntax |
|---|---|
| Adblock Plus | `\|\|scam.com^` |
| Dnsmasq | `local=/scam.com/` |
| Unbound | `local-zone: "scam.com." always_nxdomain` |
| Wildcard Asterisk | `*.scam.com` |
| Wildcard Domains | `scam.com` |
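
For illustration, the syntaxes in the table above can be produced from a plain domain with a few string templates. This is only a minimal sketch: the function name `format_domain` is hypothetical and is not taken from the repository's scripts, which may generate the files differently.

```python
def format_domain(domain: str) -> dict[str, str]:
    """Render one plain domain in each syntax listed in the table.

    Hypothetical helper for illustration; not the repository's code.
    """
    return {
        "Adblock Plus": f"||{domain}^",
        "Dnsmasq": f"local=/{domain}/",
        # Unbound expects a fully qualified name, hence the trailing dot.
        "Unbound": f'local-zone: "{domain}." always_nxdomain',
        "Wildcard Asterisk": f"*.{domain}",
        "Wildcard Domains": domain,
    }


if __name__ == "__main__":
    for name, line in format_domain("scam.com").items():
        print(f"{name}: {line}")
```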
Note that the blocklist does not source from any other existing GitHub blocklist. Instead, I implemented my own sourcing via the Google Search API, scam reporting sites like scamadviser.com, and a malicious NRD detector.
Here is the current list of sources implemented:
Google Search
Regex matching for malicious NRDs
aa419.org
dnstwist matching for malicious NRDs
guntab.com
petscams.com
scam.directory
scamadviser.com
stopgunscams.com
These sources were chosen because I have yet to see any other blocklist implement them.
The domains are retrieved from these sources automatically and daily using the open-source scripts in my repository.
Regarding categories, these few seem to fit the intentions of the blocklist:
Malicious
Phishing
Fraud
Scam
Thanks. I will get to it soon. I am currently working on another project of mine. I will get back to you soon (:
Thanks for the update!
Hello again,
Added in this commit: https://github.com/sefinek24/Sefinek-Blocklist-Collection/commit/fc8370067ba0eac8bc5e463f23d2722923fbe747
I will soon add your lists to the list in markdown files and on the sefinek.net website. I want to check them more thoroughly.
You really did a great job, I admire it. It would be great if you could expand the repository, for example by adding domains dedicated to software cracking or piracy in general. There are many possibilities (:
Thanks for the kind words! It means a lot. Do note that the dead domains and parked domains files are capped at the 8000 and 7000 newest domains, respectively.
I also noticed you're using NoTracking as a source. It was recently archived; more info here: https://github.com/notracking/hosts-blocklists/issues/900
Regarding other areas of blocking, I did entertain the idea of blocking NSFW sites using the Google Search API. It was rather easy to retrieve hundreds of sites using just a few common search terms. However, I'm currently on the free tier of the API, which limits me to 100 queries a day (I even created a second Google account so I can use two API keys to get past the rate limit).
I would certainly not have enough API queries or the personal time to maintain more blocklists sadly.
Thanks again for the support!
Regarding the notracking/hosts-blocklists, indeed. Thanks for bringing that up. I will soon remove their lists.
It would be really beneficial to implement larger block lists. I understand that API limitations are a major issue, as well as time, of course... Therefore, if you're interested, I think it would be a good idea to merge your repository with mine. I have no objections to this. I would add you as a collaborator here, and together we could manage the lists. There is strength in collaboration :) I believe this approach would significantly enhance user security for those utilizing the lists. My lists (blocklist.sefinek.net) receive a substantial number of server queries daily. Please check the statistics at the bottom of this page if you're interested.
I look forward to your reply and your decision.
Thanks for the offer! But I doubt I could be of much help. I see most of your repository is in JavaScript, which I can't contribute to. I've already briefly reviewed your bash scripts and they're all well done.
Having my scam blocklist used as a source seems to be the biggest contribution I can offer right now.
Alright, if you need anything, feel free to write
Would it be possible to also add my list under phishing? Although most of my sources focus on scam sites, the sources retrieving from the NRD feed tend to be phishing domains (googgle.com, whattsapp.com, and such).
I think I might just leave it as it is for now, because I'm going to sleep now <:
No pressure!
Hi again, I just added your lists to the generator (; https://sefinek.net/blocklist-generator/pihole
Thanks again!
Hi, I'm the maintainer of https://github.com/jarelllama/Scam-Blocklist , a blocklist for newly created scam and phishing domains automatically retrieved using Google Search API, automated NRD detection, and other public sources.
Seems to fall under your categories of
As part of my filtering process, a list of parked domains is generated as a byproduct: https://raw.githubusercontent.com/jarelllama/Scam-Blocklist/main/data/parked_domains.txt This list seems to fit your categories of
This parked domains list is capped at 7000 entries and is updated daily (unparked sites are automatically removed too).
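
To make the capping behaviour concrete, here is a minimal sketch of one daily update. It assumes the list is kept in oldest-to-newest order; the function name `update_parked` and its parameters are hypothetical, and the repository's actual scripts may work differently.

```python
def update_parked(existing, newly_parked, unparked, cap=7000):
    """Merge new parked domains, drop unparked ones, keep the newest `cap`.

    Assumes `existing` is ordered oldest to newest, so the newest
    entries sit at the end of the merged list.
    """
    unparked_set = set(unparked)
    seen = set()
    merged = []
    for domain in list(existing) + list(newly_parked):
        if domain in unparked_set or domain in seen:
            continue  # skip domains that came back online, and duplicates
        seen.add(domain)
        merged.append(domain)
    return merged[-cap:] if cap > 0 else []


if __name__ == "__main__":
    print(update_parked(["a.com", "b.com", "c.com"], ["d.com"], ["b.com"], cap=2))
```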