Closed jarelllama closed 6 months ago
If there is anyone else I should be tagging, please let me know. Thank you.
OISD also uses the list: ping @sjhgvr
Thanks for the heads up @jarelllama! I haven't incorporated your list yet, so no issues. Should be integrated within the next 3 or 4 updates.
Have you considered adding it to filterlists.com? Adding it is a bit complex, though. I'll open a PR when I get spare time, if you're OK with it.
Also, I noticed the adblock version doesn't have an Expires time set. Given this is updated daily, in my opinion 1 day would be good, but what do I know 🤷‍♀️
My two cents. Thanks for this filterlist!
Also, I noticed the adblock version doesn't have an Expires time set.
Ah, I was unaware of that special comment! I will add it according to https://help.adblockplus.org/hc/en-us/articles/360062733293-How-to-write-filters
Thanks for the idea @iam-py-test !
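For anyone generating a list in adblock syntax, here is a minimal sketch of how the Expires comment could be prepended at build time. The function name, the title string, and the "[Adblock Plus]" marker line are assumptions for illustration, not taken from this repo's build script.

```python
def with_expires_header(rules, title, expires="1 day"):
    """Prepend adblock-style metadata comments to a list of filter rules.

    The function name and the exact header layout are illustrative assumptions.
    """
    header = [
        "[Adblock Plus]",         # format marker often placed on the first line
        f"! Title: {title}",      # human-readable list name
        f"! Expires: {expires}",  # hint for how often blockers should refresh the list
    ]
    return header + list(rules)


if __name__ == "__main__":
    # Hypothetical one-rule list, just to show the resulting layout.
    print("\n".join(with_expires_header(["||example.com^"], "Scam Blocklist")))
```

With a daily update schedule, the "1 day" default matches the suggestion above; a blocker that honors the comment should then refresh roughly once a day.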
@iam-py-test
Have you considered adding it to filterlists.com? Adding it is a bit complex, though. I'll open a PR when I get spare time, if you're OK with it.
I looked through the process briefly and it does seem a little complicated. I'll look at it in detail when I have the time. If you want to open a PR it would be appreciated!
Does this look OK: https://github.com/iam-py-test/FilterLists/commit/23eaeea97d307243c8a49a6d0dc9ca67f7e5f9d1? The syntax isn't very self-explanatory, but hopefully you can check whether I got any of the information wrong. I copied most of it from this repo (i.e. the description). The only thing I changed was calling the "AdBlock Plus" one "adblocker syntax", because it can be used in other similar blockers (AdGuard, uBlock Origin, etc.), so IMO that is more informative. Also, IMO this list doesn't work well in ABP because ABP can't block an entire website, so people just see broken scam pages and wonder why it doesn't work (uBO and AdGuard will instead show a warning page clearly showing what blocked the page).
Hi @iam-py-test, I just looked through it and everything seems right! Thanks so much. I see what you mean about Adblock Plus. I definitely want users to know that it's only the syntax that's different. Perhaps we can put "Scam Blocklist (ABP syntax)"?
Change made and PR submitted (https://github.com/collinbarrett/FilterLists/pull/3569)
I appreciate the help!
@hagezi I saw under your Fake List sources that you're still using the old domains file whereas the TIF list is using the new one.
Oh, missed that ...
@jarelllama apparently there's another list with the same name and that causes problems: https://github.com/collinbarrett/FilterLists/pull/3569#issuecomment-1525689971 Any ideas? Not urgent
@iam-py-test Is it a repo name problem? If not, you can rename it to "Jarelllama's Scam Blocklist".
@iam-py-test thanks again for all the help!
Tagging @notracking as a reminder.
Since https://raw.githubusercontent.com/jarelllama/Scam-Blocklist/main/domains.txt has not been updated for some time, I have switched to https://raw.githubusercontent.com/jarelllama/Scam-Blocklist/main/lists/adblock/scams.txt
Hi @hagezi, my apologies. I made some big changes regarding the blocklist including introducing new formats. However, before I could notify list maintainers, Google blocked my method of extracting domains from Google Search so I decided not to notify list maintainers till I fixed the issue. I've been planning to use the Google Custom Search API but due to school and personal issues, I haven't had the time to update the code. Please allow me more time to update the code after I've dealt with my personal issues.
I've also been testing ways of making use of Control D's new AI malware detection to make a new blocklist. However, their AI model seems to have many false positives so I've ditched the idea. Now, in the limited free time I have, I've been testing ways to use NextDNS' AI Threat Detection instead to create a new malware blocklist.
To any list maintainers reading this, my deepest apologies for the lack of updates while I deal with issues in my personal life. This project is not being abandoned! I just need time to update the code. I will notify list maintainers of any major updates.
@notracking @hagezi @bongochong @iam-py-test
Firstly, thanks list maintainers for the support! While building a new list in the Adblock Plus format, this project went through major changes. The biggest one is that the domains blocklist "domains" is being renamed to domains.txt.
Here is the new link for the blocklist in Domains syntax: https://raw.githubusercontent.com/jarelllama/Scam-Blocklist/main/domains.txt
Here is the new blocklist in ABP syntax if you prefer it: https://raw.githubusercontent.com/jarelllama/Scam-Blocklist/main/adblock.txt
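For maintainers who only consume one of the two formats, the relationship between them can be sketched as follows. This is an illustration under assumptions, not the repo's actual build script: the function name is hypothetical, and it assumes one domain per line with blank lines and "#" comments skipped.

```python
def to_adblock_rules(domain_lines):
    """Convert plain domain lines into ABP-style network rules (||domain^).

    Hypothetical helper: assumes one domain per line and skips blanks
    and '#' comment lines.
    """
    rules = []
    for line in domain_lines:
        domain = line.strip().lower()
        if not domain or domain.startswith("#"):
            continue
        # '||' anchors the rule at the domain (and its subdomains);
        # '^' is a separator placeholder marking the end of the hostname.
        rules.append(f"||{domain}^")
    return rules


if __name__ == "__main__":
    sample = ["Example.com\n", "", "# comment", "scam.test"]
    print(to_adblock_rules(sample))
```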
I will keep the old domains file for a week before retiring it to the legacy folder. Keep in mind it will not be updated from today onwards. It will remain frozen at 3798 domains. I apologize for any inconvenience.
This project has changed a lot from the first issue opened on Hagezi's repo: https://github.com/hagezi/dns-blocklists/issues/926 For those interested, here is a list of major changes so you know exactly what you're adding to your aggregated lists.
Major changes
• Loop through a list of search terms in search_terms.txt instead of manually entering a search term on each run.
• Use a time filter when searching to tweak the total number of domains retrieved and number of false positives. Current default is "past 3 years". Any higher will cause Google to block my IP address before I finish looping through all the search terms, whereas a smaller time filter (for example, past month) will result in higher false positives.
• Better filtering using a term-based whitelist. For example, if the whitelist contains the term "scam", domains like scam-detector.com will be whitelisted. This helps reduce the number of false positives.
• Use a whitelisted TLDs filter for TLDs like edu and gov.
• Check for resolving www and www-less domains. This is done by adding www to domains without it and removing www from domains with it. This ensures that both test.com and www.test.com are in the list if they are resolving (only a concern for the domains list).
• Use of GitHub Workflows to automate checking of the blocklist. This includes:
  • Checking all lists for domains in the toplist whenever the toplist is updated.
  • Checking for whitelisted domains whenever the whitelist is updated.
  • Daily automated dead check.
  • Daily check for resolving www and www-less domains.
As always, all the code is on the repo and I appreciate any feedback! If you have questions please do ask. Thanks everyone for the support and remember to switch to domains.txt.
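To make the filtering steps above a little more concrete, here is a rough Python sketch of the term-based whitelist, the TLD whitelist, and the www / www-less check. It is not the code from the repo: every function name and the sample whitelist values are assumptions, and the resolver is passed in as a parameter so the logic can be tried without touching the network. Dropping non-resolving candidates is a choice made for this sketch, loosely in the spirit of the dead check.

```python
import socket

# Sample values for illustration only; the real whitelists live in the repo.
WHITELIST_TERMS = {"scam"}
WHITELISTED_TLDS = {"edu", "gov"}


def is_whitelisted(domain, terms=WHITELIST_TERMS, tlds=WHITELISTED_TLDS):
    """True if the domain has a whitelisted TLD or contains a whitelisted term."""
    domain = domain.lower().rstrip(".")
    tld = domain.rsplit(".", 1)[-1]
    if tld in tlds:
        return True
    return any(term in domain for term in terms)


def toggle_www(domain):
    """Return the www variant of a www-less domain, and vice versa."""
    return domain[4:] if domain.startswith("www.") else "www." + domain


def dns_resolves(domain):
    """Best-effort check that a domain currently resolves (performs a DNS lookup)."""
    try:
        socket.getaddrinfo(domain, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def expand_with_www(domains, resolves=dns_resolves):
    """Keep each domain and its www/www-less counterpart, but only if they resolve."""
    kept = set()
    for domain in domains:
        for candidate in (domain, toggle_www(domain)):
            if resolves(candidate):
                kept.add(candidate)
    return sorted(kept)


if __name__ == "__main__":
    candidates = ["scam-detector.com", "university.edu", "test.com"]
    retained = [d for d in candidates if not is_whitelisted(d)]
    # A fake resolver stands in for real DNS here.
    fake_live = {"test.com", "www.test.com"}
    print(expand_with_www(retained, resolves=fake_live.__contains__))
```

Passing the resolver as an argument is only a convenience for testing; a scheduled job would presumably call the real lookup.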