T145 / black-mirror

Blacklists and whitelists built by open code, so you know what goes into them.
GNU Affero General Public License v3.0

List Pihole-compatible sources in a single-line text file #43

Closed: T145 closed this 3 years ago

T145 commented 3 years ago

Based off my idea in #42; tested by @pallebone.

jq -rj 'to_entries[] | select(.value.color == "black") | .value.mirrors | join (" "), " "'

Make a release step to print mirrors[0] for each source that contains a single domain filter using the mawk engine.
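For anyone without jq at hand, the filter above can be sketched in Python. This assumes, as the filter itself implies, that the input is a JSON object whose values each carry a `color` string and a `mirrors` array. The sample data and the function name `blacklist_mirrors` are made up for illustration; the real sources file is not shown in this thread.

```python
import json


def blacklist_mirrors(sources: dict) -> str:
    """Join the mirrors of every source whose color is "black" into one line."""
    urls = []
    for source in sources.values():
        if source.get("color") == "black":
            urls.extend(source.get("mirrors", []))
    return " ".join(urls)


# Hypothetical sample input, shaped the way the jq filter expects.
sample = json.loads("""
{
  "example-ads": {
    "color": "black",
    "mirrors": ["https://example.com/ads.txt", "https://mirror.example.org/ads.txt"]
  },
  "example-allow": {
    "color": "white",
    "mirrors": ["https://example.com/allow.txt"]
  }
}
""")

print(blacklist_mirrors(sample))
```

Unlike the jq version, this sketch does not leave a trailing space at the end of the line. Printing only `mirrors[0]` per source would mean appending `source["mirrors"][0]` instead of extending with the whole array.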

pallebone commented 3 years ago

following :)

T145 commented 3 years ago

@pallebone Here's a list rendition right now if you want to start testing:

https://gist.github.com/T145/3d8b1776fae9814f39afe4e8791921ab

pallebone commented 3 years ago

Thank you.

This is a long list. Will go through and check which entries are compatible and compile a list and post back when done. You might have to give me a couple of days.

Many thanks in advance :) Pete

pallebone commented 3 years ago

Hi,

I managed to do this already. Here are my findings:

Working: https://raw.githubusercontent.com/badmojr/1Hosts/master/Xtra/domains.txt https://raw.githubusercontent.com/Perflyst/PiHoleBlocklist/master/android-tracking.txt https://gitlab.com/andryou/block/raw/master/kouhai-compressed-domains https://raw.githubusercontent.com/privacy-protection-tools/anti-AD/master/anti-ad-domains.txt https://raw.githubusercontent.com/TheAntiSocialEngineer/AntiSocial-BlockList-UK-Community/main/UK-Community.txt https://raw.githubusercontent.com/anudeepND/blacklist/master/facebook.txt https://raw.githubusercontent.com/adversarialtools/apple-telemetry/master/blacklist https://blocklistproject.github.io/Lists/alt-version/abuse-nl.txt https://blocklistproject.github.io/Lists/alt-version/ads-nl.txt https://blocklistproject.github.io/Lists/alt-version/crypto-nl.txt https://blocklistproject.github.io/Lists/alt-version/drugs-nl.txt https://blocklistproject.github.io/Lists/alt-version/facebook-nl.txt https://blocklistproject.github.io/Lists/alt-version/fraud-nl.txt https://blocklistproject.github.io/Lists/alt-version/gambling-nl.txt https://blocklistproject.github.io/Lists/alt-version/malware-nl.txt https://blocklistproject.github.io/Lists/alt-version/phishing-nl.txt https://blocklistproject.github.io/Lists/alt-version/piracy-nl.txt https://blocklistproject.github.io/Lists/alt-version/porn-nl.txt https://blocklistproject.github.io/Lists/alt-version/ransomware-nl.txt https://blocklistproject.github.io/Lists/alt-version/redirect-nl.txt https://blocklistproject.github.io/Lists/alt-version/scam-nl.txt https://blocklistproject.github.io/Lists/alt-version/smart-tv-nl.txt https://blocklistproject.github.io/Lists/alt-version/tiktok-nl.txt https://blocklistproject.github.io/Lists/alt-version/torrent-nl.txt https://blocklistproject.github.io/Lists/alt-version/tracking-nl.txt https://blocklistproject.github.io/Lists/alt-version/whatsapp-nl.txt https://raw.githubusercontent.com/Clefspeare13/pornhosts/master/0.0.0.0/hosts 
https://raw.githubusercontent.com/T145/the-blacklist/user-submissions/blacklist_domain.txt https://blocklist.cyberthreatcoalition.org/vetted/domain.txt https://osint.digitalside.it/Threat-Intel/lists/latestdomains.txt https://block.energized.pro/extensions/ips/formats/list.txt https://block.energized.pro/extensions/regional/formats/domains.txt https://block.energized.pro/extensions/social/formats/domains.txt https://block.energized.pro/unified/formats/domains.txt https://block.energized.pro/extensions/xtreme/formats/domains.txt https://raw.githubusercontent.com/Perflyst/PiHoleBlocklist/master/AmazonFireTV.txt https://raw.githubusercontent.com/KodoPengin/GameIndustry-hosts-Template/master/Main-Template/hosts https://raw.githubusercontent.com/dupontjean/pihole-blocklist/master/game.txt https://hostfiles.frogeye.fr/firstparty-only-trackers.txt https://hostfiles.frogeye.fr/multiparty-only-trackers.txt https://raw.githubusercontent.com/jakejarvis/ios-trackers/master/blocklist.txt https://kriskintel.com/feeds/ktip_covid_domains.txt https://kriskintel.com/feeds/ktip_malicious_domains.txt https://raw.githubusercontent.com/lightswitch05/hosts/master/docs/lists/amp-hosts-extended.txt https://www.github.developerdan.com/hosts/lists/amp-hosts-extended.txt https://raw.githubusercontent.com/lightswitch05/hosts/master/docs/lists/facebook-extended.txt https://www.github.developerdan.com/hosts/lists/facebook-extended.txt https://raw.githubusercontent.com/lightswitch05/hosts/master/docs/lists/hate-and-junk-extended.txt https://www.github.developerdan.com/hosts/lists/hate-and-junk-extended.txt http://phishing.mailscanner.info/phishing.bad.sites.conf https://raw.githubusercontent.com/stamparm/aux/master/maltrail-malware-domains.txt https://raw.githubusercontent.com/marktron/fakenews/master/fakenews https://raw.githubusercontent.com/StevenBlack/hosts/master/extensions/fakenews/hosts https://raw.githubusercontent.com/matomo-org/referrer-spam-list/master/spammers.txt 
https://raw.githubusercontent.com/notracking/hosts-blocklists/master/dnscrypt-proxy/dnscrypt-proxy.blacklist.txt https://dbl.oisd.nl/extra/ https://dbl.oisd.nl/ https://phishing.army/download/phishing_army_blocklist_extended.txt https://raw.githubusercontent.com/finnish-easylist-addition/finnish-easylist-addition/master/Finland_adb.txt https://raw.githubusercontent.com/PolishFiltersTeam/KADhosts/master/KADhosts.txt https://raw.githubusercontent.com/MajkiIT/polish-ads-filter/master/polish-pihole-filters/hostfile.txt https://raw.githubusercontent.com/lassekongo83/Frellwits-filter-lists/master/Frellwits-Swedish-Hosts-File.txt https://rescure.me/covid.txt https://rescure.me/rescure_domain_blacklist.txt https://raw.githubusercontent.com/StevenBlack/hosts/master/extensions/gambling/hosts https://raw.githubusercontent.com/StevenBlack/hosts/master/extensions/social/sinfonietta/hosts https://raw.githubusercontent.com/Perflyst/PiHoleBlocklist/master/SmartTV.txt https://v.firebog.net/hosts/static/w3kbl.txt https://raw.githubusercontent.com/crazy-max/WindowsSpyBlocker/master/data/hosts/extra.txt https://raw.githubusercontent.com/mypdns/porn-records/master/submit_here/adult.mypdns.cloud/domains.list https://raw.githubusercontent.com/mypdns/porn-records/master/submit_here/adult.mypdns.cloud/hosts.list https://raw.githubusercontent.com/mypdns/porn-records/master/submit_here/adult.mypdns.cloud/mobile.list https://raw.githubusercontent.com/mypdns/porn-records/master/submit_here/adult.mypdns.cloud/snuff.list https://raw.githubusercontent.com/mypdns/porn-records/master/submit_here/adult.mypdns.cloud/wildcard.list https://raw.githubusercontent.com/mypdns/porn-records/master/submit_here/adult.mypdns.cloud/wildcard.rpz-nsdname.list

Old/Defunct:
https://raw.githubusercontent.com/4skinSkywalker/Anti-Porn-HOSTS-File/master/HOSTS.txt (No updates since Feb 2021)
https://getblackbird.net/blacklist/hosts.txt (No updates since Oct 2020)
https://raw.githubusercontent.com/Sinfonietta/hostfiles/master/gambling-hosts (No updates since January 2020)
https://raw.githubusercontent.com/Sinfonietta/hostfiles/master/social-hosts (No updates since June 2019)
https://raw.githubusercontent.com/tiuxo/hosts/master/porn https://raw.githubusercontent.com/StevenBlack/hosts/master/extensions/porn/tiuxo/hosts (No updates since December 2020)

Not compatible with Pihole: https://data.netlab.360.com/feeds/dga/dga.txt https://feodotracker.abuse.ch/downloads/ipblocklist_aggressive.csv https://sslbl.abuse.ch/blacklist/sslipblacklist_aggressive.csv https://urlhaus.abuse.ch/downloads/text/ https://isc.sans.edu/api/threatlist/adscore?json https://isc.sans.edu/api/threatlist/alphastrike?json https://isc.sans.edu/api/threatlist/arbor?json https://www.binarydefense.com/banlist.txt https://isc.sans.edu/api/threatlist/blindferret?json https://lists.blocklist.de/lists/all.txt https://www.blocklist.de/downloads/export-ips_all.txt http://danger.rulez.sk/projects/bruteforceblocker/blist.php https://isc.sans.edu/api/threatlist/censys?json http://charles.the-haleys.org/ssh_dico_attack_hdeny_format.php/hostsdeny.txt https://isc.sans.edu/api/threatlist/ciarmy?json https://cinsscore.com/list/ci-badguys.txt http://www.ciarmy.com/list/ci-badguys.txt https://cybercrime-tracker.net/all.php https://isc.sans.edu/api/threatlist/cybergreen?json https://www.darklist.de/raw.php https://osint.digitalside.it/Threat-Intel/lists/latestips.txt https://rules.emergingthreats.net/fwrules/emerging-Block-IPs.txt https://rules.emergingthreats.net/blockrules/compromised-ips.txt https://isc.sans.edu/api/threatlist/erratasec?json https://raw.githubusercontent.com/firehol/blocklist-ipsets/master/firehol_level4.netset https://isc.sans.edu/api/threatlist/internetcensus?json https://isc.sans.edu/api/threatlist/ipip?json https://kriskintel.com/feeds/ktip_malicious_Ips.txt https://kriskintel.com/feeds/ktip_ransomware_feeds.txt http://malc0de.com/bl/BOOT https://myip.ms/files/blacklist/general/full_blacklist_database.zip https://isc.sans.edu/api/threatlist/onyphe?json https://openphish.com/feed.txt https://isc.sans.edu/api/threatlist/rapid7sonar?json https://isc.sans.edu/api/threatlist/recyber?json https://easylist-downloads.adblockplus.org/abpindo+easylist.txt https://raw.githubusercontent.com/List-KR/List-KR/master/filter.txt 
https://rescure.me/malware/clop.txt https://rescure.me/malware/ekans.txt https://rescure.me/rescure_blacklist.txt https://rescure.me/malware/maze.txt https://rescure.me/malware/netwalker.txt https://rescure.me/malware/ryuk.txt https://rescure.me/malware/sodinokibi.txt https://rescure.me/malware/try2cry.txt https://rescure.me/malware/wastedlocker.txt https://report.rutgers.edu/DROP/attackers https://isc.sans.edu/api/threatlist/scorecard?json https://www.shallalist.de/Downloads/shallalist.tar.gz https://isc.sans.edu/api/threatlist/shodan?json https://isc.sans.edu/api/threatlist/stretchoid?json https://threatcrowd.org/inc/blacklist.txt https://threatcrowd.org/feeds/ips.txt https://www.threatsourcing.com/dnall-free.txt https://www.threatsourcing.com/ipall-free.txt https://isc.sans.edu/api/threatlist/univmichigan?json https://isc.sans.edu/api/threatlist/univsydney?json https://router.unrealsec.eu/badips https://dsi.ut-capitole.fr/blacklists/download/blacklists.tar.gz https://tracker.h3x.eu/api/sites_1week.php https://raw.githubusercontent.com/mypdns/porn-records/master/submit_here/adult.mypdns.cloud/rpz-ip http://vxvault.net/URL_List.php

Duplicate and unneeded:
https://hosts.netlify.app/Xtra/domains.txt (badmojr)
https://adlist.vercel.app/Xtra/domains.txt (badmojr)
https://cdn.jsdelivr.net/gh/badmojr/1Hosts@latest/Xtra/domains.txt (badmojr)
https://o0.pages.dev/Xtra/domains.txt (badmojr)
https://hosts.anudeep.me/mirror/facebook.txt (anudeep)
https://raw.githubusercontent.com/StevenBlack/hosts/master/extensions/porn/clefspeare13/hosts (clefspeare13)

Unable to verify list quality:
https://infosharing.cybersaiyan.it/feeds/CS-PIHOLE
https://filtri-dns.ga/filtri.txt
https://myip.ms/files/bots/live_webcrawlers.txt (did not load)
https://isc.sans.edu/api/threatlist/netsystems?json
https://orca.pet/notonmyshift/domains.txt
http://taz.net.au/Mail/SpamDomains
https://threatcrowd.org/feeds/domains.txt
https://assets.windscribe.com/custom_blocklists/ads.txt
https://assets.windscribe.com/custom_blocklists/porn.txt
https://assets.windscribe.com/custom_blocklists/clickbait.txt
https://assets.windscribe.com/custom_blocklists/gambling.txt
https://assets.windscribe.com/custom_blocklists/iot.txt
https://assets.windscribe.com/custom_blocklists/malware.txt (Unable to load)
https://raw.githubusercontent.com/crazy-max/WindowsSpyBlocker/master/data/hosts/update.txt (Why block windows from updating?)
https://paste.cryptolaemus.com/feed.xml https://paste.cryptolaemus.com/dbucket/ (Both do not load)

T145 commented 3 years ago

"Will take me days" - lol

I'm guessing you used a script to do this, which I'd be interested in seeing. Measuring list redundancy is on my TODO list.
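One simple way to put a number on redundancy, offered only as a sketch: treat each list as a set of domains and compute what fraction of one list is already covered by another. The helper names `parse_domains` and `overlap` and the sample data are hypothetical, and none of this is the project's actual tooling.

```python
def parse_domains(text: str) -> set[str]:
    """Extract domains from a domains-only or hosts-style list."""
    domains = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        # A hosts line looks like "0.0.0.0 domain"; a plain list has one field.
        # Simplification: only the last field on each line is kept.
        domains.add(fields[-1].lower())
    return domains


def overlap(a: set[str], b: set[str]) -> float:
    """Fraction of list a's domains that also appear in list b."""
    return len(a & b) / len(a) if a else 0.0


# Hypothetical sample lists in the two common formats.
hosts_style = "0.0.0.0 ads.example.com\n0.0.0.0 tracker.example.net\n"
plain_style = "# demo list\nads.example.com\nother.example.org\n"

a = parse_domains(hosts_style)
b = parse_domains(plain_style)
print(f"{overlap(a, b):.0%} of the first list is already in the second")
```

A value close to 1.0 in one direction suggests the first list adds little on top of the second. The measure is not symmetric, so it is worth checking both directions.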

Regarding the findings:

pallebone commented 3 years ago

Actually I checked each list manually by loading it in Firefox and checking the dates etc :)

Also I noticed a spelling error on the last working entry. It should be https://raw.githubusercontent.com/mypdns/porn-records/master/submit_here/adult.mypdns.cloud/wildcard.rpz-nsdname

Old lists - Sure you can use old lists. I'm just flagging that up :)

Lists that did not load - Not sure why I cannot load them. Maybe a temporary issue?

Win10 blocking - No worries. Understood :)

Thanks again :)

T145 commented 3 years ago

Np; I've fixed the rpz source. It seems any source that contains raw domain filters exclusively is compatible. I may be able to tweak the readme update script to just print out a single line containing source URLs that work, since the pretty gist rendering doesn't work on GH READMEs for whatever reason. I'll update the README w/ your findings and consider this closed.
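A rough check for the "raw domain filters only" property mentioned above might look like the following. The regular expression is a simplified hostname pattern (it rejects punycode TLDs, for example) and is not Pi-hole's own validation. `looks_domain_only` is a hypothetical helper, not part of this repo.

```python
import re

# Simplified hostname pattern: dot-separated labels of letters, digits and
# hyphens, ending in an alphabetic TLD. An assumption, not Pi-hole's parser.
DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$",
    re.IGNORECASE,
)


def looks_domain_only(text: str) -> bool:
    """True if every non-blank, non-comment line is a bare domain name."""
    lines = [line.strip() for line in text.splitlines()]
    entries = [line for line in lines if line and not line.startswith(("#", "!"))]
    return bool(entries) and all(DOMAIN_RE.match(line) for line in entries)


# Hypothetical samples: a plain domain list, an IP list, and an adblock rule.
print(looks_domain_only("# demo\nads.example.com\ntracker.example.net\n"))
print(looks_domain_only("192.0.2.1\n203.0.113.7\n"))
print(looks_domain_only("||ads.example.com^\n"))
```

A check like this only covers domain-only lists. Several entries in the "Working" group above are hosts files, so a fuller test would also need to accept the `0.0.0.0 domain` form.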

Btw if you'd like to open an issue to review https://github.com/pallebone/PersonalPiholeListsPAllebone I could add in any good sources.

pallebone commented 3 years ago

No problem. Thank you for your time.

The lists on that page are not maintained by me. I just found people who maintain lists and tried them out, and I list any good ones on that GitHub page so that others can also try them, that's all :)

Hope that makes sense.

If I misunderstood what you are asking please let me know. Happy to be corrected if I didn't understand :)

Pete

T145 commented 3 years ago

I just mean that those sources can be reviewed and added if found to be unique. As you can see I have some similar issues opened already.

I believe any source that contains a single domain filter using the mawk engine is Pihole compatible, so I'll write a release step to generate a pihole-compatible text file at some point.

Feel free to drop a ⭐ btw 😉

pallebone commented 3 years ago

Ah sorry. Ok. Yes some are unique. I would have to check through.

I have added a star :)

T145 commented 3 years ago

Your list will drop on the next build under sources.pihole. Enjoy!