Closed devipasigner closed 1 year ago
.lol .one .online
^ From yokoffing filterlists
.email .sex .sexy .recipes .xxx .yandex .zone .software
wouldn't .xxx have too many false positives?
Thank you, I will check later which ones lead to many false positives and which ones do not.
*.xxx is probably more for the personal block list.
Result of the occurrence of the TLDs on the Umbrella Toplist (contains no malicious domains):
TLD | on toplist |
---|---|
sexy | 3 |
recipes | 8 |
yandex | 13 |
ci | 20 |
sex | 35 |
agency | 50 |
software | 54 |
win | 157 |
email | 163 |
xxx | 177 |
lol | 220 |
zone | 223 |
shop | 338 |
fun | 351 |
one | 558 |
live | 904 |
online | 981 |
link | 1241 |
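A count like the one in the table could be produced with a few lines of scripting. The following is only a rough sketch: it assumes the toplist has already been loaded as a list of plain domain names, and `count_tlds` and the sample domains are made up for illustration (they are not from the real toplist).

```python
from collections import Counter


def count_tlds(domains, tlds_of_interest):
    """Count how many domains in `domains` end in each TLD of interest."""
    wanted = {t.lower().lstrip(".") for t in tlds_of_interest}
    counts = Counter({t: 0 for t in wanted})
    for domain in domains:
        # The TLD is the last dot-separated label of the hostname.
        tld = domain.strip().lower().rstrip(".").rsplit(".", 1)[-1]
        if tld in wanted:
            counts[tld] += 1
    return counts


# Hypothetical sample, not taken from the real toplist.
sample = ["example.com", "shop.example.online", "foo.link", "bar.link", "baz.xxx"]
tld_counts = count_tlds(sample, [".link", ".online", ".xxx", ".sexy"])
for tld, n in sorted(tld_counts.items(), key=lambda kv: kv[1]):
    print(f"{tld} | {n} |")
```

A TLD with a high count on a toplist of popular domains is more likely to cause false positives when blocked wholesale.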
I think one, live, online, and link should not be added.
@devipasigner @bestplayerbot: What do you think?
Have already added the "safe" ones:
|*.sexy^
|*.recipes^
|*.ci^
|*.sex^
|*.agency^
|*.software^
|*.win^
|*.email^
Thank you @hagezi
All but 2 of the excluded TLDs are from @yokoffing. I haven't experienced any false positives, and it has been semi-effective, but it would probably be best to get an answer from @yokoffing.
I've experienced some false positives with .link and .fun
Result of the occurrence of the TLDs on the Umbrella Toplist
@hagezi This is helpful! Thank you.
@devipasigner You can see known false positives in my filter list version of TLD protection and in Dandelion's malware list
I don't recommend blocking all these TLDs at the DNS level, but I have them there for folks who would rather tinker with setup on a regular basis. It's also why the TLD list in my NextDNS repo is split in two, whereas the filter list will warn you first but still allow the user to navigate to the site.
one, live, online, and link should not be added.
Spamhaus says one is 3.6% bad and online 2.6% bad, whereas live is 25% bad. link is 12.5%. (For reference, .com is 1.7%.)
- I'd remove one and online from hard-blocking at the DNS level.
- I'd keep live as a hard-block, but there are known false positives.
- I'd leave link out of hard-blocking, but I may leave it on a filterlist to warn the user. I lean towards removal, though: it's starting to be used more.
@devipasigner
.email .sex .sexy .recipes .xxx .yandex .zone .software
Spamhaus is not everything, but one tool to take into consideration:
TLD | Spamhaus % Bad Domains |
---|---|
com (reference) | 1.7% |
org (reference) | 1.2% |
email | 4% |
sex | 0% |
sexy | 0% |
recipes | 0% |
xxx | 0% |
yandex | 0% |
zone | 11.1% |
software | 5.8% |
I went through my filterlist and purged TLDs that are less than 10% bad according to Spamhaus https://github.com/yokoffing/filterlists/pull/39/commits/5d980ece1d4c936c6108acd848941e49f7c7b981. Here's what was left:
||asia^$doc
||beauty^$doc
||cn^$doc
||degree^$doc
||fit^$doc
||fyi^$doc
||garden^$doc
||live^$doc,domain=~marcello.live|~notgoogle.live
||quest^$doc
||su^$doc,domain=~kaihat.su
||shop^$doc,domain=~nsverify.shop
||surf^$doc
||zone^$doc
This cuts my list down from 43 entries to 13 entries. Note the exceptions for live, su, and shop. These may not be suitable for DNS-level blocking.
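As a rough illustration of the 10% rule, here is a minimal sketch that filters TLDs by their Spamhaus percentage. The figures are the ones quoted in this thread; the function name `purge_below` and the dictionary layout are assumptions for the example, not part of any actual tooling.

```python
# Spamhaus "% bad" figures quoted in this thread (percent).
SPAMHAUS_BAD_PCT = {
    "com": 1.7, "org": 1.2, "email": 4.0, "sex": 0.0, "sexy": 0.0,
    "recipes": 0.0, "xxx": 0.0, "yandex": 0.0, "zone": 11.1,
    "software": 5.8, "one": 3.6, "online": 2.6, "live": 25.0, "link": 12.5,
}


def purge_below(bad_pct, threshold=10.0):
    """Keep only TLDs whose share of bad domains is at least `threshold`."""
    return sorted(tld for tld, pct in bad_pct.items() if pct >= threshold)


kept = purge_below(SPAMHAUS_BAD_PCT)
print(kept)
```

The threshold is only a first pass; the lists in this thread also reflect manual judgment about known false positives.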
My filterlist doesn't include Dandelion's list.
The Dandelion TLDs that don't have a bunch of exceptions are:
||agency^$doc,domain=~battlefield.agency|~shortcut.agency
||bid^$doc
||cfd^$doc
||discount^$doc
||gdn^$doc
||loan^$doc
||ooo^$doc
||sbs^$doc
Note the exceptions for agency. These may not be suitable for DNS-level blocking. Some of Dandelion's TLDs fall under 10% abused, but he has researched sites beyond this basic rubric I've imposed, so I'd keep his entries.
Here's the combined list of what's left from hagezi + mine + Dandelion's. We could use this merged list as the foundation for TLD blocking (with or without the four TLDs below with site exceptions), then introduce other metrics/research to justify what other TLDs should be blocked.
||agency^$doc,domain=~battlefield.agency|~shortcut.agency
||asia^$doc
||beauty^$doc
||bid^$doc
||cfd^$doc
||cn^$doc
||degree^$doc
||discount^$doc
||fit^$doc
||fyi^$doc
||garden^$doc
||gdn^$doc
||live^$doc,domain=~marcello.live|~notgoogle.live
||loan^$doc
||ooo^$doc
||quest^$doc
||sbs^$doc
||shop^$doc,domain=~nsverify.shop
||su^$doc,domain=~kaihat.su
||surf^$doc
||zone^$doc
The following then need justification:
||associates^$doc
||bar^$doc
||best^$doc
||buzz^$doc
||cam^$doc,domain=~halide.cam
||casa^$doc
||ci^$doc
||cricket^$doc
||cyou^$doc
||date^$doc
||fun^$doc,domain=~libgen.fun|~gaggle.fun|~neal.fun
||icu^$doc
||kp^$doc
||link^$doc,domain=~reddit.app.link|~unlocked.link
||loans^$doc
||lol^$doc,domain=~kissanime.lol|~url.lol
||one^$doc,domain=~ablaze.one
||online^$doc,domain=~ero-labs.online
||recipes^$doc
||rest^$doc
||review^$doc
||ru^$doc,domain=~aliexpress.ru|~yandex.ru
||sex^$doc
||sexy^$doc
||software^$doc
||tokyo^$doc
||wang^$doc
||webcam^$doc
||win^$doc
||work^$doc,domain=~searx.work
||xxx^$doc
We could potentially rule out three (https://github.com/DandelionSprout/adfilt/issues/659#issuecomment-1284845803):
- .associates: A fair bit of use among US law firms. Can't be blocked.
- .rest: Oddly appears to be used by some restaurants. Can't be blocked at the moment.
- .webcam: Rare cases of use by European road services.
Awesome, there are a few entries that are constantly being used to spam my emails with phishing that need to be added though. I will gather them and send to review
@yokoffing Thanks for your work. Exceptions cannot be handled in DNS rules. I would then have to unblock the corresponding domain with all subdomains.
Maybe we should take the list of @yokoffing as a common list. I would then parse it and convert it to |*.tld^ to fit the rules for DNS. However, I would not like to lock out whole countries like cn and ru.
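To illustrate why exceptions complicate TLD blocking at the DNS level, here is a minimal sketch of the matching logic a resolver would need: block everything under a TLD, unless the name is an excepted domain or one of its subdomains. This is not how any particular DNS blocker is implemented; `is_blocked` and its parameters are hypothetical names.

```python
def is_blocked(hostname, blocked_tlds, exceptions=frozenset()):
    """Return True if `hostname` falls under a blocked TLD and is not excepted."""
    host = hostname.lower().rstrip(".")
    tld = host.rsplit(".", 1)[-1]
    if tld not in blocked_tlds:
        return False
    # An exception covers the domain itself and all of its subdomains.
    for allowed in exceptions:
        if host == allowed or host.endswith("." + allowed):
            return False
    return True


blocked = {"agency"}
allowed = {"battlefield.agency", "shortcut.agency"}
for name in ["evil.agency", "www.battlefield.agency", "example.com"]:
    print(name, is_blocked(name, blocked, allowed))
```

Every false positive has to be added to the exception set by the list maintainer, which is the maintenance burden being discussed here.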
Exceptions cannot be handled in DNS rules. I would then have to unblock the corresponding domain with all subdomains.
@hagezi Precisely. We need to be very selective as to what get blocked at the DNS-level.
A filterlist can provide a warning before navigation and still allow navigation in uBlock Origin (not sure about AdGuard); hard-blocking at the DNS-level can't. User-friendliness dictates that we are more relaxed at the DNS level (your list) and possibly(!) stricter at adblock/filterlist level (my filterlist + Dandelion's malware list). So, as far as those four TLDs with exceptions/false positives in the combined list, I wouldn't include at all at the DNS level, personally. But that's your call.
I would not like to lock out whole countries like cn and ru.
They should be removed at the DNS-level. Users can add to their personal list if they want to block them.
- asia is in the combined list and is used by very few sites. 10.8% bad. I'll keep that one.
- cn is on my filterlist. I haven't had any complaints so far, but I will allowlist it for now.
Referencing the combined list from earlier: if we remove those four TLDs that had exceptions, and remove cn, that leaves us with:
||asia^$doc
||beauty^$doc
||bid^$doc
||cfd^$doc
||degree^$doc
||discount^$doc
||fit^$doc
||fyi^$doc
||garden^$doc
||gdn^$doc
||loan^$doc
||ooo^$doc
||quest^$doc
||sbs^$doc
||surf^$doc
||zone^$doc
I left asia in there for now. Perhaps we should allow it at the DNS level while I keep it blocked at the filterlist level. Let me know your thoughts.
Once we agree on a common list, we can look at the 'needs justification' list.
For anyone wondering: I haven't forgotten about my NextDNS repo. I just want to wait and clean that up after we get this sorted.
Went through my emails with over 500 pages of spam and phishing and the most abused by far are:
.ml
.sbs
.cf
.site
.online
.gq
.ga
.tk
.top
.in
.fun (a bunch of nsfw phishing redirects from google docs pdf)
.xyz (too many false positives)
Personally I think these must be included
@devipasigner Thanks for doing that!
Good
* `.sbs` is already included in our combined list above.
Too many false positives
* `.top` Too many false positives and probably more: https://github.com/DandelionSprout/adfilt/blob/2cbcfff6e62f6fff817a64390810188ab8903c08/Dandelion%20Sprout's%20Anti-Malware%20List.txt#L24
* same for `.ml` https://github.com/DandelionSprout/adfilt/blob/2cbcfff6e62f6fff817a64390810188ab8903c08/Dandelion%20Sprout's%20Anti-Malware%20List.txt#L15-L16
* and `.tk` https://github.com/DandelionSprout/adfilt/blob/2cbcfff6e62f6fff817a64390810188ab8903c08/Dandelion%20Sprout's%20Anti-Malware%20List.txt#L11-L12
Possibly appropriate for soft-blocking by filter list, but not hard-blocking via DNS:
* `.online` One false positive noted but there may be others after we research further. I do recall phishing that used this TLD last year. It is used legitimately for manga and video streaming sites, so I wouldn't block it entirely.
* `.fun` also has legitimate uses https://github.com/yokoffing/filterlists/blob/8cd755ed08bd300222b37d6f8b71b492fe6eae36/enhanced_site_protection.txt#L25
* `.cf`, `.ga`, `.gq` are on my hardened NextDNS list and they break third-party video streaming sites and have other unintended breakage (subrequests on pages). https://github.com/DandelionSprout/adfilt/blob/2cbcfff6e62f6fff817a64390810188ab8903c08/Dandelion%20Sprout's%20Anti-Malware%20List.txt#L14-L20
Probably not appropriate but unsure
* `.site` has legitimate uses (Google, Fmovies, sports, etc.)
* `.in` is the domain for India. @hagezi doesn't want to block country domains.
Let me know your thoughts.
I moved `.online` and `.fun` back to my filterlist to soft-block, and added `.in` as a new entry. https://github.com/yokoffing/filterlists/blob/0cf52963af4ae322771d30620bb3542b1c7813c2/enhanced_site_protection.txt
`.cf`, `.ga`, `.gq` are blocked by Dandelion's Malware list and have exceptions already there. I wouldn't block these at DNS-level either, though.
Hello all! Just tuning in here to share some ideas. Perhaps there could be 2 versions of the list, a strict one and a more balanced one, kind of like the system @yokoffing already has going on, but perhaps more optimized. They would probably be called 'light' and 'pro'. Why? Sure, there may be some false positives caused by these TLDs, but that's the nature of TLD and regex blocking. Of course we should still offer a balanced one. But some of the TLDs with a few false positives are used by sites not many people will probably ever visit, while blocking them catches a lot of phishing sites. More good than bad. Personally, I've never got a hit with any of the 'balanced' ones, but got many hits with the ones reported here that have a minuscule amount of false positives.
Hey! I'm just joining in, but from what I see, a lot of those listed for only soft-blocking have very minimal false positives. Lots of those are used for malware redirects, especially in emails and streaming sites, and I think it is needed. Now, like I wrote above, the 2 solutions would be:
1. Make 2 different lists, one light, one 'aggressive' (but I wouldn't even call it that).
2. Make one big list with false positives constantly whitelisted using top lists and Dandelion's small selection already made.
@yokoffing Thank you for the feedback. Personally, I think these entries are very much needed despite having a couple of false positives. There are millions of phishing domains using these TLDs and only a couple of legitimate domains using them that a regular user will ever visit. I think there are just way too many malware and phishing domains using these TLDs to allow them, even if it means whitelisting 1/2 domains. Let me know what you think; I simply think the good far outweighs the bad. @Dynasty-Dev proposed a pretty good solution; however, I really don't think the two lists would be too far from each other in terms of false positives/aggressiveness.
- Make 2 different lists, one light, one 'aggressive' but I wouldn't even call it that.
- Make one big list with false positives constantly whitelisted using top lists and dandelions small selection already made
@Dynasty-Dev That's already happening at the filterlist level. But this cannot be done at the DNS-level. You have to manually allowlist all the false positives -- which is fine for a personal setup, but impractical when you're making a list for many people. You're going to run into breakage.
Personally, I think these entries are very much needed despite having a couple of false positives. There are millions of phishing domains using these TLDs and only a couple of legitimate domains using them that a regular user will ever visit.
@devipasigner I just want to avoid a scenario where someone is having to allowlist something every week. I may make some concessions and get feedback from users over time. Let me think on it.
Adblocking side isn't too bad with uBlock Origin since you can bypass the block and report the false positive later. It's DNS blocking that can be very frustrating.
So friends, now you've lost me. I'm old! :D Before I completely lose track of what stays on the TLD list and what doesn't, I'll sleep on it. :)
I understand! And I respect and love all the work you've done. Almost every NextDNS user has a well-balanced profile because of you (I personally use it too). However, I think the number of false positives from these TLDs is being greatly exaggerated. I've personally used Hagezi's TLD blocklist plus my own entries for 2 different households for over a year (I had my own TLD entries before Hagezi's list and yours existed) and I haven't gotten any false positive reports from them (only from blocklists). I personally use a lot of those free streaming sites for movies and sports (NBA, football) too. As for using browser extensions, I agree they should be applied everywhere; however, there are scenarios where they won't help, for example email apps on mobile (a huge source of phishing and spam). Now the average techie will know it's a scam; however, the same can't be said about kids, wives, grandparents.
I think there are just way too many malware and phishing domains using these TLDs to allow them, even if it means whitelisting 1/2 domains.
@devipasigner That sounds amazing --- until you're the list maintainer 🙃
Now the average techie will know it's a scam; however, the same can't be said about kids, wives, grandparents.
@Dynasty-Dev Those are the ones that are the loudest when a random site doesn't work
Went through my emails with over 500 pages of spam and phishing and the most abused by far are:
.ml .sbs .cf .site .online .gq .ga .tk .top .in .fun (a bunch of nsfw phishing redirects from google docs pdf)
We'll give it a go.
I have added/restored the ones listed above that are not already in Dandelion's list, with the exceptions that I'm aware of https://github.com/yokoffing/filterlists/pull/39/commits/168e83d6fe1797d693f582beb600d83348f435b9. Edit: Accidentally added .top and have now removed it, since Dandelion actively covers that https://github.com/yokoffing/filterlists/pull/39/commits/a02e70dc0a0317514271384f3c5c5947cf69a51b
This will block top-site navigations and not break sub-requests. Unlike blocking these at the DNS level, I don't anticipate many false positives.
Pull Request: https://github.com/yokoffing/filterlists/pull/39
yokoffing Enhanced Protection List: https://github.com/yokoffing/filterlists/blob/main/enhanced_site_protection.txt
Dandelion Sprout's Anti-Malware List: https://github.com/DandelionSprout/adfilt/blob/master/Dandelion%20Sprout's%20Anti-Malware%20List.txt
When we have the space to discuss the rest of these domains, I'll push the pull request through. Keep an eye on it until then.
Found a false positive for *.fun yesterday https://github.com/yokoffing/filterlists/pull/39#issuecomment-1371689096
So, I might do the following with my list:
I will take the top 10 Spamhaus TLDs as before, meaning every TLD that has been in the top 10 since the list was created will end up on the list. Exceptions below.
I will additionally take over the TLDs from yokoffing by parsing his list. I will implement the exceptions with the denyallow modifier, for example: |*.agency^$denyallow=battlefield.agency|shortcut.agency
Furthermore TLDs can be added manually of course.
I will exclude the following TLDs from the overall list:
Country specific TLDs, like:
cn
ru
in
co
uk
de
Other TLDs:
info
com
net
org
io
me
xyz
What do you think about such a solution?
Looks good. Here's what the 'combined list' looks like now (mine + Dande + @devipasigner's research on what should be restored), with comments. Please review: https://github.com/yokoffing/filterlists/blob/cd71a3ce16cad7717880394d2fa9e7d41711aa26/enhanced_site_protection.txt#L9-L62
@hagezi Now, here's the combined list with:
1) comment lines removed
2) cn, ru, and in removed
3) alphabetized
||agency^$doc,domain=~battlefield.agency|~shortcut.agency
||asia^$doc
||beauty^$doc
||bid^$doc
||cf^$doc,domain=~google.cf|~rths.cf|~voitures.cf|~assembleenationale-rca.cf|~cps-rca.cf|~acap.cf|~miraculousladybug.cf|~scrat.cf
||cfd^$doc
||degree^$doc
||discount^$doc
||fit^$doc
||fun^$doc,domain=~libgen.fun|~gaggle.fun|~neal.fun|~bestgore.fun
||fyi^$doc
||ga^$doc,domain=~google.ga|~filtri-dns.ga|~dgdi.ga|~voitures.ga|~economie-gabon.ga|~9191.ga|~animevsub.ga
||garden^$doc
||gdn^$doc
||gq^$doc,domain=~deimos.gq|~inege.gq|~tvgelive.gq|~comprarcarros.gq
||live^$doc,domain=~marcello.live
||loan^$doc
||ml^$doc,domain=~google.ml|~mobili.ml|~melody.ml|~dcod.ml|~info-matin.ml|~amap.ml|~mastodon.ml|~worproject.ml|~nothingprivate.ml|~lingva.ml|~lemmy.ml|~bittor.ml|~noic.ml|~beatbump.ml|~gymlibrary.ml|~animevsub.ml|~prompt.ml|~biblioreads.ml
||monster^$doc,domain=~egybest.monster|~yts.monster|~cloudcdn.monster|~fedi.monster
||online^$doc,domain=~ero-labs.online|~amedia.online|~allhen.online|~chainsaw-man-manga.online|~manga1st.online
||ooo^$doc
||pw^$doc,domain=~libgen.pw|~petridish.pw|~palaugov.pw|~dpc.pw|~zikrap.pw|~demonoid.pw|~bittor.pw|~buttercup.pw|~rezka.pw|~darkcrystal.pw|~xor.pw|~fullhdfilmizlesene.pw|~b00k.pw|~gopass.pw|~vost.pw
||quest^$doc
||sbs^$doc
||shop^$doc,domain=~nsverify.shop
||site^$doc,domain=~business.site|~anitube.site|~wuxiaworld.site|~notube.site|~fmoviesto.site|~secreto.site|~metrolagu.site|~betphoenix.site|~cdmstudy.site
||su^$doc,domain=~kaihat.su
||surf^$doc
||tk^$doc,domain=~coolcmd.tk|~budterence.tk|~google.tk|~transportnews.tk|~c0d3c.tk|~anonytext.tk|~tokelau-info.tk|~fakaofo.tk|~loljp-wiki.tk|~ninetail.tk|~goshujin.tk|~graph.tk|~dls2.pokeacer.tk|~nolfrevival.tk|~coppersurfer.tk|~restricted-functions.tk|~bstweaker.tk|~nbd-media.tk|~glypx-pakhsh-nakon.tk|~gotofap.tk|~somepythonthings.tk
||top^$doc,domain=~corriente.top|~gdtot.top|~nicenature.top|~reminder.top|~magocoro.top|~castlevania.top|~suiten.top|~shucks.top|~1stream.top|~ambr.top|~techblog.top|~changlam10.top|~changlam11.top|~pdcdn1.top|~mastodon.top|~pressplay.top
||zone^$doc
Thanks @yokoffing, looks good!
DNS Syntax AdGuard Home, add to custom filtering rules for testing:
|*.agency^$denyallow=battlefield.agency|shortcut.agency
|*.asia^
|*.beauty^
|*.bid^
|*.cf^$denyallow=google.cf|rths.cf|voitures.cf|assembleenationale-rca.cf|cps-rca.cf|acap.cf|miraculousladybug.cf|scrat.cf
|*.cfd^
|*.degree^
|*.discount^
|*.fit^
|*.fun^$denyallow=libgen.fun|gaggle.fun|neal.fun|bestgore.fun
|*.fyi^
|*.ga^$denyallow=google.ga|filtri-dns.ga|dgdi.ga|voitures.ga|economie-gabon.ga|9191.ga|animevsub.ga
|*.garden^
|*.gdn^
|*.gq^$denyallow=deimos.gq|inege.gq|tvgelive.gq|comprarcarros.gq
|*.live^$denyallow=marcello.live
|*.loan^
|*.ml^$denyallow=google.ml|mobili.ml|melody.ml|dcod.ml|info-matin.ml|amap.ml|mastodon.ml|worproject.ml|nothingprivate.ml|lingva.ml|lemmy.ml|bittor.ml|noic.ml|beatbump.ml|gymlibrary.ml|animevsub.ml|prompt.ml|biblioreads.ml
|*.monster^$denyallow=egybest.monster|yts.monster|cloudcdn.monster|fedi.monster
|*.online^$denyallow=ero-labs.online|amedia.online|allhen.online|chainsaw-man-manga.online|manga1st.online
|*.ooo^
|*.pw^$denyallow=libgen.pw|petridish.pw|palaugov.pw|dpc.pw|zikrap.pw|demonoid.pw|bittor.pw|buttercup.pw|rezka.pw|darkcrystal.pw|xor.pw|fullhdfilmizlesene.pw|b00k.pw|gopass.pw|vost.pw
|*.quest^
|*.sbs^
|*.shop^$denyallow=nsverify.shop
|*.site^$denyallow=business.site|anitube.site|wuxiaworld.site|notube.site|fmoviesto.site|secreto.site|metrolagu.site|betphoenix.site|cdmstudy.site
|*.su^$denyallow=kaihat.su
|*.surf^
|*.tk^$denyallow=coolcmd.tk|budterence.tk|google.tk|transportnews.tk|c0d3c.tk|anonytext.tk|tokelau-info.tk|fakaofo.tk|loljp-wiki.tk|ninetail.tk|goshujin.tk|graph.tk|dls2.pokeacer.tk|nolfrevival.tk|coppersurfer.tk|restricted-functions.tk|bstweaker.tk|nbd-media.tk|glypx-pakhsh-nakon.tk|gotofap.tk|somepythonthings.tk
|*.top^$denyallow=corriente.top|gdtot.top|nicenature.top|reminder.top|magocoro.top|castlevania.top|suiten.top|shucks.top|1stream.top|ambr.top|techblog.top|changlam10.top|changlam11.top|pdcdn1.top|mastodon.top|pressplay.top
|*.zone^
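The conversion from the filterlist syntax to the DNS syntax shown above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual script used for the list: `to_dns_rule`, `RULE_RE`, and the `excluded` parameter are hypothetical names, and the regular expression only handles the `||tld^$doc` and `||tld^$doc,domain=~...` forms that appear in this thread.

```python
import re

# Matches rules of the form ||tld^$doc or ||tld^$doc,domain=~a.tld|~b.tld
RULE_RE = re.compile(r"^\|\|([a-z0-9-]+)\^\$doc(?:,domain=(.+))?$")


def to_dns_rule(adblock_rule, excluded=frozenset()):
    """Convert one filterlist TLD rule to the |*.tld^ DNS form.

    Returns None for lines that do not match or whose TLD is excluded.
    """
    match = RULE_RE.match(adblock_rule.strip())
    if match is None:
        return None
    tld, domains = match.group(1), match.group(2)
    if tld in excluded:
        return None
    rule = f"|*.{tld}^"
    if domains:
        # "~example.tld" entries become plain names in $denyallow.
        allowed = [d.lstrip("~") for d in domains.split("|") if d.startswith("~")]
        if allowed:
            rule += "$denyallow=" + "|".join(allowed)
    return rule


source_rules = [
    "||agency^$doc,domain=~battlefield.agency|~shortcut.agency",
    "||asia^$doc",
    "||cn^$doc",
    "! a comment line",
]
converted = [
    rule
    for rule in (to_dns_rule(line, excluded={"cn", "ru", "in"}) for line in source_rules)
    if rule
]
print("\n".join(converted))
```

Comment lines and excluded country TLDs are dropped, and everything else maps one-to-one onto a DNS rule.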
@yokoffing New exception for pw: core.pw
Exception for pw: core.pw
@hagezi Thanks for the heads up. What is that? I can't get the site to load.
Done. pw is in Dande's list. Open a pull request with him.
How do we feel about restoring .loans (plural) to the list since we're already blocking .loan (singular)?
@hagezi Thanks for the heads up. What is that? I can't get the site to load.
Needed for Wargaming Games
I've further refined the Spamhaus section. The 10% rule was just a starting point. Spamhaus is not the Bible, and they are not the "be all, end all" of the cybersecurity world, but I like to reference their metrics. Lastly, note that this isn't touching the TLDs provided by Dandelion and @devipasigner.
I took into account whether these TLDs receive low or high traffic and whether the TLDs are used by few or many sites. Then I visited popular (if any were in the top 100k or 1M sites) and random sites to see firsthand if we should bother to keep them.
Total of 3 or 4 = stay
Total of 1 or 2 = remove
TLD | Total | Spamhaus 10 Most Abused TLDs | Used by low-traffic or low-quality sites | Few Domains with TLD in existence | Many scammy rando sites (random searches) |
---|---|---|---|---|---|
asia | 0 | | | | |
beauty | 4 | x | x | x | x |
degree | 3 | x | x | x | |
fit | 3 | x | x | x | |
fyi | 1 | x | | | |
garden | 1 | x | | | |
live | 1 | x | | | |
quest | 3 | x | x | x | |
shop | 1 | x | | | |
su | 1 | x | | | |
surf | 2 | x | x | | |
zone | 3 | x | x | x | |
- .surf is an exception to the rule: Spamhaus has it at 326/421 bad domains, so I'll take their word for it for now.
- .garden was only 0.9% bad on Spamhaus, and every website I checked looked fine. I must have missed that one in the original purge.
- .shop was only 8.9% bad on Spamhaus. I must have also missed that one in the original purge.
@hagezi Therefore, we can remove the following:
asia
fyi
garden
live
shop
su
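The scoring rubric can be written down as a small sketch. The totals are copied from the table above; the function name `split_by_score` and the `MANUAL_KEEP` set are hypothetical, and treating a total of 0 (asia) the same as 1 or 2 is an assumption, since the rule as stated only covers totals of 1 to 4.

```python
# Criteria met per TLD, copied from the "Total" column of the table above.
SCORES = {
    "asia": 0, "beauty": 4, "degree": 3, "fit": 3, "fyi": 1, "garden": 1,
    "live": 1, "quest": 3, "shop": 1, "su": 1, "surf": 2, "zone": 3,
}
# .surf is kept despite its score, per the note on its Spamhaus numbers.
MANUAL_KEEP = {"surf"}


def split_by_score(scores, keep_at=3, manual_keep=frozenset()):
    """Split TLDs into (keep, remove) lists using the 3-or-4 = stay rule."""
    keep = sorted(t for t, s in scores.items() if s >= keep_at or t in manual_keep)
    remove = sorted(t for t in scores if t not in keep)
    return keep, remove


keep, remove = split_by_score(SCORES, manual_keep=MANUAL_KEEP)
print("keep:  ", keep)
print("remove:", remove)
```

With these inputs the removal list matches the six TLDs listed above.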
For the Spamhaus TLDs that remain, here are the new false positives I encountered while checking these manually:
||beauty^$doc,domain=~vipbj.beauty
||degree^$doc,domain=~opf.degree|~three60.degree
||fit^$doc,domain=~appetit.fit|~clubb.fit|~pridegym.fit|~justget.fit
||quest^$doc,domain=~0x00.quest|~prednisonetab.quest
||surf^$doc,domain=~surfstation.surf|~kayaking.surf|~quran.surf|~s-wings.surf
||zone^$doc,domain=~typinggames.zone|~martech.zone|~kinogo.zone|~kidtopia.zone|~itsm.zone
Wow, great work!
Why is *.kp blocked? It's the TLD for North Korea.
This will no longer be the case in the new list. When @yokoffing is finished with his list, I will parse it daily and convert it into DNS format. I will exclude country-specific TLDs. I will also not add any entries manually, yokoffing's list will be the only basis.
New list is live; https://raw.githubusercontent.com/hagezi/dns-blocklists/main/adblock/spam-tlds.txt
Merged lists: Spamhaus Top10, @yokoffing @DandelionSprout
Excluded TLDs: .cn .ru .in
Please check ...
Yay, thanks! Just FYI:
|*.xxx^
Blocks some "legit" sites (i.e. https://www.virustotal.com/gui/url/d587c31002021e6c3c081f137f3a4b9dd28514501d37c26bec4ea58e38f7a1fe/detection). Not sites that I'd visit, but some people might complain.
@hagezi Run the script again. I just pushed a pull request from yesterday that comments out TLDs we haven't discussed (e.g., .icu, and to @iam-py-test's point, .xxx). The version you have did not do this.
This will provide us a nice core, foundation list 😄
Thanks @yokoffing, updated, Please check ...
https://github.com/hagezi/dns-blocklists/commit/9b5984e22debc48f0b903d80faf14c3302fd7509
@hagezi .yandex is on there. I just found that one and removed it from my list. That was my bad. Otherwise, the list is beautiful!
@iam-py-test Thanks for joining in! I've been wanting to ask you: Can you explain more on why you block .win and .cricket in your list? Thank you ahead of time for your insight.
Thanks @yokoffing, updated. Thanks for joining @iam-py-test!
@hagezi I know syntax will change based on the format of your list (e.g., domain, hosts, etc.), but for the Adblock list, what is the reasoning to change from ||tld^$doc,domain= to |*.tld^$denyallow=? The latter can't be recognized by uBlock Origin.
@iam-py-test Thanks for joining in! I've been wanting to ask you: Can you explain more on why you block .win and .cricket in your list? Thank you ahead of time for your insight.
Honestly, that list is very poorly thought out. I just copied a ton of random TLDs from somewhere else (I don't remember where), and then over time removed some that had known FPs reported to other lists. I have put pretty much 0 work into that list, so I wouldn't give it much credibility.
@hagezi I know syntax will change based on the format of your list (e.g., domain, hosts, etc.), but for the Adblock list, what is the reasoning to change from ||tld^$doc,domain= to |*.tld^$denyallow=? The latter can't be recognized by uBlock Origin.
I think that might be because $denyallow requires $domain (https://github.com/DandelionSprout/adfilt/blob/master/Wiki/SyntaxMeaningsThatAreActuallyHumanReadable.md#blocking-1)
@yokoffing The reason was that the AdGuard host compiler ignores short rules. The recommended workaround from the AdGuard team was to switch to |*.TLD^. I just wanted to make the list compatible with the host compiler, but I can also go back to the previous syntax ||TLD^. I have no problem with this.
Honestly, that list is very poorly thought out
@iam-py-test Thank you for your honesty! 🤣
@iam-py-test denyallow should also work with the normal syntax ||TLD^: https://github.com/AdguardTeam/AdGuardHome/wiki/Hosts-Blocklists#denyallow
@iam-py-test denyallow should also work with the normal syntax ||TLD^: https://github.com/AdguardTeam/AdGuardHome/wiki/Hosts-Blocklists#denyallow
Not in uBlock Origin: https://github.com/gorhill/uBlock/wiki/Static-filter-syntax#denyallow
Odd that uBO and AGH have different rules, although I guess it makes sense for a DNS blocker.
$denyallow in uBO requires being coupled with $domain, for some reason.
Missing Abused TLDs from yokoffing nextdns
.agency .ci .fun .link .live .shop .win
Thank you, I have some other personal abused TLDs, I will send them for review once I am home