uBlockOrigin / uBlock-issues

This is the community-maintained issue tracker for uBlock Origin
https://github.com/gorhill/uBlock

Remove Malware Domain List #984

Closed Yuki2718 closed 4 years ago

Yuki2718 commented 4 years ago

Description

gorhill promised the removal on a condition: https://www.wilderssecurity.com/threads/ublock-a-lean-and-fast-blocker.365273/page-163#post-2852075

Then, just before the one year passed, MDL was updated on Jan. 22, 2020. However, that update merely removed one domain from the list (I compared against a Wayback Machine snapshot, https://web.archive.org/web/20191006191919/https://www.malwaredomainlist.com/hostslist/hosts.txt): https://www.diffchecker.com/1tI1KbuZ and the list has now gone more than 2 months without an update.
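For anyone who wants to repeat this kind of comparison locally, here is a minimal sketch. It assumes both snapshots use the usual hosts-file layout (`127.0.0.1  example.com`, with `#` comments). The function names `parse_hosts` and `diff_hosts` are made up for illustration and are not part of any uBO tooling.

```python
from typing import List, Set, Tuple


def parse_hosts(text: str) -> Set[str]:
    """Collect hostnames from hosts-file text, skipping comments and blank lines."""
    hosts = set()
    for line in text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # "127.0.0.1 example.com" -> second field; a bare "example.com" -> first field.
        host = fields[1] if len(fields) >= 2 else fields[0]
        hosts.add(host.lower())
    return hosts


def diff_hosts(old_text: str, new_text: str) -> Tuple[List[str], List[str]]:
    """Return (added, removed) hostnames between two snapshots of a list."""
    old, new = parse_hosts(old_text), parse_hosts(new_text)
    return sorted(new - old), sorted(old - new)
```

Feeding it the archived snapshot and the current file would show whether an "update" added anything or only removed entries.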

uBlock-user commented 4 years ago

Are you suggesting it should be removed now?

Yuki2718 commented 4 years ago

Are you suggesting it should be removed now?

Yes!

gorhill commented 4 years ago

The list changed last January: https://github.com/uBlockOrigin/uAssets/commit/5ad7faaf9e0337e114ea151f5ef6bb5d6b8ccd8f#diff-53fe5504d16f8ca7c58a72186493bdef

Yuki2718 commented 4 years ago

The list changed last January: uBlockOrigin/uAssets@5ad7faa#diff-53fe5504d16f8ca7c58a72186493bdef

Yeah, I noted that. The question is whether removing one domain should count as active maintenance, particularly when there has been no update for 2 months since.

gwarser commented 4 years ago

A few people have suggested adding https://gitlab.com/curben/urlhaus-filter to the "Malware domains" section. For example: https://www.reddit.com/r/uBlockOrigin/comments/e0z3zr/expanding_a_malware_domain_list/

jspenguin2017 commented 4 years ago

It's been well over a year since anything was added to the filter, although there are removals from time to time: https://github.com/NanoMeow/MDLMirror/blame/master/filter.txt

In fact, I don't believe anything has been added to the filter since June 2018, when NanoMeow started mirroring it.

jspenguin2017 commented 4 years ago

@DandelionSprout suggested these filters on Slack:

https://www.dshield.org/feeds/suspiciousdomains_Medium.txt
https://raw.githubusercontent.com/stamparm/maltrail/master/trails/static/suspicious/pua.txt
https://gitlab.com/curben/urlhaus-filter/raw/master/urlhaus-filter-online.txt

Slack direct link (for those who have access): https://listauthorschat.slack.com/archives/C010N14G4TF/p1586368680366200

Yuki2718 commented 4 years ago

There are many lists for malware or phishing, say, https://hexxiumcreations.github.io/threat-list/hexxiumthreatlist.txt https://osint.digitalside.it/Threat-Intel/lists/latestdomains.txt https://phishing.army/download/phishing_army_blocklist.txt but my personal opinion is that they are not keeping up with the latest threat landscape, where most threats live for only a few days. E.g. https://www.akamai.com/us/en/multimedia/documents/state-of-the-internet/soti-security-phishing-baiting-the-hook-report-2019.pdf

In fact, over a 60-day period, Akamai observed more than 2,064,053,300 unique domains commonly associated with malicious activity. Of those, 89% had a lifespan of less than 24 hours, and 94% had a lifespan of less than three days.

Given uBO's update frequency they are at best better-than-nothing.

DandelionSprout commented 4 years ago

I recall that the "12 months without an update" criterion mentioned in the Wilders Security thread was established back when Schacks Adblock Plus Liste was removed last year. At that point the idea was to weed out lists whose maintainers had disappeared from the face of the planet (which happens surprisingly often), and I didn't anticipate listmakers gaming it and resting on their laurels.

Considering there are a whopping 468 anti-malware lists (562 with anti-phishing lists included) on Filterlists.com that are supported by uBO to some degree, surely at least a couple of them must be up-to-date and thorough enough? Especially so since I've been advocating behind closed doors on Slack for the removal of DNS-BH Malware Domains for having too many false positives (Nordic construction company sites in particular).

DandelionSprout commented 4 years ago

If we take https://www.malwaredomainlist.com/update.php at face value, then no new domains have been added to Malware Domain List since December 2017, which is normally a pretty bad sign.

gorhill commented 4 years ago

Given uBO's update frequency they are at best better-than-nothing.

Should there be any malware lists selected by default? It has been argued in the past that browsers probably have better, more up-to-date malware lists for their "block dangerous and deceptive content" feature.

DandelionSprout commented 4 years ago

I personally think it's pretty important for uBO to show itself off as "We can block malware too, not just ads or trackers". So yes, there should be at least 1 anti-malware/-phishing list enabled by default, if you ask me.

gorhill commented 4 years ago

So at this point it comes down to picking a list with the following qualities:

Yuki2718 commented 4 years ago

Should there be any malware lists selected by default? It has been argued in the past that browsers probably have better, more up-to-date malware lists for their "block dangerous and deceptive content" feature.

I say no. It has been years since I grew tired of malware testing, but from my experience and also some others' reports, e.g. https://malwaretips.com/threads/updated-29-12-2018-browser-extension-comparison-malwares-and-phishings.80915/, Google SafeBrowsing (GSB) surely tends to be more effective than any downloadable list. Also note that about 90% of the current Malware Domains list is sourced from GSB. Yet two things to keep in mind are that some people disable GSB over privacy concerns, and that even GSB has been beaten by bad actors - @krystian3w and I have occasionally seen such cases on the AdGuardFilters repo. More and more reports claim that the number of bad domains is now tremendous and that they are becoming more short-lived. I don't say malware lists are useless, as there are still some relatively long-lived domains that slip through GSB - we just need to discard the illusion that they'll adequately protect users.

Many of these lists are maintained by a small number of people and often have regional/language biases. So I think they shouldn't be enabled by default, as they rarely or never come into play, more so if you keep GSB enabled or you happen to live in a certain region, but keeping some lists under the malware section as options would be nice, particularly for those who have turned GSB off.

Yuki2718 commented 4 years ago

I personally think it's pretty important for uBO to show itself off as "We can block malware too, not just ads or trackers". So yes, there should be at least 1 anti-malware/-phishing list enabled by default, if you ask me.

I have a different opinion. Surely uBO can block many attacks IF default-deny mode, which only geeks use, is chosen, and even without that it can still block malvertising, which is diminishing (https://www.bleepingcomputer.com/news/security/almost-60-percent-of-malicious-ads-come-from-three-ad-providers/), but I think uBO should not be advertised as a security tool - that leads to e.g. people comparing uBO with security addons and talking about good and bad, noobs having a false sense of security that they are safe because they use uBO with malware filters, and possibly criticism from a security researcher with real-world data on malware and phishing like what I posted earlier. I believe 99% of users use uBO for ads or other annoyances, so this won't affect its popularity.

gorhill commented 4 years ago

I think uBO should not be advertised as a security tool - that leads to e.g. people comparing uBO with security addons and talking about good and bad, noobs having a false sense of security that they are safe because they use uBO with malware filters

This is a good argument, and given this I am leaning toward not having any malware list selected by default. I would still like to provide good stock lists so picking a good list to replace MDL is still something I would like to do.

Yuki2718 commented 4 years ago

Quickly tested the lists mentioned. My suggestion is to add only the URLhaus filter (once a problem is solved) or at most URLhaus + Phishing Army, unless someone knows better lists.

Details: every list included some dead domains (yes, despite the "online" version) and apparent false positives. But /trails/static/suspicious/pua.txt includes too many dead domains, and given that PUA is not as serious as malware, I don't recommend it - we already have Badware risks, so it's better to expand that if needed. Of note, some domains in the PUA list are included in AdGuard Base. I think DShield's list shouldn't be used because the author doesn't recommend it: https://www.dshield.org/xml.html

Our data does include false positives, and we will not remove them. It would make it harder to observe long term trends. If a report is a false positive or not depends to a large extend on the question being asked.

We offer one blocklist, and one blocklist only (http://www.dshield.org/block.txt). Unlike for our other lists, we will remove IPs from this blocklist if asked to.

They're talking about IP lists here, but I guess the comment applies to domain lists too - anyway, the list is for suspicious domains and not for confirmed bad ones. Hexxium's list was last updated on Feb. 26, so there has been no update for more than a month, which is not a good sign for a malware list. Also, their list includes some domains with 0 detections on VirusTotal; not sure if they're FPs or FNs by VT.

The URLhaus online filter doesn't overlap with the current Malware domains, though it won't be needed by those who keep GSB (data are shared and most entries are actually blocked by it). An advantage of this list is page-level blocking in contrast to domain-level, which is important as bad guys have been abusing trusted domains (https://www.helpnetsecurity.com/2019/03/01/malicious-urls-good-domains/). The problem is these rules are written in forms like cdn.truelife.vn/webtube/201310/2139273/pianito.exe which don't work with uBO. If we are to adopt the list, we have to ask @MDLeom to convert all the rules to ABP syntax. Phishing Army includes some obvious FPs, and again most entries are covered by GSB. One of its sources is OpenPhish https://openphish.com/feed.txt, which may be less FP-prone and also allows page-level blocking, but I think their format is not performance-friendly - no "|" prefix and very long URLs mean they have to be checked against every request, right?

gorhill commented 4 years ago

cdn.truelife.vn/webtube/201310/2139273/pianito.exe which don't work with uBO

That does work with uBO; the absence of the || prefix just means the pattern can match at any point in the URL, i.e. https://example.org/?u=cdn.truelife.vn/webtube/201310/2139273/pianito.exe would also match.

Yuki2718 commented 4 years ago

cdn.truelife.vn/webtube/201310/2139273/pianito.exe which don't work with uBO

That does work with uBO; the absence of the || prefix just means the pattern can match at any point in the URL, i.e. https://example.org/?u=cdn.truelife.vn/webtube/201310/2139273/pianito.exe would also match.

In this case, the problem is about strict blocking - you can still download the exe if the rule is cdn.truelife.vn/webtube/201310/2139273/pianito.exe or ||cdn.truelife.vn/webtube/201310/2139273/pianito.exe, so it must be ||cdn.truelife.vn/webtube/201310/2139273/pianito.exe$document or ||cdn.truelife.vn/webtube/201310/2139273/pianito.exe$all
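The conversion described here could be scripted on the list maintainer's side. The following is only a sketch under stated assumptions: each input line is a bare `host/path` entry (optionally with an `http://` or `https://` scheme) with no filter options of its own, and `to_strict_filter` is a hypothetical helper, not something from URLhaus or uBO.

```python
from typing import Optional


def to_strict_filter(entry: str, option: str = "all") -> Optional[str]:
    """Turn a bare 'host/path' entry into '||host/path$<option>'.

    Returns None for blank lines and for lines that look like comments.
    Assumes the entry carries no filter options of its own.
    """
    entry = entry.strip()
    if not entry or entry.startswith(("!", "#")):
        return None
    # Drop a leading scheme, since the || anchor already covers any scheme.
    for scheme in ("https://", "http://"):
        if entry.lower().startswith(scheme):
            entry = entry[len(scheme):]
            break
    # Avoid doubling the anchor if the entry is already hostname-anchored.
    if entry.startswith("||"):
        entry = entry[2:]
    return "||{}${}".format(entry, option)


if __name__ == "__main__":
    print(to_strict_filter("cdn.truelife.vn/webtube/201310/2139273/pianito.exe"))
```

Passing `option="document"` yields the `$document` variant instead of `$all`.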

kulfoon commented 4 years ago

Regarding requirement no. 1 from https://github.com/uBlockOrigin/uBlock-issues/issues/984#issuecomment-612605380 : "between 20,000 and 50,000 entries": so far, none of the lists proposed in the current thread meets this criterion:

~10000 https://phishing.army/download/phishing_army_blocklist.txt
 ~2700 https://gitlab.com/curben/urlhaus-filter/raw/master/urlhaus-filter-online.txt
 ~2700 https://hexxiumcreations.github.io/threat-list/hexxiumthreatlist.txt
 ~2600 http://www.dshield.org/feeds/suspiciousdomains_Medium.txt
 ~1500 https://raw.githubusercontent.com/stamparm/maltrail/master/trails/static/suspicious/pua.txt
  ~100 https://osint.digitalside.it/Threat-Intel/lists/latestdomains.txt

Nor does the one I've just found (just a quick search, not a deep investigation or analysis yet): https://github.com/mitchellkrogza/The-Big-List-of-Hacked-Malware-Web-Sites, but it might be worth checking because it's bigger than any of the lists proposed in the current thread so far:

~14000 https://raw.githubusercontent.com/mitchellkrogza/The-Big-List-of-Hacked-Malware-Web-Sites/master/hosts
 ~9000 https://raw.githubusercontent.com/mitchellkrogza/The-Big-List-of-Hacked-Malware-Web-Sites/master/hacked-domains.list

I know that big (even giant) malware hosts lists exist as well, but most of them don't qualify because they are a kind of "A big merged/ultimate collection of hosts from reputable sources." They mix in non-malware-related websites such as porn, social, and gambling sites and who knows what else, and they contain many false positives. Examples: https://github.com/StevenBlack/hosts https://github.com/mitchellkrogza/Ultimate.Hosts.Blacklist https://github.com/AdroitAdorKhan/EnergizedProtection

But as I said, I haven't done a deep search so far; perhaps lists with 20,000 to 50,000 malware-related entries do exist somewhere (like the one currently in uBO: "Malware domains", with ~27000 domains).
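The approximate entry counts above could be reproduced with a few lines of script. This is a rough sketch, assuming that every non-blank line not starting with `#`, `!` or `[` is one entry. That overcounts slightly for hosts files containing `localhost`-style lines. The names `count_entries` and `meets_size_requirement` are illustrative only.

```python
def count_entries(text: str) -> int:
    """Count lines that look like list entries rather than comments or headers."""
    count = 0
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, hosts-style (#) and ABP-style (!) comments, and [Adblock ...] headers.
        if not line or line[0] in "#![":
            continue
        count += 1
    return count


def meets_size_requirement(n: int, low: int = 20_000, high: int = 50_000) -> bool:
    """Check an entry count against the 20,000 to 50,000 range quoted in this thread."""
    return low <= n <= high
```

With the figures quoted above, `meets_size_requirement(27000)` is true for the current "Malware domains" list, while `meets_size_requirement(10000)` is false for Phishing Army.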

gorhill commented 4 years ago

"between 20,000 and 50,000 entries": so far, none of the lists proposed in the current thread meets this criterion

Looking again, I realize MDL is around 1,100 entries -- I had in mind we were talking about the other malware list, which is ~26K entries. So mainly what I am saying is to replace MDL with a list of roughly similar size.

KonoromiHimaries commented 4 years ago

@gorhill I suggest replacing it with a combined list that simply includes MDL. For example, create a new repo and add these lists to a script: https://github.com/KonoromiHimaries/PolishSubFilters/tree/master/scripts

Yuki2718 commented 4 years ago

@KonoromiHimaries Why include MDL? It should rather be excluded. We need to find at least one high-quality list. As noted, all the lists (except osint, which was too short, so I didn't test it) mentioned here included some false positives and thus combining them will increase the rate of FPs. And 99% of the entries in these lists will never be hit by any given user, even without Google Safe Browsing, which covers most of those lists.

KonoromiHimaries commented 4 years ago

here included some false positives and thus combining them will increase the rate of FPs

Only stable lists; any unstable filters can be added manually, like in https://github.com/PolishFiltersTeam/KAD/issues/1297

jawz101 commented 4 years ago

malware domain lists are useless, IMO.