MetaMask / eth-phishing-detect

Utility for detecting phishing domains targeting Web3 users

✨🧠 Ideas on how to Improve Detection of Malicious Sites #8245

Closed · tayvano closed this issue 1 year ago

tayvano commented 2 years ago

Mark Stacey 8 hours ago

I really think we need to improve how our fuzzylist works. Levenshtein comparison just isn't enough. Seems like we could iterate on these ideas and other similar ones to block more malicious sites than we do today and produce fewer false positives.

  1. We could have recognized that the website name was very similar to a popular site (1inch).
  2. Warn for extremely close Levenshtein matches (i.e. a distance of 1); see the sketch after this list.
  3. Explain the situation better on the warning screen, including our exact reasons for flagging the domain. We can have a harsher warning for a known malicious site, and for fuzzylist matches we can say "this looks similar to ___" and "it's similar for these reasons:" instead of just vaguely casting doubt.
  4. Maybe we should add 1inch to our fuzzylist. We've discussed this in the past and decided against it because of the false-positive problem.
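
A minimal sketch of the distance-of-1 idea in item 2, assuming a plain Levenshtein implementation and an illustrative list of protected names (this is not the extension's actual fuzzylist format or API):

```ts
// Sketch: flag hostnames whose first label is within Levenshtein distance 1 of a protected name.
// `protectedNames` is an illustrative stand-in for the real fuzzylist.
const protectedNames = ["metamask", "opensea", "1inch"];

function levenshtein(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0))
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                   // deletion
        dp[i][j - 1] + 1,                                   // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

// Returns the protected name the hostname nearly matches, or null.
function nearMiss(hostname: string): string | null {
  const label = hostname.toLowerCase().split(".")[0]; // crude: compare the first label only
  for (const name of protectedNames) {
    if (levenshtein(label, name) === 1) return name; // distance 0 is the genuine site, not a near miss
  }
  return null;
}

console.log(nearMiss("metamsk.io"));  // "metamask" -> "this looks similar to metamask"
console.log(nearMiss("metamask.io")); // null
```

An exact match (distance 0) is the legitimate site itself, so only distance 1 would trigger the "this looks similar to ___" warning from item 3.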

Tay: One interesting thing that Nadav brought up was trying to flag malicious websites based on the content or requests of the website, rather than just the URL. This was specifically with regard to the NFT minting sites that then request approve() on tokens or setApprovalForAll() on NFTs. I was thinking it may prove valuable for the pesky cloud mining scam as well.

The idea is: given a list of URLs, is it possible to detect the ones that follow the phishing patterns of the moment and preemptively blocklist them?

Problem statement: OpenSea is beginning to investigate methods of programmatically detecting whether an arbitrary URL is likely to be an approval phishing scam. The rough heuristic we’re pursuing is “programmatic baiting”: spinning up a headless browser w/ a MetaMask wallet w/ valuable NFTs in tow, then visiting the site, triggering buttons matching certain keywords (e.g. “Mint”), and inspecting whether the triggered transactions are setApprovalForAll calls to unknown contracts or other suspicious requests according to a given ruleset. The intended output would be a system by which we can ingest a corpus of arbitrary URLs in realtime, evaluate them programmatically for tell-tale signs of scams, and push them to Phishfort for flagging if we are confident in their fraudulence.
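
A rough sketch of the ruleset core such a baiting harness would apply to each transaction request it captures. The 4-byte selectors are the standard ERC-20/ERC-721 ones; the shape of the captured request and the operator allowlist are assumptions for illustration, not an existing API:

```ts
// Sketch: classify a captured transaction request as a suspicious approval.
// `CapturedRequest` and `trustedOperators` are illustrative assumptions.
interface CapturedRequest {
  to: string;   // contract the dapp asked the wallet to call
  data: string; // hex-encoded calldata
}

// Standard 4-byte selectors (keccak256 of the canonical signatures).
const SET_APPROVAL_FOR_ALL = "0xa22cb465"; // setApprovalForAll(address,bool)
const APPROVE = "0x095ea7b3";              // approve(address,uint256)

// Operators/spenders considered legitimate (e.g. known marketplace contracts); placeholder value.
const trustedOperators = new Set<string>([
  "0x0000000000000000000000000000000000000001", // placeholder for an allowlisted operator
]);

function isSuspiciousApproval(req: CapturedRequest): boolean {
  const selector = req.data.slice(0, 10).toLowerCase();
  if (selector !== SET_APPROVAL_FOR_ALL && selector !== APPROVE) return false;

  // The first ABI argument (operator/spender) is the 32-byte word right after the selector;
  // the address is its last 20 bytes.
  const operator = "0x" + req.data.slice(10 + 24, 10 + 64).toLowerCase();
  return !trustedOperators.has(operator);
}
```

A “Mint” button that actually produces a setApprovalForAll to an unknown operator would then be enough to push the URL toward the blocklist queue.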

I was thinking, though, that instead of doing this loop where we grab URLs, test them, and then add them to the blocklist, it would be far more interesting to run a set of checks when a website first attempts to connect to MetaMask and throw flags on the fly if certain conditions are met. Obviously we would need to do a lot of testing to determine what these checks would be and ensure we had reasonable confidence in the flag rate, but I figured I would at least start the conversation to see if something valuable could come from this sooner rather than later.
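
Sketching how that on-the-fly evaluation might hang together, purely as an assumption about structure (none of these checks or names exist in MetaMask today):

```ts
// Sketch: run cheap heuristics when a site first requests a connection and
// aggregate the reasons into one verdict. All checks and the policy are placeholders.
type Check = (hostname: string) => string | null; // human-readable reason, or null if the check passes

const connectChecks: Check[] = [
  (h) => (h.startsWith("xn--") || h.includes(".xn--") ? "punycode label (possible lookalike characters)" : null),
  (h) => (/(^|[.-])(airdrop|claim|mint|reward)([.-]|$)/.test(h) ? "bait keyword in hostname" : null),
  (_) => null, // placeholder: domain registered in the last N days (needs a WHOIS/RDAP lookup)
];

function evaluateOnConnect(hostname: string): { warn: boolean; reasons: string[] } {
  const reasons = connectChecks
    .map((check) => check(hostname))
    .filter((r): r is string => r !== null);
  // Illustrative policy: any single reason triggers the warning screen, and the screen lists
  // the reasons, matching the "explain exactly why we flagged it" idea above.
  return { warn: reasons.length > 0, reasons };
}
```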

Harry: We can do IOC/fingerprint detection for sure. Something along the lines of this https://twitter.com/sniko_/status/1151588248407478272
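
To make the IOC/fingerprint idea a bit more concrete: phishing kits tend to reuse the same page skeleton, so one simple approach is to hash a few stable markers from the served HTML and compare against fingerprints collected from confirmed scam pages. The marker set and the known-fingerprint list here are illustrative assumptions:

```ts
import { createHash } from "node:crypto";

// Sketch: fingerprint a page by hashing stable markers (title + external script URLs).
function fingerprint(html: string): string {
  const title = (html.match(/<title>([\s\S]*?)<\/title>/i)?.[1] ?? "").trim();
  const scripts = [...html.matchAll(/<script[^>]*src="([^"]+)"/gi)].map((m) => m[1]).sort();
  return createHash("sha256").update(JSON.stringify({ title, scripts })).digest("hex");
}

// Would be populated from pages already confirmed as scams (the IOC corpus).
const knownKitFingerprints = new Set<string>();

function matchesKnownKit(html: string): boolean {
  return knownKitFingerprints.has(fingerprint(html));
}
```

Exact hashes are brittle against small template edits; a fuzzy hash like ssdeep (mentioned below) is more forgiving.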

--

Better Fuzzy Shit

  1. Highlighting confusable characters seems like an effective way to get users to more closely scrutinize domains.
  2. We can caution users when we see confusable characters, and have stronger warnings when non-visible characters and other such sneaky things are detected.
  3. Have a stronger warning for alternate-TLD domains that are on our fuzzylist.
  4. Warn against domains that include a fuzzylist domain, or some confusable version of it. E.g. for the metamask.io fuzzylist entry, we'd warn on [prefix]-metamask or metamask-[suffix], or even [prefix]-rnetamask (note the R-N); see the sketch below.
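
A sketch of the affix-plus-confusables check in item 4; the homoglyph map is deliberately tiny and the protected-name list is an assumption, not the real fuzzylist:

```ts
// Sketch: normalize common confusables, then look for a protected name embedded in the hostname.
// A real map would be far larger (Unicode confusables, punycode decoding, etc.).
const homoglyphs: Array<[RegExp, string]> = [
  [/rn/g, "m"],   // rnetamask -> metamask
  [/vv/g, "w"],
  [/0/g, "o"],
  [/1/g, "l"],
];

const protectedNames = ["metamask", "opensea"];

function normalize(hostname: string): string {
  let out = hostname.toLowerCase();
  for (const [pattern, replacement] of homoglyphs) out = out.replace(pattern, replacement);
  return out;
}

// Flags e.g. "wallet-metamask.io", "metamask-airdrop.xyz", "claim-rnetamask.com";
// leaves the genuine "metamask.io" alone (via a crude allowlist of the real domain).
function embeddedProtectedName(hostname: string): string | null {
  const normalized = normalize(hostname);
  for (const name of protectedNames) {
    if (normalized === `${name}.io`) continue;
    if (normalized.includes(name)) return name;
  }
  return null;
}

console.log(embeddedProtectedName("claim-rnetamask.com")); // "metamask"
console.log(embeddedProtectedName("metamask.io"));         // null
```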

https://github.com/elceef/dnstwist has gotten significantly better since I last used it in 2017. Look, it's online now too, daaaamn: https://dnstwist.it/

dnstwist also uses ssdeep (https://ssdeep-project.github.io/ssdeep/index.html) for fuzzy-hash matching of site content.

Their URL fuzzing is here: https://github.com/elceef/dnstwist/blob/master/dnstwist.py#L350
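
For a feel of what that fuzzer produces, here is a heavily simplified permutation generator in the same spirit; it is not dnstwist's actual implementation, just a sketch of a few of the mutation classes it covers:

```ts
// Sketch: generate a few classes of typosquat candidates for a domain label,
// loosely in the spirit of dnstwist's fuzzer (simplified).
function permutations(label: string): Set<string> {
  const out = new Set<string>();
  const chars = label.split("");

  // Omission: drop one character ("metamask" -> "etamask", "mtamask", ...)
  for (let i = 0; i < chars.length; i++) {
    out.add(chars.slice(0, i).concat(chars.slice(i + 1)).join(""));
  }

  // Transposition: swap adjacent characters ("metamask" -> "emtamask", ...)
  for (let i = 0; i < chars.length - 1; i++) {
    const swapped = [...chars];
    [swapped[i], swapped[i + 1]] = [swapped[i + 1], swapped[i]];
    out.add(swapped.join(""));
  }

  // Repetition: double one character ("metamask" -> "mmetamask", ...)
  for (let i = 0; i < chars.length; i++) {
    out.add(chars.slice(0, i + 1).concat(chars[i], chars.slice(i + 1)).join(""));
  }

  out.delete(label); // never include the original label itself
  return out;
}

console.log([...permutations("metamask")].slice(0, 5));
```

Resolving the candidates over DNS, as dnstwist does, would show which typosquats are actually registered and worth watching.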

aamirmursleen commented 2 years ago

It is a great idea. My domain dignity.pk, which operates an eCommerce store in Pakistan, has been blacklisted since June 2022.

audius-automation commented 2 years ago

Agree @tayvano.

autius.co has been automatically blocked by the fuzzy matching even though it doesn't use MetaMask or web3 at all; it's just a simple tool for automating your activity using Selenium.

Something similar to a honeypot or heuristic analysis would be much smarter and yield fewer false positives imo.

Worst case, show a prompt that the user can decline, like Chrome's invalid HTTPS certificate warning. Right now the user can't access the website at all without disabling the check in MetaMask's settings, which makes them prone to more problems because users tend to leave it disabled afterwards and forget about it.