Open manosamy opened 7 years ago
As discussed in the Implementer's guide (http://docs.ens.domains/en/latest/implementers.html#normalising-and-validating-names), it's necessary for resolvers to normalise names properly according to the rules; if they do so, names like this will never be resolved to, and so are harmless. This is a bug with enslisting failing to follow the rules correctly.
I guess I will have to alert MEW, Etherscan, MetaMask and uPort as well then; I don't think anyone is checking for this. I will alert each of those in their own forums, thanks.
MetaMask uses a module we made, namehash, specifically for following the recommended guide.
@manosamy Apologies if I was abrupt. I'm reopening to triage support on individual clients.
@danfinlay, @Arachnid, is there a Solidity library you are aware of? I see more and more smart contracts like namebazaar accepting name strings, so it would be nice to have a Solidity library for UTS46 verification. This poses an attack vector: someone can list a seemingly correct name for sale in namebazaar without going through a client at all, just by talking directly to the name bazaar offering smart contract, and sell an imposter name (buyer's remorse at the end when they try to use it, but by then it would be too late).
A spoofing attack is also possible with homoglyphs. For example, ⅿіϲrοѕοft.eth
and it is still available! I used https://atom.io/packages/mayhem to manipulate the string.
FYI, MyEtherWallet uses the correct normalization. I would propose that a link to this thread be added to the ENS docs under normalization. For people like me (less technical), seeing the real-world examples of the things libraries prevent is very helpful when putting together our internal specs and testing, and obviously another warning or more information never hurt a dev (I don't think... 😉)
Appreciate you looking out for the community and opening this issue. Thank you.
How are people running normalization in browser apps? The fact that idna-uts46 doesn't transpile to ES5 is causing headaches here; see https://github.com/danfinlay/eth-ens-namehash/issues/5
Another interesting variation spotted in enslisting: microsoft.eth
The 'o' is not really a normal o; it is "Cyrillic Small Letter O", a valid character under UTS46 rules. This listing was created by interacting directly with the enslisting smart contract, and both MEW and Etherscan show this as a valid name (correctly so, since this is a valid character under UTS46 rules). Yet this is a perfect phishing target, completely undetectable.
Unicode: U+043E (UTF-8 bytes 0xD0 0xBE), https://vazor.com/unicode/c043E.html. Name: CYRILLIC SMALL LETTER O. See also http://www.unicode.org/Public/idna/10.0.0/IdnaMappingTable.txt (look up 043E; CYRILLIC SMALL LETTER NARROW O, U+1C82, is listed there as mapping to it).
@manosamy Right. This is also called out in the relevant implementers' guide page, and should be handled by clients checking the alphabets used in a given name and alerting or blocking outright if they don't fall into a whitelist. Unfortunately I'm not aware of any libraries for this - I'll add that as high priority todo for the ENS org once it exists and we have resources.
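For anyone wanting to experiment with the "check the alphabets used in a name" idea, here is a minimal Python sketch. It is only a heuristic: the standard library has no Unicode script property, so it takes the first word of each character's Unicode name as a stand-in for the script. The function names `scripts_used` and `is_mixed_script` are made up for illustration and do not come from any ENS library.

```python
import unicodedata

def scripts_used(label: str) -> set:
    """Approximate the scripts in a label from the first word of each
    character's Unicode name (a rough proxy, not the real Script property)."""
    scripts = set()
    for ch in label:
        # ASCII digits, hyphens and dots are shared by every script; skip them.
        if ch.isascii() and (ch.isdigit() or ch in "-."):
            continue
        name = unicodedata.name(ch, "UNKNOWN")
        scripts.add(name.split()[0])  # e.g. "LATIN", "CYRILLIC", "GREEK"
    return scripts

def is_mixed_script(label: str) -> bool:
    """True when the label appears to draw on more than one script."""
    return len(scripts_used(label)) > 1

# "micros" + CYRILLIC SMALL LETTER O (U+043E) + "ft"
print(is_mixed_script("microsoft"))        # False
print(is_mixed_script("micros\u043eft"))   # True
```

Note that a name written entirely in look-alike characters from one non-Latin script would pass this check, so it would complement a confusables check rather than replace it.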
We had a great talk on this topic at the ENS Summit by @mcdee, and he has an API for doing analysis on this kind of character attack. We definitely do need more libraries and solutions for mitigating that kind of attack.
https://en.wikipedia.org/wiki/IDN_homograph_attack#Defending_against_the_attack
https://www.xudongz.com/blog/2017/idn-phishing/
the confusables list: http://www.unicode.org/Public/security/latest/confusables.txt
Here is how Google does it: https://www.chromium.org/developers/design-documents/idn-in-google-chrome
(Since I know you didn't click that link, here is a screenshot:)
If the end of a hostname is identical to one of top 10k domains after removing diacritic marks and mapping each character to its spoofing skeleton (e.g. www.googlé.com with 'é' in place of 'e'), punycode is shown.
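A rough Python sketch of the "remove diacritics, map to a skeleton, compare against well-known names" idea quoted above. This is not Chrome's implementation: the `CONFUSABLES` table is a tiny hand-picked subset (a real one would be built from confusables.txt), `PROTECTED` is a hypothetical stand-in for a top-names list, and the comparison is on the whole label rather than the end of a hostname.

```python
import unicodedata

# Tiny illustrative subset; a real table would be generated from confusables.txt.
CONFUSABLES = {
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
}

# Hypothetical list of names worth protecting.
PROTECTED = {"google", "microsoft", "amazon"}

def skeleton(label: str) -> str:
    """Strip combining marks, then map known confusable characters."""
    decomposed = unicodedata.normalize("NFD", label)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return "".join(CONFUSABLES.get(c, c) for c in stripped)

def looks_like_protected(label: str) -> bool:
    """True when the label is not a protected name itself but collapses to one."""
    s = skeleton(label)
    return s != label and s in PROTECTED

print(looks_like_protected("googl\u00e9"))     # True: e-acute collapses to "e"
print(looks_like_protected("micros\u043eft"))  # True: Cyrillic o maps to "o"
print(looks_like_protected("google"))          # False: the genuine name
```

A client could use a check like this to decide when to show a warning instead of the name as-is.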
Alright guys. We just need to get MEW to top 10k domains and phishers will be fucked. ;)
Someone recommended we add blockies or something similar for ENS names. Not sure if it would work, but it would be interesting to see. Could be simple. Not that most people who get caught by phishing would notice the address icon.
Another idea I had was to color code and warn for the different Unicode blocks (http://jrgraphix.net/research/unicode.php), similar to how LastPass does it.
Or decode all to punycode and display it always.
Bear in mind that unlike DNS, ENS doesn't use Punycode anywhere. I like the idea of coloring better - though it should also just put up a big red warning if alphabets are mixed.
I think my favorites are...
Red warnings for
Orange or Blue "Warnings" for
Color code it regardless
Adjust as new attacks happen
I think it would be perfectly acceptable for most clients to simply disallow mixed-alphabet names, or names with confusing or invisible characters. Just because ENS had to allow it by necessity (namehash) doesn't mean clients have to allow malicious behavior.
It's just a matter of getting that api or library of characters to their plausible alphabets, and I recall Jim suggesting this was a difficult task to make work client-side, so it might require some API involvement. Maybe we could even build merkle-proving into the validation, to prevent a library provider from lying. Keep a hash of a recent valid library on chain, for example.
The main issue with sending this client-side is the additional size of the code, as it's built using ICU. There might be something broadly equivalent in javascript that would allow for a realistic client deployment but I haven't looked at it.
By the way if anyone wants to poke at it there is a test API up at http://www.mcdee.net:8080/name/ - just add the name of the domain you want to check (e.g. http://www.mcdee.net:8080/name/blаckhаt ) to see it in action.
@tayvano, how do you feel about double-entry verification for ENS names when someone gets to the transaction/contracts page via a hyperlink (like how some websites make you enter a bank account number twice, with the second box not allowing paste)? The user is forced to type in the name, and if it doesn't match, we stop. Any other params in the URL would still pass through, so it's not as bad; we just want to validate that they hit the right contract. If they are entering the name manually, maybe don't make them enter it twice unless the value transferred is beyond a threshold (to prevent fat-fingering a look-alike name). This serves the same purpose as my mortgage company making me enter my bank account number twice before I schedule my payments.
@mcdee, can you also send a character-by-character report like this one (https://vazor.com/unicode/c043E.html)? That would make it immensely useful.
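To make the request concrete, a character-by-character report could look something like the Python sketch below. It uses only the standard `unicodedata` module and is not the output format of @mcdee's API; the function name `char_report` and the field names are invented for illustration.

```python
import unicodedata

def char_report(name: str) -> list:
    """Return one dict per character: code point, UTF-8 bytes, name, category."""
    rows = []
    for ch in name:
        rows.append({
            "char": ch,
            "codepoint": f"U+{ord(ch):04X}",
            "utf8": ch.encode("utf-8").hex(),
            "name": unicodedata.name(ch, "<unnamed>"),
            "category": unicodedata.category(ch),
        })
    return rows

# "micros" + CYRILLIC SMALL LETTER O + "ft"
for row in char_report("micros\u043eft"):
    print(row["codepoint"], row["utf8"], row["category"], row["name"])
```

With a report like this, a client could highlight exactly which character in a name is the "foreign" one.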
I forked @danfinlay's eth-ens-namehash for now and made it use idna-uts46-hx. I also changed one more line to downgrade from ES6 to ES5, and published it to npm as eth-ens-namehash-ms so that I can uglify without errors. I also implemented "foreign" name highlighting based on @madvas's feedback, and double entry for suspicious names.
What's the advantage of the idna-uts46-hx library? I'd accept a PR if it were submitted with a good reason. I'd easily have merged the ES5 one.
idna-uts46 uses ES6 syntax (in at least one place, if not more), and so does eth-ens-namehash (in one line). I use Angular 4, and a prod build uses webpack/uglify for tree shaking and minifying the output; unfortunately, uglify doesn't like ES6. It's too hard to fix that config to make it understand ES6, so I took a shortcut and downgraded the other libraries. I guess @Arachnid faced the same issue; it seems to be a known issue with uglify.
To alleviate these concerns, I've now added a build step to eth-ens-namehash so it can resume using advanced language features, while still exporting an ES5 module: https://github.com/danfinlay/eth-ens-namehash/pull/7
eth-ens-namehash@2.0.1
@manosamy regarding the per-character information: the link you gave appears to be just a dump of the same character information in different formats. What is it that you would want to do with the info? Perhaps there could be some upstream processing to provide you with more useful info.
One of the users of my site brought the name amazon%E2%80%8B.eth to my attention. It looks and feels like a genuine 6-character ENS name, but it has a zero-width space as its 7th character. It easily passes the smell test on Etherscan and MyEtherWallet. To reproduce: go to https://enslisting.com, search for amazon, and copy the matching name (it will appear to have only 6 characters but has a non-printable 7th one). Paste it into etherscan.io/enslookup and you will see that it is a registered name. It would also clear the "InvalidateName" check ((strlen(unhashedName) > 6) throw), since it counts as 7 characters. If you paste it into MEW and then delete the .eth, MEW will accept it as a valid name but then suggest that the name is not taken yet; I guess it strips that character somewhere along the way.
While I appreciate the out of the box thinking of the user who registered this, it poses a serious threat. Imagine coindash actually did post an ens name for their ICO, and it was hacked and replaced with "coindash.eth" on their website, anyone copy pasting that name, or clicking that link has no way of knowing even if they were visually alert. All of today's client WALLETs would vouch for that name.
At the very minimum, all clients should check for this character and treat any name containing it as invalid. Potentially, in the next iteration of the registrar, these characters should be disallowed.
PS: If you didn't notice, copy and paste the coindash name in quotes into the text editor of your choice and inspect it. It is an infected name with a ZWS character at the end; you can't tell just by looking.
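As a starting point for the client-side validation suggested above, here is a small Python sketch that flags invisible characters in a name. It treats the Unicode general categories Cf (format), Cc (control) and Zs/Zl/Zp (separators) as suspicious; that category list is my assumption about what a client might want to reject, not an ENS rule, and the function name `invisible_chars` is made up.

```python
import unicodedata
from urllib.parse import unquote

# Assumed set of general categories that should never appear in a name.
SUSPICIOUS_CATEGORIES = {"Cf", "Cc", "Zs", "Zl", "Zp"}

def invisible_chars(name: str) -> list:
    """Return (index, code point) pairs for invisible or separator characters."""
    hits = []
    for i, ch in enumerate(name):
        if unicodedata.category(ch) in SUSPICIOUS_CATEGORIES:
            hits.append((i, f"U+{ord(ch):04X}"))
    return hits

# Decode the percent-encoded name from the report above.
name = unquote("amazon%E2%80%8B.eth")
label = name.rsplit(".", 1)[0]

print(len(label))             # 7, although only 6 characters are visible
print(invisible_chars(name))  # [(6, 'U+200B')]
```

A client could refuse to resolve, or at least loudly warn about, any name for which this list is non-empty.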