Zero Width Space Chars in names - Spoofing risk

manosamy commented 7 years ago

One of the users of my site brought the name amazon%E2%80%8B.eth to my attention. It looks and feels like a genuine 6-character ENS name, but it has a zero width space as its 7th character. It easily passes the smell test on etherscan.io and MyEtherWallet. To reproduce: go to https://enslisting.com, search for amazon, and copy the matching name (it appears to have only 6 characters but has a non-printable 7th). Paste it into etherscan.io/enslookup and you will see that it is a registered name. It would also clear the "InvalidateName" check ((strlen(unhashedName) > 6) throw). If you paste it into MEW and then delete the .eth, MEW accepts it as a valid name but then reports that the name is not taken yet; I guess it strips that character somewhere along the way.

While I appreciate the out of the box thinking of the user who registered this, it poses a serious threat. Imagine coindash actually did post an ens name for their ICO, and it was hacked and replaced with "coindash.eth" on their website, anyone copy pasting that name, or clicking that link has no way of knowing even if they were visually alert. All of today's client WALLETs would vouch for that name.

At the very minimum, all clients should check for this character and treat any name containing it as invalid. Potentially, the next iteration of the registrar should disallow these characters.

PS: If you didn't notice, copy and paste the coindash name in quotes above into the text editor of your choice and inspect it. It is an infected name with a ZWS char at the end; you can't tell just by looking.
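
For illustration, a minimal Python sketch of the client-side check suggested above, using only the standard library. The function name `find_invisible_chars` is hypothetical, and treating the Unicode general category `Cf` (format characters) as "invisible" is an assumption made for this example; it is not the full UTS46 rule set.

```python
import unicodedata
from urllib.parse import unquote


def find_invisible_chars(name: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for every format-category (Cf) character in name."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(name)
        if unicodedata.category(ch) == "Cf"
    ]


# The percent-encoded name from the report decodes to "amazon" + U+200B + ".eth".
spoofed = unquote("amazon%E2%80%8B.eth")
print(len(spoofed.split(".")[0]))          # 7 code points, although only 6 are visible
print(find_invisible_chars(spoofed))       # [(6, 'U+200B')]
print(find_invisible_chars("amazon.eth"))  # []
```

A client could refuse to proceed, or show a warning, whenever the returned list is non-empty.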

Arachnid commented 7 years ago

As discussed in the Implementer's guide (http://docs.ens.domains/en/latest/implementers.html#normalising-and-validating-names), it's necessary for resolvers to normalise names properly according to the rules; if they do so, names like this will never be resolved to, and so are harmless. This is a bug with enslisting failing to follow the rules correctly.
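
A rough Python sketch of why normalisation makes such names harmless. It uses `encodings.idna.nameprep` from the standard library, which implements the older IDNA2003 Nameprep profile; it is only a stand-in here for the UTS46 processing that the guide actually asks for, and the helper name `normalise` is hypothetical.

```python
from encodings.idna import nameprep  # IDNA2003 Nameprep, a stand-in for UTS46 in this sketch


def normalise(name: str) -> str:
    """Normalise each dot-separated label; raises UnicodeError on prohibited characters."""
    return ".".join(nameprep(label) for label in name.split("."))


typed = "amazon\u200b.eth"       # what the user pasted (contains a zero width space)
resolved_as = normalise(typed)   # what a normalising client would actually look up

print(resolved_as == "amazon.eth")  # True: the zero width space is mapped to nothing
print(resolved_as != typed)         # True: the spoofed registration is never looked up
```

Because the lookup is done on the normalised form, the separately registered name containing the invisible character is never reached.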

manosamy commented 7 years ago

I guess I will have to alert MEW, Etherscan, MetaMask and uPort as well then; I don't think anyone is checking for this. I will alert each of those in their own forums, thanks.

danfinlay commented 7 years ago

MetaMask uses a module we made, namehash, specifically for following the recommended guide.

Arachnid commented 7 years ago

@manosamy Apologies if I was abrupt. I'm reopening to triage support on individual clients.

manosamy commented 7 years ago

@danfinlay, @Arachnid, is there a Solidity library you are aware of? I see more and more smart contracts, like Name Bazaar, accepting name strings, so it would be nice to have a Solidity library for UTS46 verification. This poses an attack vector: someone can list a seemingly correct name for sale on Name Bazaar without going through a client at all, by talking directly to the Name Bazaar offering smart contract, and sell an imposter name. The buyer would only find out when trying to use it, by which time it would be too late.

harshjv commented 7 years ago

A spoofing attack is also possible with homoglyphs. For example, ⅿіϲrοѕοft.eth, and it is still available! I used https://atom.io/packages/mayhem to manipulate the string.

tayvano commented 7 years ago

FYI, MyEtherWallet uses the correct normalization. I would propose that a link to this thread be added to the ENS docs under normalization. For people like me (less technical), seeing the real-world examples of the things libraries prevent is very helpful when putting together our internal specs and testing, and obviously another warning or more information never hurt a dev (I don't think... 😉)

Appreciate you looking out for the community and opening this issue. Thank you.

pointtoken commented 7 years ago

How are people running normalization in browser apps? The fact that idna-uts46 doesn't transpile to ES5 is causing headaches here; see https://github.com/danfinlay/eth-ens-namehash/issues/5

manosamy commented 7 years ago

Another interesting variation spotted in enslisting: microsoft.eth

The 'o' is not really a normal o; it is "Cyrillic Small Letter O", a valid character under UTS46 rules. This listing was created by interacting directly with the enslisting smart contract, and both MEW and Etherscan show it as a valid name (correctly so, since the character is valid under UTS46 rules). Yet this is a perfect phishing target, completely undetectable by eye.

Unicode: U+043E (UTF-8 bytes 0xD0 0xBE), name: CYRILLIC SMALL LETTER O. See https://vazor.com/unicode/c043E.html and http://www.unicode.org/Public/idna/10.0.0/IdnaMappingTable.txt (look up 043E; the entry there named CYRILLIC SMALL LETTER NARROW O is U+1C82, which maps to 043E).
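
The same lookup can be done locally. This is a small Python sketch using the standard `unicodedata` module; the helper name `describe` and the output layout are made up for illustration.

```python
import unicodedata


def describe(name: str) -> list[str]:
    """One line per character: code point, UTF-8 bytes, and official Unicode name."""
    return [
        f"U+{ord(ch):04X}  {ch.encode('utf-8').hex().upper()}  "
        f"{unicodedata.name(ch, '<unnamed>')}"
        for ch in name
    ]


# "micr" + CYRILLIC SMALL LETTER O (written as an escape so it is visible) + "soft"
for line in describe("micr\u043esoft"):
    print(line)
```

The fifth line of the output reads `U+043E  D0BE  CYRILLIC SMALL LETTER O`, while every other line starts its name with `LATIN`.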

Arachnid commented 7 years ago

@manosamy Right. This is also called out in the relevant implementers' guide page, and should be handled by clients checking the alphabets used in a given name and alerting or blocking outright if they don't fall into a whitelist. Unfortunately I'm not aware of any libraries for this - I'll add that as high priority todo for the ENS org once it exists and we have resources.
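
A hedged Python sketch of the alphabet check described above. The standard library does not expose the Unicode Script property, so this example guesses the alphabet from the first word of each letter's Unicode name; that heuristic, the whitelist contents, and the function names are all assumptions for illustration, not an existing library.

```python
import unicodedata

ALLOWED_SCRIPTS = {"LATIN"}  # hypothetical whitelist; a real client would choose its own


def guess_script(ch: str) -> str:
    """Rough heuristic: the first word of the character's Unicode name."""
    return unicodedata.name(ch, "UNKNOWN").split(" ")[0]


def check_label(label: str) -> str:
    """Return 'ok', 'mixed' or 'not-whitelisted' for a single label."""
    scripts = {guess_script(ch) for ch in label if ch.isalpha()}
    if len(scripts) > 1:
        return "mixed"
    if scripts - ALLOWED_SCRIPTS:
        return "not-whitelisted"
    return "ok"


print(check_label("microsoft"))         # ok
print(check_label("micr\u043esoft"))    # mixed (Latin plus one Cyrillic letter)
print(check_label("\u043c\u0438\u0440"))  # not-whitelisted (all Cyrillic)
```

A client could block on "mixed" and merely alert on "not-whitelisted", since a name written entirely in one non-Latin alphabet is often legitimate.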

danfinlay commented 7 years ago

We had a great talk on this topic at the ENS Summit by @mcdee, and he has an API for doing analysis on this kind of character attack. We definitely do need more libraries and solutions for mitigating that kind of attack.

tayvano commented 7 years ago

possible solutions for ui / ux

Someone recommended we add blockies or something similar for ENS names. Not sure if it would work, but it would be interesting to see. Could be simple. Not that most people who get caught by phishing would notice the address icon.

Another idea I had was to color code and warn for the different Unicode blocks (http://jrgraphix.net/research/unicode.php), similar to how LastPass does it. [screenshot: photo on 11-11-17 at 3 00 pm]

Or decode all to punycode and display it always.

Arachnid commented 7 years ago

Bear in mind that unlike DNS, ENS doesn't use Punycode anywhere. I like the idea of coloring better - though it should also just put up a big red warning if alphabets are mixed.

tayvano commented 7 years ago

I think my favorites are...

Red warnings for

- mixed alphabets
- certain confusing or invisible characters
- dangerous patterns

Orange or blue "warnings" for

- Unicode characters outside certain blocks (educational over scary)
- Unicode characters with the little doodads over them (diacritic marks)

Color code it regardless.

Adjust as new attacks happen.
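
The tiers above could be sketched in Python roughly as follows. Everything here is an assumption for illustration: the function names, the use of category `Cf` for "invisible", category `Mn` after NFD decomposition for diacritics, and the name-prefix heuristic for alphabets. The "dangerous patterns" tier is left out because the thread does not define it.

```python
import unicodedata


def has_invisible(label: str) -> bool:
    """True if the label contains a format character such as a zero width space."""
    return any(unicodedata.category(ch) == "Cf" for ch in label)


def has_diacritics(label: str) -> bool:
    """Decompose first so precomposed letters expose their combining marks (Mn)."""
    decomposed = unicodedata.normalize("NFD", label)
    return any(unicodedata.category(ch) == "Mn" for ch in decomposed)


def scripts_of(label: str) -> set[str]:
    """Rough heuristic: first word of the Unicode name of each letter."""
    return {unicodedata.name(ch, "UNKNOWN").split(" ")[0] for ch in label if ch.isalpha()}


def warning_level(label: str) -> str:
    """Return 'red', 'orange' or 'none' following the tiers listed above."""
    scripts = scripts_of(label)
    if has_invisible(label) or len(scripts) > 1:
        return "red"
    if has_diacritics(label) or scripts - {"LATIN"}:
        return "orange"
    return "none"


print(warning_level("amazon\u200b"))    # red: invisible character
print(warning_level("micr\u043esoft"))  # red: mixed alphabets
print(warning_level("caf\u00e9"))       # orange: diacritic mark
print(warning_level("amazon"))          # none
```

Keeping the rules in small separate predicates makes it easier to adjust them as new attacks appear.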

danfinlay commented 7 years ago

I think it would be perfectly acceptable for most clients to simply disallow mixed alpha names, or confusing or invisible characters. Just because ENS had to allow it by necessity (namehash), doesn't mean clients have to allow malicious behavior.

It's just a matter of getting that api or library of characters to their plausible alphabets, and I recall Jim suggesting this was a difficult task to make work client-side, so it might require some API involvement. Maybe we could even build merkle-proving into the validation, to prevent a library provider from lying. Keep a hash of a recent valid library on chain, for example.
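
To make the Merkle idea concrete, here is a minimal Python sketch of how a client might check one table entry against a pinned root. It is purely illustrative: the entry format, the function names, and the use of SHA-256 (rather than the keccak256 an on-chain version would more likely use) are assumptions, the table must be non-empty, and hardening such as leaf/node domain separation is omitted.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a binary Merkle tree; an odd node is paired with itself."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[bytes, bool]]:
    """Sibling hashes from leaf to root; the bool is True when the sibling is on the right."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling > index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path from the leaf upwards and compare with the pinned root."""
    node = h(leaf)
    for sibling, sibling_is_right in proof:
        node = h(node + sibling) if sibling_is_right else h(sibling + node)
    return node == root


# Hypothetical table entries of the form b"<code point>:<alphabet>".
table = [b"006F:LATIN", b"03BF:GREEK", b"043E:CYRILLIC"]
root = merkle_root(table)        # the value that would be kept on chain
proof = merkle_proof(table, 2)   # what the API would return alongside its answer

print(verify(b"043E:CYRILLIC", proof, root))  # True
print(verify(b"043E:LATIN", proof, root))     # False: a lying provider is caught
```

The client only needs the root and a short proof per answer, not the whole character table.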

mcdee commented 7 years ago

The main issue with sending this client-side is the additional size of the code, as it's built using ICU. There might be something broadly equivalent in javascript that would allow for a realistic client deployment but I haven't looked at it.

By the way if anyone wants to poke at it there is a test API up at http://www.mcdee.net:8080/name/ - just add the name of the domain you want to check (e.g. http://www.mcdee.net:8080/name/blаckhаt ) to see it in action.

manosamy commented 7 years ago

@tayvano, how do you feel about double-entry verification for ENS names when someone gets to the transaction/contracts page via a hyperlink (like how some websites make you enter a bank account number twice, with the second box not allowing paste)? The user is forced to type in the name, and if it doesn't match, we stop. Any other params in the URL would still pass through, so it's not too burdensome; we just want to validate that they hit the right contract. If they are entering the name manually, maybe don't make them enter it twice unless the value transferred is beyond a threshold (to prevent fat-fingering a look-alike name). This serves the same purpose as my mortgage company making me enter my bank account number twice before I schedule my payments.
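
A small Python sketch of the double-entry idea, under stated assumptions: the function names and the 1.0 ETH default threshold are hypothetical, and `canonical` is a simplified stand-in (NFKC plus case folding) for proper UTS46 normalisation.

```python
import unicodedata


def canonical(name: str) -> str:
    """Simplified stand-in for UTS46 normalisation: NFKC plus case folding."""
    return unicodedata.normalize("NFKC", name).casefold()


def needs_double_entry(from_link: bool, value_eth: float, threshold_eth: float = 1.0) -> bool:
    """Ask for a retyped name when it came from a link, or when the value is large."""
    return from_link or value_eth > threshold_eth


def confirm(name_from_page: str, retyped: str) -> bool:
    """True only if the hand-typed name matches the one the page would use."""
    return canonical(name_from_page) == canonical(retyped)


print(needs_double_entry(from_link=True, value_eth=0.1))   # True
print(confirm("micr\u043esoft.eth", "microsoft.eth"))      # False: Cyrillic o in the link
print(confirm("coindash\u200b.eth", "coindash.eth"))       # False: hidden zero width space
print(confirm("Amazon.eth", "amazon.eth"))                 # True
```

Because the user types plain keyboard characters, a look-alike or invisible character in the linked name shows up as a mismatch.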

@mcdee, can you also send a character-by-character report like this one (https://vazor.com/unicode/c043E.html)? That would make it immensely useful.

manosamy commented 7 years ago

I forked @danfinlay's eth-ens-namehash for now and made it use idna-uts46-hx. I also changed one more line to downgrade from ES6 to ES5, and published it to npm as eth-ens-namehash-ms so that I can uglify without errors. I also implemented "foreign" name highlighting based on @madvas's feedback, and double entry for suspicious names.

danfinlay commented 7 years ago

What's the advantage of the idna-uts46-hx library? I'd accept a PR if it were submitted with a good reason. I'd easily have merged the ES5 one.

manosamy commented 7 years ago

idna-uts46 uses ES6 syntax (at least in one place, if not more), and so does eth-ens-namehash (in one line). I use Angular 4, and a prod build uses webpack/uglify for tree shaking and minifying the output; unfortunately, uglify doesn't like ES6. It's too hard to fix that config to make it understand ES6, so I took a shortcut and downgraded the other libraries. I guess @Arachnid faced the same issue; it seems to be a known issue with uglify.

danfinlay commented 7 years ago

To alleviate these concerns, I've now added a build step to eth-ens-namehash so it can resume using advanced language features, while still exporting an ES5 module: https://github.com/danfinlay/eth-ens-namehash/pull/7

eth-ens-namehash@2.0.1

mcdee commented 7 years ago

@manosamy regarding the per-character information: the link you gave appears to be just a dump of the same character information in different formats. What would you want to do with the info? Perhaps there could be some upstream processing to provide you with more useful info.

matthewlilley commented 6 years ago

https://github.com/MyCryptoHQ/ens-validation/pull/2

ensdomains / ens

Zero Width Space Chars in names - Spoofing risk #240
