ethjs / ethjs-ens

An Ethereum Name Service interface module built on EthJS

IDN Name Handling #14

Open 0xc1c4da opened 6 years ago

0xc1c4da commented 6 years ago

Existing browsers have sets of conditionals for performing name lookups and partially mitigating homograph attacks. In their case, DNS is ASCII-based, so they can fall back to Punycode.

Since ENS is Unicode-based, it's unclear how clients like Metamask, MEW or Status should handle this problem.

https://www.chromium.org/developers/design-documents/idn-in-google-chrome
https://wiki.mozilla.org/IDN_Display_Algorithm
http://www.unicode.org/reports/tr39/#Restriction_Level_Detection

danfinlay commented 6 years ago

I heard an interesting idea at the ENS workshop yesterday, where different character sets could be displayed in different colors. That might be ugly, but could be a nice security measure. Of course, since it's about display, that idea would be a UI feature, and wouldn't really have a place in this module.

mcdee commented 6 years ago

I believe that there are a few steps that we could take.

The first is to restrict names to characters that come from a single language, using block identifiers as per https://en.wikipedia.org/wiki/Unicode_block. This avoids the issue where someone attempts to spoof a name using a single homograph in an otherwise-Latin name (e.g. using аcompany with a Cyrillic 'а' rather than the Latin acompany). The rules that have been found workable in practice are a bit more complex; they are described in the IDN display algorithm link in the OP.
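As a rough sketch of this first step (not part of ethjs-ens, and much simpler than the browser rules linked in the OP), the check could look something like the following. It assumes that the first word of a character's Unicode name is a good enough stand-in for its script; the helper names `rough_script`, `scripts_in_label` and `is_single_script` are hypothetical.

```python
import unicodedata


def rough_script(ch):
    """Approximate a character's script from the first word of its Unicode name.

    Assumption: this is only a heuristic. A real implementation would use the
    Unicode Script property or block ranges instead.
    """
    if not ch.isalpha():
        return "COMMON"  # digits, hyphens, etc. are shared across scripts
    return unicodedata.name(ch, "UNKNOWN").split(" ")[0]


def scripts_in_label(label):
    """Return the set of (approximate) scripts used by the letters in a label."""
    return {rough_script(ch) for ch in label} - {"COMMON"}


def is_single_script(label):
    """True if every letter in the label comes from the same script."""
    return len(scripts_in_label(label)) <= 1


# "\u0430" is CYRILLIC SMALL LETTER A, which renders like the Latin "a".
print(is_single_script("acompany"))        # True
print(is_single_script("\u0430company"))   # False: mixes Cyrillic and Latin
```

Note that a label written entirely in Cyrillic look-alikes, such as `"\u0455\u0440\u0430\u0441\u0435"`, still passes this check, which is why a second step is needed.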

The second is to use some sort of reduction-to-Latin. In this step, names such as 'ѕрасе.eth' would be reduced to their Latin equivalent, 'space.eth'. At this point the reduced name can be checked to see if it resolves to an address (and specifically a different address from the non-Latin version) and, if so, the original name can be flagged as suspicious.
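A minimal sketch of this second step, under a few stated assumptions: the `CONFUSABLES` table below is a tiny hand-written sample (a real one would more likely be derived from the Unicode confusables data referenced by TR39), and `resolve` is a hypothetical callable that returns an address for a name, or `None` if the name does not resolve.

```python
# Tiny illustrative mapping from Cyrillic look-alikes to Latin letters.
# Assumption: hand-picked for this example only, nowhere near complete.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0455": "s",  # CYRILLIC SMALL LETTER DZE
}


def reduce_to_latin(name):
    """Replace known look-alike characters with their Latin counterparts."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in name)


def is_suspicious(name, resolve):
    """Flag a name whose Latin reduction resolves to a different address.

    `resolve` is a hypothetical lookup: name -> address, or None.
    """
    reduced = reduce_to_latin(name)
    if reduced == name:
        return False  # nothing was reduced, so there is nothing to compare
    reduced_addr = resolve(reduced)
    return reduced_addr is not None and reduced_addr != resolve(name)


# Stand-in registry for the example; the addresses are made up.
registry = {
    "space.eth": "0xAAA",
    "\u0455\u0440\u0430\u0441\u0435.eth": "0xBBB",  # all-Cyrillic look-alike
}
print(reduce_to_latin("\u0455\u0440\u0430\u0441\u0435.eth"))              # space.eth
print(is_suspicious("\u0455\u0440\u0430\u0441\u0435.eth", registry.get))  # True
```

Passing the resolver in as a parameter keeps the check independent of how names are actually looked up.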

The issue here is that these restrictions are tighter than those for registering a .eth domain, so they will cause some registered names to be unresolvable. Of course, that's kind of the point of doing this, but we have to consider that some valid names might end up being unresolvable. I would suggest that we start off handling this with PRs to whatever specification we put together, either to upgrade the algorithm or to add to a whitelist.

It's also worth pointing out that these are a subset of an idea that I'm working on that should provide more validation points and a clearer indication of whether a name, and its resolved address, should be considered valid. I'll try to get some more details of this out in the next week.