medialize / URI.js

Javascript URL mutation library
http://medialize.github.io/URI.js/
MIT License

Incorrect conversion of IDNA hostname #157

Open hsalkaline opened 10 years ago

hsalkaline commented 10 years ago

URI.js's Punycode support converts IDNA hostnames incorrectly.

For example: URI('http://www.Äffchen.com/').normalizeHostname().hostname() == "www.xn--ffchen-vna.com"

The correct conversion (as performed by browsers) is the following: var a = document.createElement('a'); a.href = 'http://www.Äffchen.com/'; a.hostname == "www.xn--ffchen-9ta.com"
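For anyone who wants to reproduce the browser result without a DOM, here is a minimal sketch. It assumes a runtime that exposes the WHATWG `URL` global (modern browsers, Node.js); it is not part of URI.js.

```javascript
// Assumption: the WHATWG URL global is available (modern browsers, Node.js).
// The URL parser lowercases the hostname before Punycode-encoding it,
// which is why the result differs from encoding the uppercase "Ä" directly.
const input = 'http://www.Äffchen.com/';
const hostname = new URL(input).hostname;

// Per the report above, browsers yield "www.xn--ffchen-9ta.com" here.
console.log(hostname);
```

The two outputs differ only because "Ä" (U+00C4) and "ä" (U+00E4) are different code points, and Punycode encodes whichever one it is given.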

ooxi commented 10 years ago

Why would the correct conversion be www.xn--ffchen-9ta.com? phlyLabs Punycode also outputs www.xn--ffchen-vna.com. I don't think the lower case conversion is necessary.

rodneyrehm commented 10 years ago

I wonder what @mathiasbynens thinks about this

hsalkaline commented 10 years ago

After some searching I found the following (please correct me if I misunderstood something):

Does punycode.js allow choosing how the domain is converted? And if it does, shouldn't URI.js expose this in its API?

mathiasbynens commented 10 years ago

See https://github.com/mathiasbynens/todo/issues/9. This is not something that belongs in Punycode.js as it’s not part of Punycode. It’s part of the preprocessing that happens before the domain name is Punycoded.

As per @annevk’s http://annevankesteren.nl/2014/06/url-unicode, http://unicode.org/reports/tr46/ should be used. It’s compatible with IDNA2003, but uses IDNA2008 data.
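To illustrate what "preprocessing before the domain name is Punycoded" means, here is a rough sketch. The helpers `preprocessLabel` and `preprocessHostname` are hypothetical names, not part of URI.js or Punycode.js, and lowercasing plus NFC normalization is only a crude stand-in for the full TR46 mapping table.

```javascript
// Hypothetical helpers for illustration only; NOT a TR46 implementation.
// TR46 uses a dedicated mapping table; lowercasing followed by NFC
// normalization merely approximates it for simple cases like this one.
function preprocessLabel(label) {
  return label.toLowerCase().normalize('NFC');
}

function preprocessHostname(hostname) {
  // Apply the approximation to each dot-separated label.
  return hostname.split('.').map(preprocessLabel).join('.');
}

const mapped = preprocessHostname('www.Äffchen.com');
console.log(mapped); // "www.äffchen.com"
```

Feeding the mapped hostname to a Punycode encoder, rather than the raw input, is what would bring the result in line with browsers for this example.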

annevk commented 10 years ago

Note that TR46 should be used with the settings noted in http://url.spec.whatwg.org/. We want a particular flavor of TR46.