Open porterjamesj opened 10 years ago
The browser does the appropriate mangling of the URI which we probably would have to implement.
For now we could just add a URI constructor that converts String inputs to ASCIIString? It won't actually handle Unicode, but it will at least allow one to pass strings that are typed as UTF8 but conform to the ASCII character set to Requests.get, etc.
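A minimal sketch of that idea in current Julia, where ASCIIString and UTF8String no longer exist and Base.ascii does the check. The AsciiURI type is hypothetical, made up for illustration; it is not the URI type from Requests.jl or HTTP.jl.

```julia
# Hypothetical wrapper type, for illustration only.
struct AsciiURI
    value::String
    # Accept any string type (String, SubString, ...) and reject non-ASCII
    # content. Base.ascii throws an ArgumentError on non-ASCII input.
    AsciiURI(s::AbstractString) = new(ascii(String(s)))
end

# A SubString whose content happens to be ASCII is accepted.
u = AsciiURI(SubString("http://example.com/a", 1, 18))
println(u.value)

# Non-ASCII input still fails, as noted above.
try
    AsciiURI("http://☃.net")
catch err
    println("rejected: ", typeof(err))
end
```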
That sounds reasonable.
I'll make a PR.
I think the hostname part needs to be encoded in punycode (http://www.faqs.org/rfcs/rfc3492.html) and the path as percent-escaped UTF8.
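The path half is the easy one. Here is a rough sketch, assuming the path should keep only the RFC 3986 unreserved characters and "/" as-is and escape every other UTF-8 byte. The function name percent_encode_path is hypothetical and not an existing Requests.jl or HTTP.jl API. Note that it would also re-escape a "%" in a path that is already encoded.

```julia
# Hypothetical helper: percent-escape the UTF-8 bytes of a URI path.
function percent_encode_path(path::AbstractString)
    io = IOBuffer()
    for byte in codeunits(String(path))
        c = Char(byte)
        # Keep ASCII letters, digits, the unreserved marks and the separator.
        if byte < 0x80 && (isletter(c) || isdigit(c) || c in "-._~/")
            write(io, byte)
        else
            # Everything else becomes %XX with uppercase hex digits.
            print(io, '%', uppercase(string(byte, base=16, pad=2)))
        end
    end
    return String(take!(io))
end

println(percent_encode_path("/snow/☃"))  # prints /snow/%E2%98%83
```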
punycode is fairly complex - see an example implementation. Is anyone up for having a go at it?
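For a sense of the size of the job, here is a sketch of just the encoding direction of RFC 3492 in Julia. All names (punycode_encode, to_ascii_host, the constants) are hypothetical. It assumes the input is already normalized and skips the IDNA mapping and validation steps as well as overflow checks, so it is a starting point rather than a complete implementation.

```julia
# Punycode parameters from RFC 3492, section 5.
const PUNY_BASE = 36
const PUNY_TMIN = 1
const PUNY_TMAX = 26
const PUNY_SKEW = 38
const PUNY_DAMP = 700
const PUNY_INITIAL_BIAS = 72
const PUNY_INITIAL_N = 128

# Digits 0..25 map to 'a'..'z', digits 26..35 map to '0'..'9'.
digit_char(d) = d < 26 ? 'a' + d : '0' + (d - 26)

# Bias adaptation, RFC 3492 section 6.1.
function adapt(delta, numpoints, firsttime)
    delta = firsttime ? delta ÷ PUNY_DAMP : delta ÷ 2
    delta += delta ÷ numpoints
    k = 0
    while delta > ((PUNY_BASE - PUNY_TMIN) * PUNY_TMAX) ÷ 2
        delta ÷= (PUNY_BASE - PUNY_TMIN)
        k += PUNY_BASE
    end
    return k + ((PUNY_BASE - PUNY_TMIN + 1) * delta) ÷ (delta + PUNY_SKEW)
end

# Encode one label (no dots), RFC 3492 section 6.3.
function punycode_encode(label::AbstractString)
    cps = [Int(c) for c in label]
    out = IOBuffer()
    basic = filter(<(128), cps)
    foreach(cp -> print(out, Char(cp)), basic)  # copy the ASCII code points
    b = length(basic)
    h = b
    b > 0 && print(out, '-')
    n, delta, bias = PUNY_INITIAL_N, 0, PUNY_INITIAL_BIAS
    while h < length(cps)
        m = minimum(cp for cp in cps if cp >= n)  # next code point to insert
        delta += (m - n) * (h + 1)
        n = m
        for cp in cps
            if cp < n
                delta += 1
            elseif cp == n
                q, k = delta, PUNY_BASE
                while true  # emit delta as a variable-length integer
                    t = k <= bias ? PUNY_TMIN :
                        (k >= bias + PUNY_TMAX ? PUNY_TMAX : k - bias)
                    q < t && break
                    print(out, digit_char(t + (q - t) % (PUNY_BASE - t)))
                    q = (q - t) ÷ (PUNY_BASE - t)
                    k += PUNY_BASE
                end
                print(out, digit_char(q))
                bias = adapt(delta, h + 1, h == b)
                delta = 0
                h += 1
            end
        end
        delta += 1
        n += 1
    end
    return String(take!(out))
end

# Prefix non-ASCII labels with "xn--"; ASCII labels pass through unchanged.
to_ascii_host(host::AbstractString) =
    join((isascii(l) ? l : "xn--" * punycode_encode(l) for l in split(host, '.')), '.')

println(to_ascii_host("☃.net"))  # prints xn--n3h.net
```

A host converted this way is plain ASCII, so in principle it could be handed to a resolver that rejects non-ASCII hostnames.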
Roughly two years old at this point...is this still a desirable feature/anyone going to claim this?
another ~2 years later...
The URI part of this seems to currently not-fail in HTTP.jl:
julia> x = HTTP.URI("http://☃.net")
HTTP.URI("http://☃.net")
julia> HTTP.URIs.showparts(x)
HTTP.URI("http://☃.net"
scheme = "http",
userinfo = "" (absent),
host = "☃.net",
port = "" (absent),
path = "",
query = "" (absent),
fragment = "" (absent))
getaddrinfo still fails:
julia> HTTP.get("http://☃.net")
ERROR: non-ASCII hostname: ☃.net
Stacktrace:
[1] getaddrinfo(::Function, ::String) at ./socket.jl:619
And there is this: https://github.com/apricis/Punycoder.jl/blob/master/punycoder.jl
I'm not sure about the details, but it might be nice to allow URIs of non-ASCII text (i.e. use String rather than ASCIIString in the type definition). The RFC is a bit vague on this. Perhaps it isn't technically allowed, but it seems possible to encounter in the wild (e.g. ☃.net will resolve in a browser). On a practical level it's somewhat annoying to not be able to pass UTF8Strings or SubStrings thereof to methods in Requests.jl.