msabramo / requests-unixsocket

Use requests to talk HTTP via a UNIX domain socket
Apache License 2.0
207 stars 29 forks

Error if the path to the socket is greater than 70ish characters #23

Closed tjj5036 closed 7 years ago

tjj5036 commented 8 years ago

I ran into an issue where I had a path to a socket that looked like this:

 %2Fstorage%2Ftestroot%2F123456789%2F123456789111%2Fdata%2Flocal_auth.sock

Which is 73 characters long. When prefixed with http+unix, the full path looks like:

 http+unix://%2Fstorage%2Ftestroot%2F123456789%2F123456789111%2Fdata%2Flocal_auth.sock

When passed to requests, the following error occurs:

  File "/usr/local/updated-openssl/lib/python3.4/site-packages/requests_unixsocket/__init__.py", line 60, in post
  File "/usr/local/updated-openssl/lib/python3.4/site-packages/requests_unixsocket/__init__.py", line 46, in request
  File "/usr/local/updated-openssl/lib/python3.4/site-packages/requests/sessions.py", line 461, in request
  File "/usr/local/updated-openssl/lib/python3.4/site-packages/requests/sessions.py", line 394, in prepare_request
  File "/usr/local/updated-openssl/lib/python3.4/site-packages/requests/models.py", line 295, in prepare
  File "/usr/local/updated-openssl/lib/python3.4/site-packages/requests/models.py", line 364, in prepare_url
requests.exceptions.InvalidURL: URL has an invalid label.

Which is in turn triggered by this:

Traceback (most recent call last):
  File "/usr/local/updated-openssl/lib/python3.4/site-packages/requests/models.py", line 362, in prepare_url
UnicodeError: encoding with 'idna' codec failed (UnicodeError: label empty or too long)

Digging around in the source for models.py, it looks like it extracts the host (which is the path to the socket above), and then performs:

host = host.encode('idna').decode('utf-8')

This fails under the IDNA rules (RFC 3490), which are summarized as follows:

The conversions between ASCII and non-ASCII forms of a domain name are accomplished by algorithms called ToASCII and ToUnicode. These algorithms are not applied to the domain name as a whole, but rather to individual labels. For example, if the domain name is www.example.com, then the labels are www, example, and com. ToASCII or ToUnicode are applied to each of these three separately.

The details of these two algorithms are complex, and are specified in RFC 3490. The following gives an overview of their function.

ToASCII leaves unchanged any ASCII label, but will fail if the label is unsuitable for the Domain Name System. If given a label containing at least one non-ASCII character, ToASCII will apply the Nameprep algorithm, which converts the label to lowercase and performs other normalization, and will then translate the result to ASCII using Punycode[16] before prepending the four-character string "xn--".[17] This four-character string is called the ASCII Compatible Encoding (ACE) prefix, and is used to distinguish Punycode encoded labels from ordinary ASCII labels. The ToASCII algorithm can fail in several ways; for example, the final string could exceed the 63-character limit of a DNS name. A label for which ToASCII fails cannot be used in an internationalized domain name.
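The 63-character label limit mentioned in the quoted text can be reproduced without requests at all. The following is a minimal sketch, assuming Python 3's built-in idna codec (the one invoked by the host.encode('idna') line above); the helper name idna_accepts is made up for illustration. Note that the codec splits the host on dots, so the label that overflows here is everything before .sock:

```python
# Host part of the http+unix URL from the report above.
SOCKET_HOST = (
    "%2Fstorage%2Ftestroot%2F123456789%2F123456789111%2Fdata%2Flocal_auth.sock"
)


def idna_accepts(host):
    """Return True if Python's built-in 'idna' codec can encode the host."""
    try:
        host.encode("idna")
    except UnicodeError:
        # Raised, for example, when a dot-separated label is empty or
        # longer than 63 characters.
        return False
    return True


# The codec validates each dot-separated label, not the whole string.
label_lengths = [len(label) for label in SOCKET_HOST.split(".")]

print(len(SOCKET_HOST))           # 73
print(label_lengths)              # [68, 4]
print(idna_accepts(SOCKET_HOST))  # False
```

A shorter host such as %2Fvar%2Flib%2Flxd%2Funix.socket passes this particular check, so a failure with a path that short would have to come from something other than the label length.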

My interpretation of the error is that if the path to the socket is rather long, this fails. Is there a suggested workaround other than symlinking the socket file to something short?
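For reference, the symlink workaround mentioned above could look roughly like this. It is only a sketch, not something provided by requests-unixsocket: the function name short_socket_url and the link location are hypothetical, and it assumes a POSIX system where os.symlink is available and a link path short enough that no dot-separated label of the encoded host exceeds 63 characters.

```python
import os
from urllib.parse import quote


def short_socket_url(socket_path, link_path):
    """Symlink socket_path to link_path and return an http+unix URL for the link."""
    # Replace a stale link left over from a previous run, if any.
    if os.path.islink(link_path):
        os.unlink(link_path)
    os.symlink(socket_path, link_path)
    # Percent-encode everything that is not URL-safe, including "/",
    # so the whole path fits in the host part of the URL.
    return "http+unix://" + quote(link_path, safe="")
```

Called as short_socket_url("/storage/testroot/123456789/123456789111/data/local_auth.sock", "/tmp/la.sock"), this would return http+unix://%2Ftmp%2Fla.sock, whose longest label is well under the limit.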

rockstar commented 8 years ago

I have this error in pylxd, where the default path is %2Fvar%2Flib%2Flxd%2Funix.socket, so I don't think length has anything to do with the issue. In fact, something in requests 2.12 is what is causing the issue. As long as I use requests <2.12, I don't have the issue at all.

msabramo commented 7 years ago

There were problems in requests relating to its IDNA encoding that have since been fixed (see https://github.com/msabramo/httpie-unixsocket/issues/10#issuecomment-280892450). You might want to try this again with the latest version of requests. I don't think there's anything that I can do in this project to help with this issue.