spectrumjade opened 8 years ago
This problem actually comes out of httplib, and it represents a very real limitation of the httplib ecosystem (of which requests is a part).
When creating a tunnel, httplib calls into its `_tunnel()` method when we attempt to connect. The problem here, as you can see in that code, is that the 407 response never makes it out of httplib: it doesn't even try to parse the headers. That means we can't easily find the 407 challenge header.
Changing this behaviour is possible, but it would require urllib3 to do yet more to work around httplib, which I'm increasingly uncomfortable with. It's already extremely difficult to replace httplib inside urllib3 because urllib3 knows a great deal about httplib's internal implementation details: I'm highly reluctant to add more.
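To make the failure mode concrete, here is a simplified, illustrative sketch of what the pre-3.12 `_tunnel()` logic does with a CONNECT response. The names and exact parsing are mine, not CPython's verbatim code, but the behaviour is the same: the headers, including the `Proxy-Authenticate` challenge, are read off the socket and discarded before the OSError is raised.

```python
from io import BytesIO

def tunnel_sketch(sock_file):
    # Read the CONNECT status line, e.g. b"HTTP/1.1 407 ...\r\n".
    line = sock_file.readline()
    version, code, message = line.split(b' ', 2)
    if int(code) != 200:
        # The response headers, including Proxy-Authenticate, are
        # read off the wire and thrown away here:
        while sock_file.readline() not in (b'\r\n', b'\n', b''):
            pass
        raise OSError("Tunnel connection failed: %d %s"
                      % (int(code), message.decode().strip()))

response = BytesIO(
    b'HTTP/1.1 407 Proxy Authentication Required\r\n'
    b'Proxy-Authenticate: Digest realm="proxy", nonce="abc"\r\n'
    b'\r\n'
)
try:
    tunnel_sketch(response)
except OSError as exc:
    # Only the status line survives; the digest challenge is gone.
    print(exc)
```

By the time the caller sees the OSError, there is nothing left to parse a challenge out of, which is why the auth hook approach cannot work for tunneled requests.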
Thanks for that explanation @Lukasa
I'm currently facing the same situation and also found out about the httplib issue, but couldn't think of any workaround. Are there any Python HTTP/S libraries that properly support multiple proxy authentication methods?
@sylencecc httplib2 might, by virtue of not being built on top of httplib. Otherwise, proxies are pretty poorly supported I'm afraid. =(
I know that curl supports this authentication and therefore pycurl probably does as well. Unfortunately pycurl isn't as elegant.
Now urllib3 v2 is coming, which is going to drop httplib, so let's make it possible!
With my 3 patches:
the following test case runs fine:
```python
import requests
from requests_toolbelt.auth.http_proxy_digest import HTTPProxyDigestAuth

proxy = 'http://ip:port'
proxies = {
    "http": proxy,
    "https": proxy,
}

def req(url):
    auth = HTTPProxyDigestAuth("user", "pass")
    r = requests.get(url, proxies=proxies, auth=auth)
    print(r.json())

req("http://httpbin.org/ip")
req("https://httpbin.org/ip")
```
This is a very old but still relevant issue. `http.client` in Python 3.12+ has made certain changes that preserve the header information even when the connection fails: `_tunnel()` still raises OSError, but the headers can be accessed via `self.get_proxy_response_headers()` if the OSError is caught in the calling function (`urllib3.connection.HTTPSClient`). I think the changes to be made to urllib3/requests are now much smaller.
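A minimal sketch of the pattern described above: attempt the tunnel, catch the OSError, and look for the 407 challenge in the headers that newer Pythons preserve. The proxy address is a placeholder, and the accessor is probed defensively since I haven't verified it against every 3.12+ point release.

```python
import http.client

def probe_proxy_challenge(proxy_host, proxy_port, target_host):
    # proxy_host/proxy_port are placeholders; substitute a real proxy.
    conn = http.client.HTTPSConnection(proxy_host, proxy_port, timeout=5)
    conn.set_tunnel(target_host, 443)
    try:
        conn.connect()  # _tunnel() raises OSError on a non-200 CONNECT reply
    except OSError as exc:
        try:
            # Python 3.12+ keeps the proxy's response headers around.
            headers = conn.get_proxy_response_headers()
        except AttributeError:
            headers = None  # older Pythons discard them entirely
        if headers is not None:
            print("407 challenge:", headers.get("Proxy-Authenticate"))
        print("tunnel failed:", exc)
        return None
    return conn

# 'proxy.invalid' never resolves, so this exercises only the error path.
probe_proxy_challenge("proxy.invalid", 3128, "httpbin.org")
```

With a real proxy that replies 407, the `Proxy-Authenticate` value recovered here is exactly what a digest auth implementation would need to build credentials and retry the CONNECT.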
Proxy Digest authentication seems to work fine with unencrypted http, but https requests (which are made using a CONNECT tunnel through the proxy) fail with an exception.
How to reproduce (I'm using requests 2.6.0 and requests-toolbelt 0.6.0):
The 407 response to the CONNECT request should be hooked in the same fashion as unencrypted requests.
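For reference, the "hook" here is a requests response hook: for plain HTTP, an auth object modeled on requests' own `HTTPDigestAuth.handle_401` pattern can catch the 407, build credentials from the `Proxy-Authenticate` challenge, and resend. A sketch of that pattern follows (`build_credentials` is a hypothetical placeholder; the digest computation itself is elided). For HTTPS the hook never fires, because the 407 arrives during the CONNECT and is swallowed inside httplib.

```python
import requests
from requests.auth import AuthBase

class ProxyAuthSketch(AuthBase):
    """Resend a request with Proxy-Authorization after a 407.
    Modeled on the handle_401 flow in requests.auth.HTTPDigestAuth."""

    def __call__(self, r):
        r.register_hook('response', self.handle_407)
        return r

    def handle_407(self, r, **kwargs):
        if r.status_code != 407:
            return r
        challenge = r.headers.get('Proxy-Authenticate', '')
        r.content      # consume the body so the connection can be reused
        r.close()
        prep = r.request.copy()
        prep.headers['Proxy-Authorization'] = self.build_credentials(challenge)
        _r = r.connection.send(prep, **kwargs)
        _r.history.append(r)
        _r.request = prep
        return _r

    def build_credentials(self, challenge):
        # Hypothetical placeholder: a real implementation computes the
        # Digest response from the challenge, username, and password here.
        raise NotImplementedError
```

This works for unencrypted requests precisely because the 407 reaches the hook as an ordinary response; over a tunnel it never does.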
What's curious is that the documentation includes an example with an https site.