asmeurer opened this issue 10 years ago
This was briefly discussed in IRC, as seen here. This GitHub issue encompasses a number of problems. Let's enumerate them.
HTTPProxyAuth doesn't work for tunneling proxies. Yes, that's true, and that's because tunneling proxies are fundamentally a very different beast to your standard proxy, involving the CONNECT request and all kinds of funky nonsense. Our auth paradigm doesn't map to this special case.
My proposal here is actually to get rid of HTTPProxyAuth altogether. I hate it. We have had auth-in-the-proxy-url for a long time now (see the sketch below these points), so it's not like it's the only way to do it. Additionally, it provides a better conceptual abstraction: all auth handlers now apply auth to the origin, not to intermediaries. I like that distinction.
ConnectionError on the method call rather than getting the 407 back. This is again because of the way httplib is designed for tunneling proxies. I think we can improve this situation by getting urllib3 to throw an appropriate exception which we catch and wrap. We can't move the exception to .raise_for_status() because that would require an httplib rewrite (or something truly evil), but at least we can make it easier to work out what happened. @shazow, are you open to doing part 2?
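For reference, the auth-in-the-proxy-url approach mentioned in the first point looks roughly like this (a minimal sketch; the proxy host, port, and credentials are placeholders):

import requests

# Credentials ride along in the proxy URL itself; no auth handler needed.
proxies = {
    "http": "http://user:password@proxy.example.com:3128",
    "https": "http://user:password@proxy.example.com:3128",
}
response = requests.get("https://httpbin.org/ip", proxies=proxies)
print(response.status_code)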
Well, in this case, the response attribute of the ConnectionError is None. If it were the same as with HTTPError, I could just handle them uniformly.
I don't think it can be, logistically: we don't have a response in hand. This is all handled transparently by httplib. Because it's terrible.
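To illustrate the asymmetry being discussed, calling code ends up branching on the exception type rather than on a status code (a sketch; the URL and proxy are placeholders):

import requests
from requests.exceptions import ConnectionError, HTTPError

session = requests.Session()
session.proxies = {"http": "http://proxy.example.com:8080",
                   "https": "http://proxy.example.com:8080"}

try:
    resp = session.get("http://httpbin.org/ip")  # or an https:// URL, which tunnels
    resp.raise_for_status()
except HTTPError as e:
    # Forwarded (plain http) case: the 407 comes back as an ordinary
    # response, so the status code is available on e.response.
    if e.response.status_code == 407:
        pass  # fix the proxy credentials and retry
except ConnectionError as e:
    # Tunnelled (https) case: httplib swallows the 407 while setting up the
    # CONNECT tunnel, so e.response is None and there is no status code to
    # inspect.
    pass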
Another possibly unrelated issue (I can open a new issue). If you mutate the proxies attribute of a Session object, the new value isn't used. So if you want to redo the request with the fixed proxies ("fixed" meaning including authentication credentials), you have to do
from requests import Session
from requests.exceptions import HTTPError

def dostuff(url, session=None):
    session = session or Session()  # Actually a custom Session subclass
    ...
    try:
        # I shouldn't have to include the proxies in get() here; they are
        # already in session.proxies!
        resp = session.get(url, proxies=session.proxies)
        resp.raise_for_status()
    except HTTPError as e:
        if e.response.status_code == 407:
            # Get username and password and put them on the proxy url
            session.proxies['http'] = fixed_proxy_url
            # Try again
            dostuff(url, session=session)
And more to the point, that's a lot of logic to repeat every time I do a get (it's actually more, because I also have to check for ConnectionError). It would be nice if requests just had a way to automatically prompt the user for proxy authentication and retry the request.
But I'll be happy to just get these basic issues ironed out first.
Uh...as far as I can see on a quick code read, we absolutely pick up the proxy settings from the Session. What makes you think we're not?
As for 'prompting' for proxy auth, that's not going to happen, it's simply not requests' job.
Uh...as far as I can see on a quick code read, we absolutely pick up the proxy settings from the Session. What makes you think we're not?
My guess would be that once we have made a request with a proxy, e.g., we've used the http proxy, then if @asmeurer changes the list of proxies it isn't fixed because we're using the same HTTPAdapter, which hits lines 209 through 215. Notice that if http is already in self.proxy_manager we don't do anything. Since it is, the new proxy is ignored. I haven't attempted to test if that's in fact the case, but that's my first guess as to what might be causing the behaviour that @asmeurer is seeing.
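A simplified illustration of the caching pattern being guessed at above; this is not the actual requests adapter code, just the shape of it:

class CachingAdapter:
    # Stand-in for requests' HTTPAdapter, reduced to the caching behaviour
    # described above.
    def __init__(self):
        self.proxy_manager = {}

    def get_connection(self, scheme, proxy_url):
        # Once an entry for this scheme exists it is reused as-is, so a
        # proxy URL changed later on the Session never reaches it.
        if scheme not in self.proxy_manager:
            self.proxy_manager[scheme] = {"proxy_url": proxy_url}  # stand-in for a pool manager
        return self.proxy_manager[scheme]

adapter = CachingAdapter()
adapter.get_connection("http", "http://old-proxy:8080")
# The new proxy URL is silently ignored because "http" is already cached:
print(adapter.get_connection("http", "http://user:password@new-proxy:8080"))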
As for 'prompting' for proxy auth, that's not going to happen, it's simply not requests' job.
I agree.
As for 'prompting' for proxy auth, that's not going to happen, it's simply not requests' job.
That's unfortunate. It seems that a lot of http libraries take this view, which is why every application that connects to the internet has to implement its own proxy authentication (I don't use proxies, but I've noticed this pattern).
Any thoughts on how requests could make this easier? A callback mechanism?
Requests should absolutely allow you to apply auth if you hadn't before.
As for making it easier to make the same request with auth for the proxy, the reason HTTP libraries generally don't is that we don't know what people are going to want. In requests' case, this is sufficiently unusual that adding a mechanism would have two effects:
For that reason, we assume that users who need to re-make a request will do it themselves.
The auth stuff makes that harder, so we should fix it, but otherwise I think that's the end of it.
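In application code, that re-make-the-request logic can live in a small wrapper, something like the sketch below (it assumes the plain-HTTP-proxy case where the 407 actually comes back as a response, and the prompting strategy is entirely up to the application):

from getpass import getpass

import requests


def get_with_proxy_auth_retry(url, proxies):
    # First attempt with whatever proxy settings the application already has.
    resp = requests.get(url, proxies=proxies)
    if resp.status_code == 407:
        # Prompting for credentials is the application's job, not requests'.
        username = input("Proxy username: ")
        password = getpass("Proxy password: ")
        # Assumes the existing proxy URL has no credentials in it yet.
        scheme, _, host = proxies["http"].partition("://")
        proxies = dict(proxies, http="%s://%s:%s@%s" % (scheme, username, password, host))
        resp = requests.get(url, proxies=proxies)
    resp.raise_for_status()
    return resp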
So I want to be sure I'm understanding everyone's concerns here properly:
@asmeurer when you say you'd like requests to prompt for proxy auth, do you mean you'd like us to literally use raw_input (or input in the case of Python 3)? I'm pretty sure that's not what you want and that's not something we'll ever support. Further, I'm not quite certain how we would properly implement a callback system for this particular case, since the only other system like it in requests relies on having a response, which we don't have in this case.
That said, we've had a troubled history (which @Lukasa knows much better than I) dealing with HTTPS Proxy Authentication. If there were a better way of handling them, we would have already implemented it (I'd like to think).
This discussion should remain in this issue. Your other problem, @asmeurer (in which mutating a Session's proxies does not affect subsequent requests), should be a separate issue. I'm trying to think of a good way to handle that case since I think I've located the problem above.
Sure, I'll open a new issue for the Session thing.
The situation with HTTPS proxy authentication remains like this: we don't use httplib to send that first request in the normal way, but by using its tunnel functionality.
I appreciate this is unfortunate, but there's simply no way around it: it needs to be done the way it's currently being done, or we need to special-case the Proxy Auth handler to mean something special. I don't want to do that because the Proxy Auth handler is semantically out of place: it applies authentication to the proxy, not to the origin. Given that you may also want to authenticate the origin, and that we don't allow multiple auth handlers, applying the Proxy Auth handler impedes your ability to do any other kind.
Best to have proxy authentication credentials come in on the proxies dictionary.
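For example, injecting credentials into an existing proxy URL before it goes into the proxies dictionary can be done with the standard library (shown with the Python 3 urllib.parse names; the host and credentials are placeholders):

from urllib.parse import quote, urlsplit, urlunsplit

import requests


def add_proxy_credentials(proxy_url, username, password):
    # Rebuild the URL as scheme://user:pass@host:port, percent-encoding the
    # credentials so characters like '@' or ':' don't break the URL.
    # Assumes the original URL carries no credentials of its own.
    parts = urlsplit(proxy_url)
    netloc = "%s:%s@%s" % (quote(username, safe=""), quote(password, safe=""), parts.netloc)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


proxies = {"https": add_proxy_credentials("http://proxy.example.com:3128", "user", "secret")}
resp = requests.get("https://httpbin.org/ip", proxies=proxies)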
Ok, I agree that this is the wrong place for it. ProxyAuth did seem strange to me, for the reason you cited.
Not sure if this discussion is resolved or not, but please ping me again if I still have an action item/question. I will be home tomorrow for more in-depth reading. :)
@Lukasa If this is still an open question, special-handling 407 in urllib3 sounds sensible if we can do it in a low-impact way.
My understanding of what is going on is limited here, but it didn't seem to me like there were any concrete action items yet.
Yeah, we're still trying to drill down into exactly what's happening where.
Hello guys. I tried to look around in the repo to find some answers but wasn't lucky. Is there any workaround to do a get request using an HTTPS proxy?
I'm getting this error
requests.exceptions.ProxyError: HTTPSConnectionPool(host='httpbin.org', port=443): Max retries exceeded with url: /ip (Caused by ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x0000021DB2CDF668>: Failed to establish a new connection: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond',)))
and this is what I'm trying to do to test.
import requests

url = 'https://httpbin.org/ip'
prox = 'https://89.36.195.238:35328'
proxies = {
    "https": prox
}
response = requests.get(url, proxies=proxies)
print(response.json())
I'm trying to add support for proxies to conda, in particular, automatically prompting for a username and password on a 407. For http proxies, it's all fine. I can detect the 407 error code on HTTPError from raise_for_status and inject HTTPProxyAuth and try again.
But for https, it raises ConnectionError on get. Adding HTTPProxyAuth does not work. I have to parse the URL and add it in as https://username:password@proxy:port. urlparse does not make this particularly easy.
Furthermore, the only way I can tell to actually detect that the ConnectionError is a 407 is to do