psf / requests

A simple, yet elegant, HTTP library.
https://requests.readthedocs.io/en/latest/
Apache License 2.0
52.16k stars 9.33k forks source link

Session's Authorization header isn't sent on redirect #2949

Closed jwineinger closed 3 years ago

jwineinger commented 8 years ago

I'm using requests to hit developer-api.nest.com and setting an Authorization header with a bearer token. On some requests, that API responds with an 307 redirect. When that happens, I still need the Authorization header to be sent on the subsequent request. I've tried using requests.get() as well as a session.

I suppose I could work around this by not allowing redirects, detecting the 307 and then issuing the new request myself but I'm wondering if this is a bug. Should I expect that the Authorization header would be sent on all requests made within the context of a session?

In [41]: s = requests.Session()

In [42]: s.headers
Out[42]: {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Connection': 'keep-alive', 'User-Agent': 'python-requests/2.7.0 CPython/3.4.3 Darwin/15.2.0'}

In [43]: s.headers['Authorization'] = "Bearer <snip>"

In [45]: s.get("https://developer-api.nest.com/devices/thermostats/")
Out[45]: <Response [401]>

In [46]: s.get("https://developer-api.nest.com/devices/thermostats/")
Out[46]: <Response [200]>

In [49]: Out[45].history
Out[49]: [<Response [307]>]

In [50]: Out[46].history
Out[50]: []

In [51]: Out[45].request.headers
Out[51]: {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Connection': 'keep-alive', 'User-Agent': 'python-requests/2.7.0 CPython/3.4.3 Darwin/15.2.0'}

In [52]: Out[46].request.headers
Out[52]: {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 'Connection': 'keep-alive', 'User-Agent': 'python-requests/2.7.0 CPython/3.4.3 Darwin/15.2.0', 'Authorization': 'Bearer <snip>'}
Lukasa commented 8 years ago

Where is the redirect to?

jwineinger commented 8 years ago

Ah, a different domain. firebase-apiserver03-tah01-iad01.dapi.production.nest.com

Lukasa commented 8 years ago

Yup, so that's somewhat deliberate: we're very aggressive with stripping authorization headers when redirected to a new host. This is a safety feature to deal with CVE 2014-1829, which was caused by us persisting headers on off-host redirects.

However, from a certain perspective we've still got a bug here, because you set the Authorization header on the Session, not the request. In principle, what this means is "I don't care where the redirect goes, add the header". I still think I'd rather have this approach, which at least ensures that we're not open to any form of attack, even if it makes this specific instance somewhat trickier. However, I'm open to being convinced that we're being too paranoid here.

sigmavirus24 commented 8 years ago

However, I'm open to being convinced that we're being too paranoid here.

I'm less open to being convinced but willing to listen.

That said, a separate Auth mechanism could be written to persist such headers across allowed domains which would necessitate us doing some work in rebuild_auth.

jwineinger commented 8 years ago

I wont argue with the safety of how it works right now. It would be nice to have some mechanism to opt into the "unsafe" behavior though. Perhaps setting a base domain to persist those headers to (nest.com, in this case) or perhaps a list of domains that are OK to send them to.

sigmavirus24 commented 8 years ago

Yeah that's far more complexity than the core of requests will ever provide though. That's why I'm wondering if a separate Auth class/handler might work best for this sort of thing. I'm not convinced it will work though because I'm fairly certain that we do not unconditionally call prepare_auth.

Lukasa commented 8 years ago

It won't work in the standard model because we don't unconditionally call prepare_auth. However, a Transport Adapter could be used to fulfil this role, even if it is a slightly unusual use of that API.

sigmavirus24 commented 8 years ago

I think a TA is absolutely the wrong thing to recommend here though.

kennethreitz commented 8 years ago
sigmavirus24 commented 8 years ago

If auth is provided to a session, it should be sent for every request that session makes.

I fundamentally disagree. Sessions aren't used for a single domain, if they were, I'd have no problem with this.

Perhaps we should remove session.auth. It's not particularly useful.

I think it is useful. I think it would be better if assigning a tuple to it were not allowed though. I'd rather see a Auth class that specifies which domains to use it for. We could just adopt the AuthHandler from requests-toolbelt which allows people to specify credentials for a domain when using requests. This provides a slightly more secure way of handling Session based authentication. The downside is that it requires users opt-in to that kind of authentication though.

jtherrmann commented 8 years ago

I also need this to be fixed so I can make headers persist for redirects.

Lukasa commented 8 years ago

@jtherrmann If this is an auth header, the easiest way to work around the problem is to set a session-level auth handler that simply always puts the header you want on the request.

ethanroy commented 8 years ago

Has any progress or additional consideration been given to this? I am running in to the same issue.

Lukasa commented 8 years ago

@ethanroy No additional consideration beyond my suggestion of using a session-level auth handler, in the comment directly above yours.

GregBakker commented 8 years ago

Related: if a session redirects and strips auth, calling get again reapplies the auth and uses the cached redirect. So knock twice and you get in. Intended behaviour?

>>> s = requests.Session()
>>> s.headers.update({"Authorization": "Token {}".format(API_TOKEN)})
>>> s.get(url)

<Response [403]>

>>> s.get(url)

<Response [200]>
Lukasa commented 8 years ago

@GregBakker Yes, ish. It's a confluence of intended behaviours. However, this bug notes that the original 403 shouldn't happen.

reteptilian commented 8 years ago

@Lukasa when you say "the easiest way to work around the problem is to set a session-level auth handler," is that something that works today? Based on what I'm seeing in the code, the answer is no but your wording makes me wonder if I'm missing something. You're talking about setting the Session auth attribute, right?

Lukasa commented 8 years ago

Yeah, that should work.

eatm1 commented 8 years ago

@jwineinger so how did you end up getting around this problem? it still seems to behave the same.

technicalpickles commented 8 years ago

There's two Nest-specific workarounds.

One is to pass the auth parameter with the access_token rather than using the Authorization header. I found this on https://gist.github.com/tylerdave/409ffa08e1d47b1a1e23

Another is to save a dictionary with the headers you'd use, don't follow redirects, and then make a second request passing in the headers again:

    headers = {'Authorization': 'Bearer ' + access_token, 'Content-Type': 'application/json'}
    initial_response = requests.get('https://developer-api.nest.com', headers=headers, allow_redirects=False)
    if initial_response.status_code == 307:
        api_response = requests.get(initial_response.headers['Location'], headers=headers, allow_redirects=False)
gabriel-loo commented 7 years ago

I encountered this same problem and got around it by overriding the rebuild_auth method in a custom requests.Session implementation:

from requests import Session

class CustomSession(Session):
    def rebuild_auth(self, prepared_request, response):
        return

s = CustomSession()
s.get(url, auth=("username", "password"))
j08lue commented 6 years ago

@sigmavirus24 what is wrong with @gabriel-loo's solution? Security concerns?

sigmavirus24 commented 6 years ago

@j08lue yes. Please read the thread. There are CVE's associated with not stripping authentication before following arbitrary redirects to a new domain. Think about the problem this way:

I'm making requests to api.github.com and an attacker manages to make me follow a redirect to another-domain.com that they control and I pass along my token with write access to my repositories (including requests) then it can appear as if I'm making commits to requests when in fact they are making those commits via the API. They can include code in Requests that will weaken its security posture and possibly actively harm you. That's what could happen when you unconditionally send your authentication credentials on every redirect.

Even so, let's say the redirect isn't malicious, are you actually comfortable leaking your credentials for a service to another company or service? The original service may store confidential data for you, your customers, or something else. Even if the new domain that you've been redirected to doesn't use your credentials but potentially logs them as unexpected data, someone who attacks them and can retrieve those logs can then use your credentials against the original domain if they can puzzle out where they belong. Are you really willing to take that risk?

j08lue commented 6 years ago

Thanks for the illustration, @sigmavirus24. If this concern ultimately prohibits forwarding sensitive headers to redirects, then why is this thread still open? I could not think of a more appropriate error than the one you get (403), so there is no bug need for action here, is there? Or what did you have in mind, @Lukasa?

tlantz commented 6 years ago

I was hitting this issue recently when working with a non-public API. The security concerns totally make sense as a reason for stripping auth out on redirects. I think a solution like @gabriel-loo's is something folks can consider if they believe they're in a safe enough environment to do so. Or the session level handler. Or find another way to work around by skipping the redirect entirely as suggested above, if that's possible. So in line with the view this isn't really a bug.

However, I burned more time than I probably needed to confused about why a handful of other non-Python HTTP clients did pass on the auth header and were working fine when this was not. One suggestion: it might be nice to issue a warning via warnings here to make it more clear to callers when the header is present and being stripped. I'd imagine it's rare that this is something a caller would not want to be warned about.

sigmavirus24 commented 6 years ago

@tlantz normally that would seem pretty reasonable. Requests as a project (as well as urllib3, one of its dependencies) has caused a significant amount of ire when it issues any sort of warning whether via the warnings module or via logging. Further, the warnings module is for things that people should take action on, for example, not using a version of Python that has been compiled against a recent version of OpenSSL.

In most cases, this behaviour isn't as problematic as, for example, being unable to verify a ceritificate for a TLS connection. That obviously doesn't help you or anyone else who has expressed their genuine and valid frustration on this issue. With that in mind, I wonder if it wouldn't be better to attempt to log this at the DEBUG level. If someone is using logging (generally a decent practice) and enables that level it show up for them. Futher, given how little Requests itself logs, this will be fairly prominent as a debug log. Does that seem like a fair trade-off?

tlantz commented 6 years ago

Yeah, that seems like a totally fair tradeoff. Reasoning around warnings makes sense to me. I think by the time you've been puzzled for 30 minutes or so you're usually adding logging around your own stuff anyway at DEBUG, so I think a DEBUG message would hit 95% of the cases where people are stuck trying to figure out what's not working.

zhiyuli commented 6 years ago

I use a session to hold the Authorization header, but it is not being sent in a redirect requests (2.18.4)

ndmeiri commented 6 years ago

A coworker and I spent at least a couple hours debugging an issue directly related to this behavior. My use case is redirecting an API call from api.my-example-site.org to www.api.my-example-site.org. The headers were being stripped on the redirect.

If this is intended behavior (or if it won't be changed in the near future), can we please at least add it to the documentation? I read and re-read the docs trying to figure out what I was doing incorrectly, and I even read through all the code in the Request class. If I had seen a warning about this behavior in the documentation, I would have fixed my issue in a couple minutes (which is the time it took after I found this thread). Perhaps we were reading in the wrong part of the documentation, however.

nateprewitt commented 6 years ago

Hi @ndmeiri, we do have a call out on this in the quick-start guide for Requests under the Custom Headers heading. If you feel there's a better place to put this, we're happy to review any suggestions you have. I would prefer we move that to a separate issue or PR though since it's not directly related to this ticket. Thanks!

ndmeiri commented 6 years ago

Hi @nateprewitt, thanks for pointing out the Custom Headers section! Evidently, I hadn't thought to check that part of the documentation.

I think it would be helpful to also include the call out, or a reference to the call out, in the section on Authentication. Although I'm currently fairly busy, I'll consider opening a PR when things calm down to update the docs.

sigmavirus24 commented 6 years ago

If this is intended behavior

@ndmeiri Yes, it is intended behaviour to not leak your sensitive authentication credentials to potentially untrusted sources. (Just to be clear)

BrambleRamble commented 4 years ago

It appears that the "trusted domains" from #4983 are no longer in the implementation of sessions.py.

In the situation where I'm making requests to a URL that I know redirects to a particular different-but-safe URL, and I want to enable redirection with persistence of the Authorization header, how would I achieve that please?

j08lue commented 4 years ago

how would I achieve that please?

You can patch the rebuild_auth method. This works for me: https://github.com/DHI-GRAS/earthdata-download/blob/master/earthdata_download/download.py#L27-L49

BrambleRamble commented 4 years ago

@j08lue Thanks! Before your comment came through, I worked around the issue by setting allow_redirects to False, and adding code to explicitly follow the few, specific redirects that are expected in my use case. This is a short-term situation for me, so I'm hopefully that this is an adequate temporary solution, but it's great to know that there's a better way to do it in the longer term if needed.

digglife commented 3 years ago

I guess it's reasonable to add an option to disable this?

sigmavirus24 commented 3 years ago

No. It's not really desirable or reasonable to give users an option to send their credentials to untrusted sources. As commented above, there are ways to do the wrong thing, but we don't want to make it easy