r-lib / httr2

Make HTTP requests and process their responses. A modern reimagining of httr.
https://httr2.r-lib.org

Allow users to persist authentication headers during redirects #475

Closed: botan closed this issue 3 months ago

botan commented 3 months ago

Thank you for the fantastic package!

Requests that carry an authentication header fail when the server redirects them, because req_perform() drops the header on the redirected request. I'm not sure whether this is a bug or a security feature, but it is inconvenient when users trust the redirect targets. It would be great if users could explicitly opt in with something like req_perform(preserve_auth = TRUE) to keep the authentication header across redirects, with the default staying FALSE if that is needed for security reasons.

library(httr2)

request("https://httpbin.org/bearer") |> 
  req_auth_bearer_token("TOKEN") |> 
  req_perform()
#> <httr2_response>
#> GET https://httpbin.org/bearer
#> Status: 200 OK
#> Content-Type: application/json
#> Body: In memory (49 bytes)

request("http://httpbin.org/redirect-to?url=https://httpbin.org/bearer") |> 
  req_auth_bearer_token("TOKEN") |> 
  req_perform(verbosity = 1)
#> -> GET /redirect-to?url=https://httpbin.org/bearer HTTP/1.1
#> -> Host: httpbin.org
#> -> User-Agent: httr2/1.0.1 r-curl/5.2.1 libcurl/8.8.0
#> -> Accept: */*
#> -> Accept-Encoding: deflate, gzip, br, zstd
#> -> Authorization: <REDACTED>
#> -> 
#> <- HTTP/1.1 302 FOUND
#> <- Date: Sun, 02 Jun 2024 21:39:42 GMT
#> <- Content-Type: text/html; charset=utf-8
#> <- Content-Length: 0
#> <- Connection: keep-alive
#> <- Server: gunicorn/19.9.0
#> <- Location: https://httpbin.org/bearer
#> <- Access-Control-Allow-Origin: *
#> <- Access-Control-Allow-Credentials: true
#> <- 
#> -> GET /bearer HTTP/2
#> -> Host: httpbin.org
#> -> User-Agent: httr2/1.0.1 r-curl/5.2.1 libcurl/8.8.0
#> -> Accept: */*
#> -> Accept-Encoding: deflate, gzip, br, zstd
#> -> 
#> <- HTTP/2 401 
#> <- date: Sun, 02 Jun 2024 21:39:43 GMT
#> <- content-type: text/html; charset=utf-8
#> <- content-length: 0
#> <- server: gunicorn/19.9.0
#> <- www-authenticate: Bearer
#> <- access-control-allow-origin: *
#> <- access-control-allow-credentials: true
#> <-
#> Error in `req_perform()`:
#> ! HTTP 401 Unauthorized.
#> • OAuth error
#> • :
hadley commented 3 months ago

Seems somewhat related to https://github.com/r-lib/httr/issues/626, but in your example the hostname is the same. ... Oh but the protocol is different.

Anyway, I think setting the unrestricted_auth option will solve your problem:

library(httr2)

request("http://httpbin.org/redirect-to?url=https://httpbin.org/bearer") |> 
  req_auth_bearer_token("TOKEN") |> 
  req_options(unrestricted_auth = 1) |> 
  req_perform(verbosity = 1)
botan commented 3 months ago

It sorted out my problem. Thank you very much!

Would you consider setting this behaviour as the default in the future? It's the default for most HTTP clients.

hadley commented 3 months ago

I believe it is not the default because it is a security risk. I'd need a strong reason to justify overriding this:

By default, libcurl only sends credentials and Authentication headers to the initial hostname as given in the original URL, to avoid leaking username + password to other sites.
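
For anyone who wants the header to survive a redirect without sending it to arbitrary hosts, one possible middle ground is to follow the redirect by hand and check the target first. The following is only a sketch under assumptions: is_trusted_host() and perform_with_trusted_redirect() are hypothetical helpers, not part of httr2, the allow-list is supplied by the caller, and only a single GET redirect hop is handled.

```r
library(httr2)

# Hypothetical helper (not part of httr2): TRUE only when `url` is an absolute
# URL whose hostname is on the caller's allow-list. Anything that cannot be
# parsed, or has no hostname (e.g. a relative Location), counts as untrusted.
is_trusted_host <- function(url, trusted_hosts) {
  host <- tryCatch(url_parse(url)$hostname, error = function(e) NULL)
  isTRUE(host %in% trusted_hosts)
}

# Hypothetical helper: perform a GET with a bearer token, stop libcurl from
# following redirects automatically, and re-send the token to the redirect
# target only if that target's host is trusted. Handles one hop only.
perform_with_trusted_redirect <- function(url, token, trusted_hosts) {
  resp <- request(url) |>
    req_auth_bearer_token(token) |>
    req_options(followlocation = FALSE) |>
    req_perform()

  if (resp_status(resp) %in% c(301, 302, 303, 307, 308)) {
    target <- resp_header(resp, "Location")
    if (!is_trusted_host(target, trusted_hosts)) {
      stop("Refusing to forward credentials to an untrusted redirect target.",
           call. = FALSE)
    }
    # Second request: the token is attached explicitly for the trusted target.
    resp <- request(target) |>
      req_auth_bearer_token(token) |>
      req_perform()
  }
  resp
}
```

With the example from this thread, the call would be perform_with_trusted_redirect("http://httpbin.org/redirect-to?url=https://httpbin.org/bearer", "TOKEN", "httpbin.org"). Unlike unrestricted_auth = 1, this refuses to pass the token on when the Location header points anywhere outside the allow-list.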

botan commented 3 months ago

I see your point. I was referring to other popular libraries. For instance, in Python:

>>> import requests
>>> requests.get(
...     "http://httpbin.org/redirect-to?url=https://httpbin.org/bearer",
...     headers={"Authorization": "Bearer TOKEN"},
...     ).json()
{'authenticated': True, 'token': 'TOKEN'}
>>> import httpx
>>> httpx.get(
...     "http://httpbin.org/redirect-to?url=https://httpbin.org/bearer",
...     headers={"Authorization": "Bearer TOKEN"},
...     follow_redirects=True,
...     ).json()
{'authenticated': True, 'token': 'TOKEN'}

But I appreciate the security concerns. Thanks again!