seanmonstar / reqwest

An easy and powerful Rust HTTP Client
https://docs.rs/reqwest
Apache License 2.0
9.78k stars 1.1k forks source link

native-tis-alpn feature overrides request's host header #2263

Open TSDevHub opened 5 months ago

TSDevHub commented 5 months ago

Hello,

I have a reverse proxy service that is getting request and passing it to Reqwest to proxy to a different backend service. I want to preserve the incoming host header and have Reqwest client pass it to backend. Ex. request-> "my_service.com" -> "my_backend.com"

Both native-tls and rustls preserves the host header value, but when native-tls-alpn is enabled, host header is overridden to with reqwest's url's domain

Testing

#[tokio::main]
async fn main() {
    let client = reqwest::ClientBuilder::new()
        .build()
        .unwrap();

    let body = client
        .get("https://postman-echo.com/get")
        .header("host", "my_backend_service.com")
        .send()
        .await
        .unwrap()
        .text()
        .await
        .unwrap();

    println!("{}", body);
}

Using native-tis and rustls

Postman Output

  "headers": {
    "x-forwarded-proto": "https",
    "x-forwarded-port": "443",
    "host": "my_backend_service.com",
    "x-amzn-trace-id": "some_trace",
    "accept": "*/*"
  },

Can see that its using the host header I've provided

Using native-tls-alpn

Postman Output

  "headers": {
    "x-forwarded-proto": "https",
    "x-forwarded-port": "443",
    "host": "postman-echo.com",
    "x-amzn-trace-id": "some_trace",
    "accept": "*/*"
  },

Can see that host header has been overridden

Why don't you just not use native-tls-alpn?

I have a hard dependency that use Reqwest with this feature, so it gets added to my Reqwest client via feature unification

Question

Was this intended behavior for native-tls-alpn or is this some kind of bug?

Feature Request

I tried use_rustls_tls() or use_native_tls()(ref) in ClientBuilder while I have native-tls-alpn feature enabled in hopes to isolate my client, but lead to same host override behavior. I feel like these functions should isolate itself from native-tls-alpn settings. Perhaps with that isolation and newuse_native_tls_alpn() function, people can choose which specific backend to use for tls more specifically, especially when dependencies change backend tls (https://docs.rs/reqwest/latest/reqwest/tls/index.html)

seanmonstar commented 5 months ago

Hm, I searched through reqwest, and nothing adjusts the host header other than the Url. The ALPN feature also just sets that extension for the TLS handshake. I'm not sure why you're seeing that. Is Postman changing what it thinks is the host based on the TLS SNI value it receives in the handshake?

TSDevHub commented 5 months ago

Hey seanmonstar, thanks for reply I used postman here as an example and I doubt that Postman looks at SNI value and overrides host header because I also tested this using other endpoints that I control with the same results. So I am fairly certain this is client behavior.

seanmonstar commented 5 months ago

It could be something inside native-tls that gets enabled when ALPN is turned on? I've checked again, and I'm 99.9% sure that nothing else changes the actual host header. The most likely thing is something in the TLS handshake.

seanmonstar commented 5 months ago

Actually, just after writing that: it might be that with ALPN, that means it's negotiating HTTP/2. If so, that protocol does not use the host header, it uses :authority, which is based on the URL you specified.

TSDevHub commented 5 months ago

Thanks Sean for suggestions,

I followed up on your suggestion

Checking for HTTP2

It does indeed use HTTP2, however I don't understand that if HTTP2 does not need host header, why would client modify it? Shouldn't we still preserve the host header? (Ex. curl preserves host header input when using http2 curl -v -H "host: hello" --http2 https://postman-echo.com/get)

Anyways some reqwest logs which verified alpn using http2

TRACE hyper::client::pool: checkout waiting for idle connection: ("https", postman-echo.com)
2024-04-26T16:30:47.394596Z TRACE hyper::client::connect::http: Http::connect; scheme=Some("https"), host=Some("postman-echo.com"), port=None
2024-04-26T16:30:47.394751Z DEBUG hyper::client::connect::dns: resolving host="postman-echo.com"
2024-04-26T16:30:47.397583Z DEBUG hyper::client::connect::http: connecting to 52.0.118.157:443
2024-04-26T16:30:47.468225Z DEBUG hyper::client::connect::http: connected to 52.0.118.157:443
2024-04-26T16:30:47.625363Z TRACE hyper::client::client: ALPN negotiated h2, updating pool
2024-04-26T16:30:47.625553Z TRACE hyper::client::conn: client handshake Http2
2024-04-26T16:30:47.625911Z DEBUG h2::client: binding client connection
2024-04-26T16:30:47.626127Z DEBUG h2::client: client connection bound

Good news is that I tested with alpn feature on and I forced client to only use http1(ref) (hope this means use http version 1 and not http1.0 haha), and host override behavior is gone

2024-04-26T16:47:19.430144Z TRACE hyper::client::pool: checkout waiting for idle connection: ("https", postman-echo.com)
2024-04-26T16:47:19.430404Z TRACE hyper::client::connect::http: Http::connect; scheme=Some("https"), host=Some("postman-echo.com"), port=None
2024-04-26T16:47:19.430552Z DEBUG hyper::client::connect::dns: resolving host="postman-echo.com"
2024-04-26T16:47:19.445281Z DEBUG hyper::client::connect::http: connecting to 52.54.81.115:443
2024-04-26T16:47:19.515330Z DEBUG hyper::client::connect::http: connected to 52.54.81.115:443
2024-04-26T16:47:19.682169Z TRACE hyper::client::conn: client handshake Http1
2024-04-26T16:47:19.682322Z TRACE hyper::client::client: handshake complete, spawning background dispatcher task
2024-04-26T16:47:19.682681Z TRACE hyper::client::pool: checkout dropped for ("https", postman-echo.com)

Going forward