cloudflare / pingora

A library for building fast, reliable and evolvable network services.
Apache License 2.0

Header Availability with HTTP and HTTPS Request #125

Open tstraus13 opened 3 months ago

tstraus13 commented 3 months ago

Describe the bug

I am not sure whether this is a gap in my understanding of how HTTP/HTTPS works, but it seems I have to retrieve the HOST header differently depending on whether the request arrives over HTTP or HTTPS. I would expect the same approach to work for both.

Pingora info

Please include the following information about your environment:

Pingora version: 0.1.0
Rust version: cargo 1.76.0 (c84b36747 2024-01-18)
Operating system version: Arch Linux - Kernel 6.7.9

Steps to reproduce

During an HTTPS request I can retrieve the HOST header with the following code:

_session.req_header().uri.host().unwrap();

During an HTTP request I can retrieve the HOST header with the following code:

_session.get_header("Host").unwrap().to_str().unwrap().split(":").collect::<Vec<_>>()[0].to_string();

The extra split is there to get just the HOST without the port.
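Splitting on ":" and taking the first piece works for plain hostnames but breaks on bracketed IPv6 literals such as "[::1]:8080". A minimal sketch of a more careful version follows, using only the standard library. The function name host_without_port is hypothetical and not part of Pingora; it only operates on the header value once it has been converted to a string.

```rust
/// Returns the host part of a `Host` header value, dropping any `:port` suffix.
/// Hypothetical helper for illustration; not a Pingora API.
fn host_without_port(value: &str) -> &str {
    let value = value.trim();

    // Bracketed IPv6 literal, e.g. "[::1]:8080": keep everything up to and
    // including the closing bracket.
    if value.starts_with('[') {
        return match value.find(']') {
            Some(end) => &value[..=end],
            None => value, // malformed input; return it unchanged
        };
    }

    // Otherwise drop a trailing ":port" only when the part after the last
    // colon is non-empty and consists of ASCII digits.
    match value.rsplit_once(':') {
        Some((host, port)) if !port.is_empty() && port.bytes().all(|b| b.is_ascii_digit()) => host,
        _ => value,
    }
}

fn main() {
    for v in ["example.com:8080", "example.com", "[::1]:443"] {
        println!("{} -> {}", v, host_without_port(v));
    }
}
```

Returning a borrowed &str avoids the intermediate Vec and the extra String allocation; call .to_string() on the result only if an owned value is needed.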

If you use the HTTPS approach on an HTTP request, the unwrap fails, and the same happens when the HTTP approach is used on an HTTPS request.

Both scenarios occur during the request_filter phase of the request.

Expected results

I would expect either method to return the HOST header in either situation. Though maybe I misunderstand. I would prefer the HTTPS method of retrieving the HOST header as it seems simpler to me.

Observed results

Using the approach that does not match the protocol produces an error when trying to unwrap or access the requested header.

Additional context

Now I am specifically looking at the HOST header but I think any other header would act the same way. This behavior seems odd to me but maybe that is just how HTTP vs HTTPS works. Thanks.

LessThanGreaterThan commented 3 months ago

Hey, this has to do with HTTP/2. When HTTPS is used, the connection is typically negotiated as HTTP/2, which carries the host in the :authority pseudo-header instead of a Host header. You can handle both cases like this:

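use pingora_proxy::Session;

// Returns the request host, or an empty string if none is found.
fn get_host(session: &mut Session) -> String {
    // HTTP/1.x: the host arrives in the Host header (it may include a port).
    if let Some(host) = session.get_header(http::header::HOST) {
        if let Ok(host_str) = host.to_str() {
            return host_str.to_string();
        }
    }

    // HTTP/2: the host comes from :authority, which shows up in the request URI.
    if let Some(host) = session.req_header().uri.host() {
        return host.to_string();
    }

    "".to_string()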
}
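
The fallback order in the snippet above can be tried out without Pingora. The following is a minimal sketch under stated assumptions: RequestView and resolve_host are hypothetical names that stand in for the two places the host can appear (the Host header for HTTP/1.x, the URI authority for HTTP/2), and the function returns an Option instead of an empty string so the caller decides what a missing host means.

```rust
/// Hypothetical stand-in for the two request fields relevant here.
/// Not a Pingora type.
struct RequestView<'a> {
    /// Value of the Host header, if the client sent one (usual for HTTP/1.x).
    host_header: Option<&'a str>,
    /// Host taken from the request URI, if present (usual for HTTP/2).
    uri_host: Option<&'a str>,
}

/// Prefers the Host header and falls back to the URI host.
fn resolve_host<'a>(req: &RequestView<'a>) -> Option<&'a str> {
    req.host_header.or(req.uri_host)
}

fn main() {
    // Shape of a typical HTTP/1.1 request: Host header set, no host in the URI.
    let http1 = RequestView { host_header: Some("example.com:8080"), uri_host: None };
    // Shape of a typical HTTP/2 request: no Host header, host in the URI.
    let http2 = RequestView { host_header: None, uri_host: Some("example.com") };

    println!("{:?}", resolve_host(&http1));
    println!("{:?}", resolve_host(&http2));
}
```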

tstraus13 commented 3 months ago

I see. Yes, I was thinking something was different between the requests. Thanks for the code snippet, I will use it. If this is expected behavior, then this issue can be closed. It would be nice to keep header access consistent between requests, but I understand if that is not feasible or does not make sense.

tstraus13 commented 3 months ago

Also, it would be great to get all details of the downstream client, like the IP address. I was unable to find the client IP address in the request filter stage. This would be important if I wanted to add it to the upstream request headers. Another thing I noticed is that when I ran a server only on port 443, I had no way of knowing from code whether the request actually came in over port 443 or whether it was HTTP or HTTPS. All of these details would be nice to have during the request filter stage. Thanks. I have been enjoying messing around with pingora and trying to get it up and running in my homelab; it's been a fun learning experience.

drcaramelsyrup commented 2 months ago

Note that for the IP/port, we've added a client_addr method available on pingora_proxy::Session (#105).
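
For the use case mentioned earlier in the thread (passing the client IP to the upstream), here is a minimal sketch of building an X-Forwarded-For value once a client address is available. It assumes the address can be obtained as a std::net::SocketAddr; the exact return type of client_addr is not stated in this thread, so the conversion is left out. The function name forwarded_for is hypothetical and not a Pingora API.

```rust
use std::net::SocketAddr;

/// Builds an X-Forwarded-For value by appending the client IP to any value
/// the request already carried. Hypothetical helper; not a Pingora API.
/// Returns None when there is neither an existing value nor a client address.
fn forwarded_for(existing: Option<&str>, client: Option<SocketAddr>) -> Option<String> {
    // Treat an empty or whitespace-only header value as absent.
    let existing = existing.map(str::trim).filter(|s| !s.is_empty());
    // Only the IP is forwarded; the client's source port is dropped.
    let ip = client.map(|addr| addr.ip());

    match (existing, ip) {
        (Some(prev), Some(ip)) => Some(format!("{}, {}", prev, ip)),
        (None, Some(ip)) => Some(ip.to_string()),
        (Some(prev), None) => Some(prev.to_string()),
        (None, None) => None,
    }
}

fn main() {
    let client: SocketAddr = "203.0.113.7:51234".parse().expect("valid socket address");
    println!("{:?}", forwarded_for(None, Some(client)));
    println!("{:?}", forwarded_for(Some("198.51.100.1"), Some(client)));
}
```

The resulting string would then be set as a header on the upstream request in whichever filter modifies it.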