Should the last IP be extracted from X-Forwarded-For instead of the first?

dae commented 1 year ago

Currently .find_map() is used, causing the first IP to be returned, which appears can't be trusted. Would it perhaps be better to use the IP appended via the last proxy, eg when using nginx's $proxy_add_x_forwarded_for? A site with some more info: https://adam-p.ca/blog/2022/03/x-forwarded-for/.

imbolc commented 1 year ago

Fixed and published, thank you :pray:

grahamc commented 1 year ago

So, I'm not sure this was the right thing. It causes this crate to report the IP address of the most recent proxy, not the IP of the client. The syntax of this header is:

X-Forwarded-For: <client>, <proxy1>, <proxy2>

(ref: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For#syntax)

Also note that using X-Forwarded-For is inherently not very trustworthy anyway, and picking any IP address from the header is just as unsafe as any other -- depending on the deployment circumstances. I've pinned back to 0.3.0 to fix this for me, but would be happy to use a different crate if this is the intended change.

imbolc commented 1 year ago

When I looked at this issue for the first time, I thought we could make it possible to change the extractor behaviour. So people can choose where to look for the IP and in what order. Maybe we should do it after all?

What comes to my mind is to implement a trait IPExtractor for each struct XRealIP, XForwardedForLeft, XForwardedForRight, etc. And also implement FromRequestParts for tuples of IPExtractor. Then a user could just create a type alias to use whatever order they want:

type ClientIp = (XRealIP, XForwardedForRight);

async fn handler(ip: ClientIp) { ... }

What do you think?

dae commented 1 year ago

Being able to extract individual parsed headers instead of having one extractor that runs through multiple headers would be an improvement IMHO. Not as sure about right/left - perhaps there are users out there with multi-hop proxy setups that want to get at one in the middle? What if you just exposed all collected IPs in a vec and let the user narrow them down as required?

imbolc commented 1 year ago

extract individual parsed headers instead

Do you mean also implementing FromRequestParts for XRealIP etc? Yes we can do it of course. But still most projects are probably don't want to rely on a particular proxy setup. So they would prefer a sequence of extractors.

What if you just exposed all collected IPs in a vec

Do you mean specifically for x-forwarded-for? We can export a helper function for this of course. But we still need XForwardedFor extractor to return Option<IpAddr> and so be chainable with other extractors.

grahamc commented 1 year ago

This is a bit of a complicated topic, and I think probably there is no uniformly correct way for this crate to do it without ahead-of-time configuration of, say, a list of trusted proxy IPs. The MDN page has a lot of good info here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For#selecting_an_ip_address

dae commented 1 year ago

probably don't want to rely on a particular proxy setup. So they would prefer a sequence of extractors.

I only have experience with a controlled environment where I know I only want to be inspecting one header, so I'm not sure I'm qualified to comment on the sequence approach. But for what it's worth, as this is an area that is inherently complex, I feel like the "Rusty" approach might be to expose that complexity to the user and force them to think about it.

One other thing I just noticed is that x-forwarded-for is being extracted with headermap.get(), which appears to return the first instance of the header. When only the last hop is trusted, it's the rightmost IP from the last header that is desirable instead: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For#parsing

imbolc commented 1 year ago

I tried to implement that tuple-based extractor, but I stumbled upon the orphan rule and couldn't find a way keep the api as simple as it is right now.

So I ended up with a bening mutex-based global variable to keep the sequence of ip extractors. Let me know if you have any implementation suggestions: https://github.com/imbolc/axum-client-ip/compare/main...global-extractor-sequence

Also I guess we should reorder the default sequence to look for the X-Real-Ip first as the most trustworthy one. Initially I made the sequence based on probable usage frequency.

And also I think it's better to use official X-Forwarded-For interpretation targeting the first IP by default as @grahamc suggests. I'm going add a maybe_x_forward_for_last extractor, so anyone preferring this option could reset that global variable with a new sequence.

imbolc commented 1 year ago

We can think about universality of code (ability to be used behind different proxies or without them) from different perspective.

What if we require initialization of a module explicitly passing it a knowledge what header to use. I mean in any particular configuration users would know which (if any) proxy they're behind. This way no code change (as in tuple-based implementation I initially suggested) is needed to move to another proxy configuration:

mod axum_client_ip {
    #[derive(serde::Deserialize)]
    pub enum ClientIpHeader {
        NoProxy,
        XForwardedFor,
        XRealIP,
    }

    pub fn init(header: ClientIpHeader) {
        // save it to once_cell or something
    }
}

#[derive(serde::Deserialize)]
struct Config {
    client_ip_header: axum_client_ip::ClientIpHeader,
}

fn main() {
    dotenv::dotenv().ok();
    let config: Config = envy::from_env().unwrap();
    axum_client_ip::init(config.client_ip_header);
}

imbolc commented 1 year ago

Also we can address both needs for different kind of IPs mentioned in the article from the first message (secure one used for things like ratelimiter and unsecure for geolocation) renaming the current ClientIp to UnsecureClientIp and using for SecureClientIp strategy I described in the previous message.

dae commented 1 year ago

As a library consumer I'd be a bit surprised to see it using global state, and that would prevent different endpoints from using different IP configurations (eg certain endpoints may arrive via a different reverse proxy with a different number of hops). Personally I'd feel more comfortable if this library just provided the tools to extract the info from the headers in a type-safe way, and then left it up to the user to decide how to deal with the extracted info. But maybe I'm in the minority here. :-)

imbolc commented 1 year ago

As a library consumer I'd be a bit surprised to see it using global state

Agreed, we can keep the setting in an axum extension

and that would prevent different endpoints from using different IP configuration

Yeah, but I'm still more for a less complicated solution which fits needs of 99% of users.

Personally I'd feel more comfortable if this library just provided the tools to extract the info from the headers in a type-safe way

I'm ok with adding such tools. What api would you prefer? Extractors? Something like struct MaybeLastForwardedFor(Option<IpAddr>) and struct LastForwardedFor(IpAddr)?

and then left it up to the user to decide how to deal with the extracted info

It would make an app depended on a particular proxy setup. So to change e.g. cloud provider one should change the app code. I doesn't seem right. And also I believe most people aren't aware of this complexity, so broken crates like this exist :) Isn't it better to give them maybe not perfect, but still relatively safe option?

dae commented 1 year ago

What api would you prefer? Extractors? Something like struct MaybeLastForwardedFor(Option) and struct LastForwardedFor(IpAddr)?

If you're interested in providing individual extractors for users who don't want to rely on the fallback logic, I still think the cleanest/most flexible API would be ForwardedFor(Vec), where the returned IPs were extracted from all of the X-Forwarded-For headers. But I'm just one person, so maybe you should wait for other people to chime in before providing that.

In the mean time, I solved my immediate problem by adjusting the reverse proxy to overwrite x-forwarded-for, so I don't need to worry about which end of the list axum-client-ip is picking.

It would make an app depended on a particular proxy setup. So to change e.g. cloud provider one should change the app code. I doesn't seem right.

If the user of the library hard-coded a particular offset in the ip list, then yes, they would need to recompile if the environment changed. If that was a concern, they could define the offset and possible other fallback headers in a config file, like in your example above.

If you plan to keep the fallback system, I do think it may be a good idea to require users to specify the fallback order they want and not provide a default, unless you are committed to always using the same order for the life of the library. If down the line you decided to switch the order headers are used in, I suspect that's going to bite some people when they update without realising the change.

imbolc commented 1 year ago

If you plan to keep the fallback system

It's not a fallback system. In the article you've provided there's a distinction between two kinds of IPs an app could use for different purposes:

A spoof-protected IP - is essentially an IP of the connection to the last proxy. Used for eg rate limiting.
An usecure, but more probably a real client IP, leftmost from x-forwarded-for or forwarded-for headers. Used for eg geolocation.

So I'm thinking to provide access to both kind of IPs. And the "fallback system" is related to the second kind only. By its nature it doesn't provide guarantees but rather a best guess. So I don't see why not to use a hardcoded list of sources which could be changed between releases?

dae commented 1 year ago

The code tries X-forwarded-for, and "falls back" to other headers if it's missing - is that not a fallback system? :-)

If a library user is rate limiting based on IP address, which headers they can trust, and which parts of the header they can trust, depends on their configured environment. If you change which header is picked first in a future release, a client-provided header that is not set by the reverse proxy may end up being used instead of the trusted header, and suddenly users are able to spoof their IP to circumvent rate limits.

dae commented 1 year ago

(if you're saying you intend to provide an API for the former use case that only tries a single header, then that avoids the problem)

imbolc commented 1 year ago

Also we can address both needs for different kind of IPs mentioned in the article from the first message (secure one used for things like ratelimiter and unsecure for geolocation) renaming the current ClientIp to UnsecureClientIp and using for SecureClientIp strategy I described in the previous message.

Right, for SecureClientIp (which is meant to be use in things like rate limiting) it's required to provide a single source header. And for UnsecureClientIp it feels ok to use a hardcoded list of sources. Because I don't really see on what criteria anyone would want to change it, as we can't control external proxies traffic passes.

imbolc commented 1 year ago

@dae I've added that rudimental extractors, would you have any suggestions before I merge it? https://github.com/imbolc/axum-client-ip/blob/rudimental-extractors/src/rudimental.rs

dae commented 1 year ago

It looks like that code addresses the concerns I raised, thanks. One minor issue I noticed: the rightmost variants have comments referring to 'leftmost'.

imbolc commented 1 year ago

So the new api includes InsecureClientIp (@grahamc it uses the old logic an probably fits your needs), SecureClientIp and header specific extractors. The code is merged into the main branch. Let me know if there's anything you'd like to change before the release.

dae commented 1 year ago

Minor style note in the README: "except for" would be clearer written as "apart from".

My use case is covered by the extractors in rudimental.rs, and they LGTM. Thank you for the effort you've put into this. :-)

imbolc commented 1 year ago

It's published. Thank you, @dae :pray:

imbolc / axum-client-ip

Should the last IP be extracted from X-Forwarded-For instead of the first? #6