Closed dae closed 1 year ago
Fixed and published, thank you :pray:
So, I'm not sure this was the right thing. It causes this crate to report the IP address of the most recent proxy, not the IP of the client. The syntax of this header is:
X-Forwarded-For: <client>, <proxy1>, <proxy2>
(ref: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For#syntax)
Also note that using X-Forwarded-For is inherently not very trustworthy anyway, and picking any IP address from the header is just as unsafe as any other -- depending on the deployment circumstances. I've pinned back to 0.3.0 to fix this for me, but would be happy to use a different crate if this is the intended change.
When I looked at this issue for the first time, I thought we could make it possible to change the extractor behaviour. So people can choose where to look for the IP and in what order. Maybe we should do it after all?
What comes to my mind is to implement a trait IPExtractor
for each struct XRealIP
, XForwardedForLeft
, XForwardedForRight
, etc. And also implement FromRequestParts
for tuples of IPExtractor
. Then a user could just create a type alias to use whatever order they want:
type ClientIp = (XRealIP, XForwardedForRight);
async fn handler(ip: ClientIp) { ... }
What do you think?
Being able to extract individual parsed headers instead of having one extractor that runs through multiple headers would be an improvement IMHO. Not as sure about right/left - perhaps there are users out there with multi-hop proxy setups that want to get at one in the middle? What if you just exposed all collected IPs in a vec and let the user narrow them down as required?
extract individual parsed headers instead
Do you mean also implementing FromRequestParts
for XRealIP
etc? Yes we can do it of course. But still most projects are probably don't want to rely on a particular proxy setup. So they would prefer a sequence of extractors.
What if you just exposed all collected IPs in a vec
Do you mean specifically for x-forwarded-for
? We can export a helper function for this of course. But we still need XForwardedFor
extractor to return Option<IpAddr>
and so be chainable with other extractors.
This is a bit of a complicated topic, and I think probably there is no uniformly correct way for this crate to do it without ahead-of-time configuration of, say, a list of trusted proxy IPs. The MDN page has a lot of good info here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For#selecting_an_ip_address
probably don't want to rely on a particular proxy setup. So they would prefer a sequence of extractors.
I only have experience with a controlled environment where I know I only want to be inspecting one header, so I'm not sure I'm qualified to comment on the sequence approach. But for what it's worth, as this is an area that is inherently complex, I feel like the "Rusty" approach might be to expose that complexity to the user and force them to think about it.
One other thing I just noticed is that x-forwarded-for is being extracted with headermap.get(), which appears to return the first instance of the header. When only the last hop is trusted, it's the rightmost IP from the last header that is desirable instead: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-Forwarded-For#parsing
I tried to implement that tuple-based extractor, but I stumbled upon the orphan rule and couldn't find a way keep the api as simple as it is right now.
So I ended up with a bening mutex-based global variable to keep the sequence of ip extractors. Let me know if you have any implementation suggestions: https://github.com/imbolc/axum-client-ip/compare/main...global-extractor-sequence
Also I guess we should reorder the default sequence to look for the X-Real-Ip
first as the most trustworthy one. Initially I made the sequence based on probable usage frequency.
And also I think it's better to use official X-Forwarded-For
interpretation targeting the first IP by default as @grahamc suggests. I'm going add a maybe_x_forward_for_last
extractor, so anyone preferring this option could reset that global variable with a new sequence.
We can think about universality of code (ability to be used behind different proxies or without them) from different perspective.
What if we require initialization of a module explicitly passing it a knowledge what header to use. I mean in any particular configuration users would know which (if any) proxy they're behind. This way no code change (as in tuple-based implementation I initially suggested) is needed to move to another proxy configuration:
mod axum_client_ip {
#[derive(serde::Deserialize)]
pub enum ClientIpHeader {
NoProxy,
XForwardedFor,
XRealIP,
}
pub fn init(header: ClientIpHeader) {
// save it to once_cell or something
}
}
#[derive(serde::Deserialize)]
struct Config {
client_ip_header: axum_client_ip::ClientIpHeader,
}
fn main() {
dotenv::dotenv().ok();
let config: Config = envy::from_env().unwrap();
axum_client_ip::init(config.client_ip_header);
}
Also we can address both needs for different kind of IPs mentioned in the article from the first message (secure one used for things like ratelimiter and unsecure for geolocation) renaming the current ClientIp
to UnsecureClientIp
and using for SecureClientIp
strategy I described in the previous message.
As a library consumer I'd be a bit surprised to see it using global state, and that would prevent different endpoints from using different IP configurations (eg certain endpoints may arrive via a different reverse proxy with a different number of hops). Personally I'd feel more comfortable if this library just provided the tools to extract the info from the headers in a type-safe way, and then left it up to the user to decide how to deal with the extracted info. But maybe I'm in the minority here. :-)
As a library consumer I'd be a bit surprised to see it using global state
Agreed, we can keep the setting in an axum extension
and that would prevent different endpoints from using different IP configuration
Yeah, but I'm still more for a less complicated solution which fits needs of 99% of users.
Personally I'd feel more comfortable if this library just provided the tools to extract the info from the headers in a type-safe way
I'm ok with adding such tools. What api would you prefer? Extractors? Something like struct MaybeLastForwardedFor(Option<IpAddr>)
and struct LastForwardedFor(IpAddr)
?
and then left it up to the user to decide how to deal with the extracted info
It would make an app depended on a particular proxy setup. So to change e.g. cloud provider one should change the app code. I doesn't seem right. And also I believe most people aren't aware of this complexity, so broken crates like this exist :) Isn't it better to give them maybe not perfect, but still relatively safe option?
What api would you prefer? Extractors? Something like struct MaybeLastForwardedFor(Option
) and struct LastForwardedFor(IpAddr)?
If you're interested in providing individual extractors for users who don't want to rely on the fallback logic, I still think the cleanest/most flexible API would be ForwardedFor(Vec
In the mean time, I solved my immediate problem by adjusting the reverse proxy to overwrite x-forwarded-for, so I don't need to worry about which end of the list axum-client-ip is picking.
It would make an app depended on a particular proxy setup. So to change e.g. cloud provider one should change the app code. I doesn't seem right.
If the user of the library hard-coded a particular offset in the ip list, then yes, they would need to recompile if the environment changed. If that was a concern, they could define the offset and possible other fallback headers in a config file, like in your example above.
If you plan to keep the fallback system, I do think it may be a good idea to require users to specify the fallback order they want and not provide a default, unless you are committed to always using the same order for the life of the library. If down the line you decided to switch the order headers are used in, I suspect that's going to bite some people when they update without realising the change.
If you plan to keep the fallback system
It's not a fallback system. In the article you've provided there's a distinction between two kinds of IPs an app could use for different purposes:
x-forwarded-for
or forwarded-for
headers. Used for eg geolocation.So I'm thinking to provide access to both kind of IPs. And the "fallback system" is related to the second kind only. By its nature it doesn't provide guarantees but rather a best guess. So I don't see why not to use a hardcoded list of sources which could be changed between releases?
The code tries X-forwarded-for, and "falls back" to other headers if it's missing - is that not a fallback system? :-)
If a library user is rate limiting based on IP address, which headers they can trust, and which parts of the header they can trust, depends on their configured environment. If you change which header is picked first in a future release, a client-provided header that is not set by the reverse proxy may end up being used instead of the trusted header, and suddenly users are able to spoof their IP to circumvent rate limits.
(if you're saying you intend to provide an API for the former use case that only tries a single header, then that avoids the problem)
Also we can address both needs for different kind of IPs mentioned in the article from the first message (secure one used for things like ratelimiter and unsecure for geolocation) renaming the current
ClientIp
toUnsecureClientIp
and using forSecureClientIp
strategy I described in the previous message.
Right, for SecureClientIp
(which is meant to be use in things like rate limiting) it's required to provide a single source header. And for UnsecureClientIp
it feels ok to use a hardcoded list of sources. Because I don't really see on what criteria anyone would want to change it, as we can't control external proxies traffic passes.
@dae I've added that rudimental extractors, would you have any suggestions before I merge it? https://github.com/imbolc/axum-client-ip/blob/rudimental-extractors/src/rudimental.rs
It looks like that code addresses the concerns I raised, thanks. One minor issue I noticed: the rightmost variants have comments referring to 'leftmost'.
So the new api includes InsecureClientIp
(@grahamc it uses the old logic an probably fits your needs), SecureClientIp
and header specific extractors. The code is merged into the main branch. Let me know if there's anything you'd like to change before the release.
Minor style note in the README: "except for" would be clearer written as "apart from".
My use case is covered by the extractors in rudimental.rs, and they LGTM. Thank you for the effort you've put into this. :-)
It's published. Thank you, @dae :pray:
Currently .find_map() is used, causing the first IP to be returned, which appears can't be trusted. Would it perhaps be better to use the IP appended via the last proxy, eg when using nginx's $proxy_add_x_forwarded_for? A site with some more info: https://adam-p.ca/blog/2022/03/x-forwarded-for/.