tokio-rs / axum

Ergonomic and modular web framework built with Tokio, Tower, and Hyper

Spoofable extractors are used with the knowledge of the risks #2998

Open yanns opened 1 month ago

yanns commented 1 month ago

This ticket follows the discussion in https://github.com/tokio-rs/axum/pull/2507#issuecomment-2423925900

Some extractors, like Host or Scheme, can use the values of some HTTP headers that could be spoofed by malicious users.

We should find a way to make users aware of the risks of using those extractors.

Some ideas:

bengsparks commented 1 month ago

Perhaps something along the lines of:

/// Wrap spoofable extractor
pub struct Spoofable<E>(pub E);

/// Allow `Spoofable` to be used with spoofable extractors in handlers
impl<S, E> FromRequestParts<S> for Spoofable<E>
where
    E: FromSpoofableRequestParts<S>,
{
    // ...
}

/// axum private trait
trait FromSpoofableRequestParts<S>: Sized {
    type Rejection: IntoResponse;

    // Plain `fn` returning `impl Future`: combining `async fn` with
    // `-> impl Future` would produce a future that yields another future.
    fn from_request_parts(
        parts: &mut Parts,
        state: &S,
    ) -> impl Future<Output = Result<Self, Self::Rejection>> + Send;
}

/// Mark `Host` as a spoofable extractor
impl<S> FromSpoofableRequestParts<S> for Host { ... }

/// Use spoofable extractor
async fn handler(Spoofable(Host(host)): Spoofable<Host>) -> String {
    println!("{host}");
    host
}
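
To make the shape of this proposal easier to try out, here is a minimal, self-contained sketch that compiles without axum. Everything in it is an assumption for illustration: `Parts`, `FromParts`, and `FromSpoofableParts` are simplified, synchronous stand-ins for `http::request::Parts`, `FromRequestParts`, and the proposed private trait, and the header lookup order is hypothetical rather than axum's actual behavior.

```rust
use std::collections::HashMap;

/// Simplified stand-in for `http::request::Parts`
/// (assumption: header names are stored in lowercase).
pub struct Parts {
    pub headers: HashMap<String, String>,
}

/// Simplified, synchronous stand-in for axum's `FromRequestParts`.
pub trait FromParts: Sized {
    fn from_parts(parts: &Parts) -> Result<Self, String>;
}

/// Stand-in for the proposed trait. It is `pub` here only to keep the
/// sketch simple; the proposal keeps it private to axum.
pub trait FromSpoofableParts: Sized {
    fn from_spoofable_parts(parts: &Parts) -> Result<Self, String>;
}

/// Wrapper a handler must name explicitly to obtain a spoofable value.
pub struct Spoofable<E>(pub E);

impl<E: FromSpoofableParts> FromParts for Spoofable<E> {
    fn from_parts(parts: &Parts) -> Result<Self, String> {
        E::from_spoofable_parts(parts).map(Spoofable)
    }
}

#[derive(Debug, PartialEq)]
pub struct Host(pub String);

/// `Host` does not implement `FromParts` itself, so the only way to
/// extract it is through `Spoofable<Host>`.
impl FromSpoofableParts for Host {
    fn from_spoofable_parts(parts: &Parts) -> Result<Self, String> {
        parts
            .headers
            .get("x-forwarded-host")
            .or_else(|| parts.headers.get("host"))
            .cloned()
            .map(Host)
            .ok_or_else(|| "no host found".to_string())
    }
}

/// Handler-like function: the `Spoofable(...)` pattern shows up in the signature.
pub fn handler(Spoofable(Host(host)): Spoofable<Host>) -> String {
    host
}
```

In this sketch a request carrying both `host` and `x-forwarded-host` resolves to the forwarded value, which is exactly the case the wrapper is meant to flag during review.
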
yanns commented 1 month ago

I've made one PoC so that we can better imagine how the API would be: https://github.com/tokio-rs/axum/pull/3000

@bengsparks could you also make one for the approach you're suggesting? It seems very interesting!

mladedav commented 1 month ago

Is it possible to add on either of the extractors something like Host::unspoofable_value(&self) -> Option<String>?

I don't think the host can be extracted from anything that cannot be spoofed. The scheme could theoretically be extracted from connect info, but as currently implemented it prefers the scheme the client originally used when the server is behind a proxy, i.e. it tries the proxy headers first, which might be what the user is interested in.

If we can only return values extracted from spoofable sources, I feel that destructuring is the nicer syntax of the current two options, but that's just my opinion. Getting rid of the Spoofable wrapper first also allows users to pass around Host in a type-safe manner, and we can implement Deref and Into for convenience. If we go with the first option, users would either have to call spoofable_value at every usage site or pass around a String. Implementing Into or Deref would completely circumvent forcing users to explicitly acknowledge the spoofable scenario, so those could never be added.
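
For comparison, the method-based option could look roughly like the sketch below. It is only an illustration under assumptions: the `Host` type, its `from_header` constructor, and `build_redirect` are hypothetical and not taken from either PR. The inner value is private, so every read goes through `spoofable_value`, and no `Deref` or `Into` is provided.

```rust
/// Hypothetical `Host` whose inner value is private, so callers cannot
/// read it without naming the risk at each usage site.
pub struct Host {
    value: String,
}

impl Host {
    /// Constructor standing in for the extractor (assumption for this sketch).
    pub fn from_header(raw: &str) -> Self {
        Host { value: raw.to_owned() }
    }

    /// The only accessor; its name is the acknowledgement.
    pub fn spoofable_value(&self) -> &str {
        &self.value
    }

    /// A value from a non-spoofable source. It is always `None` in this
    /// sketch because no such source is known for the host.
    pub fn unspoofable_value(&self) -> Option<String> {
        None
    }
}

/// Functions can still accept `Host` in a type-safe way, but must call
/// `spoofable_value` at the point where the string is actually used.
pub fn build_redirect(host: &Host, path: &str) -> String {
    format!("https://{}{}", host.spoofable_value(), path)
}
```

The trade-off is visible in `build_redirect`: the reminder appears deep inside the function body rather than in the handler signature.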

For completeness, would you be opposed to just having a spoofable-extractors feature that gates Host and Scheme in their current implementation? It would reduce the noise in handler signatures, and users would still have to opt in, although just once for all of them and not explicitly for each use. I guess the question is whether that is explicit enough.

jplatte commented 1 month ago

How about Host<WithProxyHeaders> and Host<WithoutProxyHeaders> as an alternative? I find "spoofable" sounds a bit awkward, and while the proxy thing may not sound as dangerous, it would still get people thinking.
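
A rough sketch of what such marker types could look like, using a plain `HashMap` of lowercase header names instead of axum's request types. The `HostSource` trait, the `extract` function, and the lookup order are assumptions made up for this illustration.

```rust
use std::collections::HashMap;
use std::marker::PhantomData;

/// Marker: proxy headers such as `X-Forwarded-Host` are consulted first.
pub struct WithProxyHeaders;
/// Marker: only the `Host` header is consulted.
pub struct WithoutProxyHeaders;

/// Lookup strategy selected by the marker type (hypothetical trait).
pub trait HostSource {
    fn resolve(headers: &HashMap<String, String>) -> Option<String>;
}

impl HostSource for WithProxyHeaders {
    fn resolve(headers: &HashMap<String, String>) -> Option<String> {
        headers
            .get("x-forwarded-host")
            .or_else(|| headers.get("host"))
            .cloned()
    }
}

impl HostSource for WithoutProxyHeaders {
    fn resolve(headers: &HashMap<String, String>) -> Option<String> {
        headers.get("host").cloned()
    }
}

/// The marker only exists at the type level; it carries no data.
pub struct Host<M> {
    pub value: String,
    _marker: PhantomData<M>,
}

impl<M: HostSource> Host<M> {
    pub fn extract(headers: &HashMap<String, String>) -> Option<Self> {
        M::resolve(headers).map(|value| Host { value, _marker: PhantomData })
    }
}
```

Both variants in this sketch still read client-supplied headers; the marker only documents which ones.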

yanns commented 1 month ago

I personally like having to change the usage site. I guess it would be very easy to write a function that takes a Scheme and forget about the risks of using it. Being forced to call spoofable_value makes sure that the person taking care of this particular implementation is reminded of the consequences.

mladedav commented 1 month ago

> How about Host<WithProxyHeaders> and Host<WithoutProxyHeaders>

I would see that as another dimension because both the proxy headers and the host header can be spoofed.

jplatte commented 1 month ago

Okay so I don't hate any of the options presented so far. They all seem a bit weird but that's almost the point, so not too surprising. @yanns, @mladedav if you can agree on a best solution, feel free to go ahead and merge the corresponding PR and close this issue.

mladedav commented 1 month ago

@yanns I personally see it as the extractor doing the "unsafe" operation. You ask for the Host and acknowledge that you know what you're doing and then you can pass that value around and use it however you want. Plus the potential for implementing some traits I mentioned before.

This should also be easier to migrate to (which might be bad since people won't have to think about every usage site of Host).

But if you really think it would be better to have users call the spoofable_value method every time, I'll yield (unless @bengsparks wants to argue for the Spoofable<T> wrapper more).

bengsparks commented 1 month ago

I personally like my solution for the fact that migration is simple and that the use-site is clearly visible.

I'd argue that outside of creating different extractors for reading different headers, there is no fool-proof way to achieve the desired goal. If a user doesn't want to read docs / ensure safety / take other precautions, then there is nothing to be done.

With my way, there is a handler-specific marker that calls for further inspection during code review instead of potentially being buried inside of the handler at any given position.

Suggestions and adjustments to the naming of Spoofable and other nomenclature in my PR are welcome :)

yanns commented 1 month ago

My only fear is about:

But I don't have strong opinion here. I guess I'm very (too) sensitive to developers not being careful about security...

yanns commented 3 weeks ago

I have the feeling that the community consensus is tending towards the Spoofable<T> wrapper. To make progress, let's go for this. It's still a step toward more security awareness.

yanns commented 3 weeks ago

My other comment is that the current Host extractor is misleading. Personally, I'd have used it assuming it only reads the Host header. By making it Spoofable, it's clearer that the value can be read from different headers.

sclu1034 commented 3 weeks ago

One thing to note is that while this does raise awareness, there is no way for the user to act on that knowledge. The user can choose to not use the extractor at all, but a far more valuable ability would be to selectively choose only the non-spoofable part, which is not possible, as far as I can tell.

Most notably, I don't see any way that the extractors could be used to implement things like the common TRUSTED_PROXIES setting, where the X-Forwarded-* headers are considered safe selectively based on the source IP address.
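
To illustrate what a TRUSTED_PROXIES-style check means in practice, here is a small sketch in which `X-Forwarded-Host` is honored only when the peer address is on an allow-list. The `TrustedProxies` type and `resolve_host` function are hypothetical, and headers are modeled as a plain map with lowercase names; none of this exists in axum.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Hypothetical configuration: peers whose forwarding headers are believed.
pub struct TrustedProxies {
    pub addrs: Vec<IpAddr>,
}

impl TrustedProxies {
    pub fn trusts(&self, peer: IpAddr) -> bool {
        self.addrs.contains(&peer)
    }
}

/// Resolve the host, honoring `x-forwarded-host` only for trusted peers.
/// Untrusted peers always fall back to the `host` header.
pub fn resolve_host(
    headers: &HashMap<String, String>,
    peer: IpAddr,
    trusted: &TrustedProxies,
) -> Option<String> {
    if trusted.trusts(peer) {
        if let Some(forwarded) = headers.get("x-forwarded-host") {
            return Some(forwarded.clone());
        }
    }
    headers.get("host").cloned()
}
```

The point of the sketch is that the decision needs the connection's source address, which the extractors discussed here do not take into account.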

yanns commented 2 weeks ago

@sclu1034 I agree with you. What kind of approach would you see here? Should we change those extractors to:

sclu1034 commented 2 weeks ago

Ideally, it would be a solution where the logic for deciding whether a connection can be trusted has to be implemented only once, rather than again for every extractor or handler.

So I think some kind of middleware that can dynamically strip headers would be best. At that point, the extractors could stay as they are, and simply act as if that header was never sent. And a centralized solution like that would also ensure that manual access to these headers can be trusted likewise.
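
The core of such a middleware could be a single function that runs once per request, before any extractor sees the headers. The sketch below is framework-independent and rests on assumptions: the function name, the list of headers, and the lowercase `HashMap` representation are all illustrative, and wiring it into a Tower layer is left out.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Headers removed for untrusted peers (an illustrative, not exhaustive, list).
const FORWARDING_HEADERS: [&str; 4] = [
    "forwarded",
    "x-forwarded-for",
    "x-forwarded-host",
    "x-forwarded-proto",
];

/// Remove forwarding headers unless the peer is a trusted proxy.
/// Returns how many headers were removed.
pub fn strip_untrusted_forwarding_headers(
    headers: &mut HashMap<String, String>,
    peer: IpAddr,
    trusted_proxies: &[IpAddr],
) -> usize {
    if trusted_proxies.contains(&peer) {
        // Trusted proxy: leave the headers untouched.
        return 0;
    }
    let before = headers.len();
    headers.retain(|name, _| !FORWARDING_HEADERS.contains(&name.as_str()));
    before - headers.len()
}
```

With the headers stripped up front, extractors and any manual header access downstream behave as if the client had never sent them.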

bengsparks commented 2 weeks ago

+1 to the middleware suggestion by @sclu1034; the best solution would be for such headers to simply never arrive. I was about to cite this comment in support, but then I realised that they're also the author thereof 😄