clj-commons / aleph

Asynchronous streaming communication for Clojure - web server, web client, and raw TCP/UDP
http://aleph.io
MIT License
2.54k stars 241 forks source link

Do not remove Expect header from the request #686

Closed arnaudgeiser closed 1 year ago

arnaudgeiser commented 1 year ago

Description

I don't get exactly what was the rationale about removing it from the request, but we have a use case where we need to have access to this to take a decision. [1] My proposition is to keep it on the request.

[1] : https://datatracker.ietf.org/doc/html/rfc7231#section-5.1.1

KingMob commented 1 year ago

(Pinging @dschobel, the original author of that Netty Expect: 100-Continue code. Perhaps they can shed some light on this.)

I agree the Expect header should ideally be left in, but that presents a minor problem. Oleksii may have been copying Netty, which also removes the header, so depending on the pipeline, it may or may not get stripped out.

In particular, both HttpServerExpectContinueHandler and the HttpObjectAggregator remove the header, though interestingly, Netty didn't start removing the Expect header until 2017, when HttpServerExpectContinueHandler was created with that behavior, and HttpObjectAggregator was updated to match it.

I checked the git history, but there's nothing written in the commits or comments explaining why the header is removed.

My guess is it prevents HttpObjectAggregator and HttpServerExpectContinueHandler from both attempting to send a 100-Continue response when they're on the pipeline together. We have this exact same setup when there's a max-request-body-size, since that means HttpObjectAggregator is added to the pipeline. Without stripping the header, they'd both send a 100-Continue response.

I asked the Netty Discord, and they had the same theory:

I imagine it is because they are processing the Expect, and so any subsequent handlers don't have to deal with them. Same as if a "Transfer-Encoding" handler ungzipped the gzipped content, you would expect the "TE: gzip" header to be removed.

I'm assuming this isn't for a decision about how to respond to the Expect, though, since the continue-handler is already the place for that. Would adding a custom header there work? (E.g. add x-sent-expect-header: true for looking at in the main handler?)

KingMob commented 1 year ago

(Also, if you add a custom header, X-Pect is too cool to pass up.)

arnaudgeiser commented 1 year ago

I agree the Expect header should ideally be left in, but that presents a minor problem. Oleksii may have been copying Netty, which also removes the header, so depending on the pipeline, it may or may not get stripped out.

Actually, my modification is not working, probably because of this, it got removed by Netty eventually. I guess I will eventually close my PR but I would like to keep an history of the discussion.

I'm assuming this isn't for a decision about how to respond to the Expect, though, since the continue-handler is already the place for that. Would adding a custom header there work? (E.g. add x-sent-expect-header: true for looking at in the main handler?)

Unfortunately not, it's a pretty special use case tied to the S3 protocol. When performing authentification on S3 through the V4 signature, some headers are signed on the authorization, here is an example :

Authorization: AWS4-HMAC-SHA256 Credential=EXO1d6c602ace788c63e898b3ab/20230803/ch-dk-2/s3/aws4_request, SignedHeaders=content-length;content-md5;expect;host;x-amz-content-sha256;x-amz-date, Signature=bdc694f2faa22a4b1350cda72fd12c6250d35da0a1087aba7c16e5ad6292e561

As we lost the Expect header when computing the signature server side, we end up with an invalid signature and we reject the customer request.

There is an easy workaround though. As soon as we see the expect header present on the SignedHeaders, we can consider it was 100-continue value and we'll be able to compute the correct signature.

KingMob commented 1 year ago

Well, the good news is, afaik, expect has no other legitimate value, and probably won't ever get new ones.

The bad news, under HTTP/1, is that the value "100-continue" is case-insensitive, according to the RFC, so if S3 switches, or another "S3-compatible" service uses a different case, the signature would be off. Unlikely, but who knows?

More generally, Netty doesn't seem set up to preserve headers like this for computing signatures; it treats the removal of a header as a way of saying it's been processed, which is a problem for any header needed more than once.

A more general solution is to insert a user-supplied handler in the pipeline immediately after HttpServerCodec that can examine the headers and store a copy, if necessary (we wouldn't want to do this for all requests). Not sure if it's worth a config parameter (e.g., like :original-headers-handler), or we should suggest people use pipeline-transform for this niche case.

arnaudgeiser commented 1 year ago

The bad news, under HTTP/1, is that the value "100-continue" is case-insensitive, according to the RFC, so if S3 switches, or another "S3-compatible" service uses a different case, the signature would be off. Unlikely, but who knows?

Yes, the fact that expect got signed is already a niche case. If eventually, a new S3-compatible client shows up with an uppercase expect field, we might reconsider.

A more general solution is to insert a user-supplied handler in the pipeline immediately after HttpServerCodec that can examine the headers and store a copy, if necessary (we wouldn't want to do this for all requests). Not sure if it's worth a config parameter (e.g., like :original-headers-handler), or we should suggest people use pipeline-transform for this niche case.

Yes, it will be my way to go if it starts being an issue. But anyway, it will manifest as a special casing in the code since we'll have to keep a mapping between X-Pect and expect.

arnaudgeiser commented 1 year ago

The bad news, under HTTP/1, is that the value "100-continue" is case-insensitive, according to the RFC, so if S3 switches, or another "S3-compatible" service uses a different case, the signature would be off. Unlikely, but who knows?

Ok, we got bitten by that sooner than expected.

A more general solution is to insert a user-supplied handler in the pipeline immediately after HttpServerCodec that can examine the headers and store a copy, if necessary (we wouldn't want to do this for all requests). Not sure if it's worth a config parameter (e.g., like :original-headers-handler), or we should suggest people use pipeline-transform for this niche case.

So be it, thanks for the pointer.

KingMob commented 1 year ago

Sorry to hear that. Need any help with it?

Another possibility is to just remove the HttpServerExpectContinueHandler from the pipeline, and handle it all yourself. (Might be tricker if the HttpObjectAggregator is added to the pipeline, since that does way more, but also replies to Expect headers.)

Subclassing HttpServerExpectContinueHandler might be pretty easy too.

arnaudgeiser commented 1 year ago

I did implement a solution that works as expected, here is the main part.

public class ExpectHeaderHandler extends ChannelInboundHandlerAdapter {

    private static final String X_PECT_HEADER = "X-Pect";

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
        if(msg instanceof HttpRequest req) {
            HttpHeaders headers = req.headers();
            String expectHeader = headers.get(HttpHeaderNames.EXPECT);
            if(expectHeader!=null) {
                headers.set(X_PECT_HEADER, expectHeader);
            }
        }
        super.channelRead(ctx, msg);
    }
}
(defn expect-header-handler [^ChannelPipeline pipeline]
  (.addBefore pipeline
              "continue-handler"
              "expect-header-handler"
              (ExpectHeaderHandler.)))

(http/start-server handler {:pipeline-transform expect-header-handler ...}

With that in place, I have then the :x-pect header available on the headers from the request.

Another possibility is to just remove the HttpServerExpectContinueHandler from the pipeline, and handle it all yourself. (Might be tricker if the HttpObjectAggregator is added to the pipeline, since that does way more, but also replies to Expect headers.)

We are not using HttpObjectAggregator, I think it might be cleaner if we remove the continue-handler from the pipeline and we just copy-paste the HttpServerExpectContinueHandler where we ditch the Expect header removal.

Sorry to hear that. Need any help with it?

So far so good, thank you for the pointers!

KingMob commented 1 year ago

Hmmm, it doesn't look like your solution handles case changes in the header name itself, though, just the value. Switching from "Expect" to "expect" may break. If the header name is proper-cased in HTTP/1, I would expect (pun intended) this to break as soon as you upgrade to HTTP/2+.

arnaudgeiser commented 1 year ago

It's fine, on the S3 specification, the header names are supposed to be in lowercase anyway. [1]

[1] : https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html