omjadas / hudsucker

Intercepting HTTP/S proxy
https://crates.io/crates/hudsucker
Apache License 2.0

How to do something after streaming body? #43

Closed: nlevitt closed this issue 1 year ago

nlevitt commented 1 year ago

This is a noob question, not really an issue. I'm trying to stream a response and then do some work once the stream finishes. Specifically (for now) I'm computing a SHA-256 of the response body and need to call finalize() at the end. Here's one attempt to do that: https://github.com/nlevitt/warcprox-rs/blob/master/src/main.rs You can run this with cargo run and, in another terminal, execute for example curl -k -gvsS --proxy http://127.0.0.1:8000 https://example.com/. The problem is that the None case in poll_next() is never reached.

It would be even nicer to be able to chain some sort of finally onto the end of the body stream, but I'm not sure whether that's possible. https://github.com/nlevitt/warcprox-rs/blob/stream-combinators/src/main.rs#L63

omjadas commented 1 year ago

What I imagine might be happening is that hyper is using the content-length header under the hood to determine when to stop polling the body. If you were to inspect the content-length header yourself and keep track of how many bytes have been streamed you should be able to determine when all the bytes have been sent.
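The byte-counting idea above could be sketched roughly as follows. This is a minimal, std-only illustration: `LengthTracker` and its methods are hypothetical names, not part of hudsucker or hyper, and the sketch assumes the content-length value has already been read from the response headers as a string.

```rust
/// Tracks how many body bytes have been seen against a declared content-length.
/// Hypothetical helper for illustration; not a hudsucker or hyper type.
#[derive(Debug)]
pub struct LengthTracker {
    expected: Option<u64>,
    seen: u64,
}

impl LengthTracker {
    /// `header` is the raw content-length value, if the response had one.
    pub fn new(header: Option<&str>) -> Self {
        // An absent or unparsable header is treated as "length unknown".
        let expected = header.and_then(|v| v.trim().parse::<u64>().ok());
        LengthTracker { expected, seen: 0 }
    }

    /// Records one chunk and returns true once the declared length is reached.
    pub fn record(&mut self, chunk: &[u8]) -> bool {
        self.seen += chunk.len() as u64;
        self.is_complete()
    }

    /// True only when a length was declared and at least that many bytes were seen.
    pub fn is_complete(&self) -> bool {
        matches!(self.expected, Some(n) if self.seen >= n)
    }
}
```

In a stream wrapper, `record` would be called for each chunk yielded by poll_next(), and the end-of-body work would run the first time it returns true. With no declared length it never returns true, so the None case would still need handling.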

nlevitt commented 1 year ago

Oh, that's a good idea. I can confirm that the None case is called when the response has no content-length header, tested with

curl -k -gvsS --proxy http://127.0.0.1:8000 https://httpbin.org/stream-bytes/10

I also figured out how to do a sort of finally combinator, based on a suggestion someone made on Discord:

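        // Wrap the upstream body so each chunk can be hashed as it passes through.
        let body = Body::wrap_stream({
            let mut sha256 = Sha256::new();

            BodyStream(body)
                // Feed every chunk into the hasher, then forward it unchanged.
                .map_ok(move |buf| {
                    info!("{:?}", buf);
                    sha256.update(&buf);
                    buf
                })
                // Append one final empty chunk to act as the "finally" hook.
                .chain(once(async {
                    info!("finished the stream");
                    Ok(Bytes::new())
                }))
        });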

Unfortunately this suffers from the same problem as poll_next. The chained stream never gets polled if there's a content-length on the response.

I'd rather avoid basing logic on the content-length if possible: to be fully correct it would have to match hyper's handling of corner cases exactly, which is both error-prone and redundant.

So I had another idea that does seem to work in every case: implementing Drop on my stream type. https://github.com/nlevitt/warcprox-rs/blob/a0339eec/src/main.rs#L66
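The Drop approach can be illustrated with a small std-only sketch. It is not the code from the linked commit: `FinishOnDrop` and `demo` are hypothetical names, a plain Iterator stands in for the async body stream, and a byte count stands in for the SHA-256 state. In a real proxy the callback would call finalize() on the hasher and would need to be Send.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Wraps an iterator of chunks and runs `on_done` exactly once when dropped.
/// Hypothetical type; an Iterator stands in for the body stream.
pub struct FinishOnDrop<I, F: FnOnce(u64)> {
    inner: I,
    total: u64,
    on_done: Option<F>,
}

impl<I, F: FnOnce(u64)> FinishOnDrop<I, F> {
    pub fn new(inner: I, on_done: F) -> Self {
        FinishOnDrop { inner, total: 0, on_done: Some(on_done) }
    }
}

impl<I: Iterator<Item = Vec<u8>>, F: FnOnce(u64)> Iterator for FinishOnDrop<I, F> {
    type Item = Vec<u8>;

    fn next(&mut self) -> Option<Vec<u8>> {
        // Account for each chunk, then pass it through unchanged.
        let chunk = self.inner.next()?;
        self.total += chunk.len() as u64;
        Some(chunk)
    }
}

impl<I, F: FnOnce(u64)> Drop for FinishOnDrop<I, F> {
    fn drop(&mut self) {
        // Runs whether or not the consumer ever observed the end of the stream.
        if let Some(f) = self.on_done.take() {
            f(self.total);
        }
    }
}

/// Consumes both chunks but never asks for the terminating None,
/// then drops the wrapper and returns what the callback reported.
pub fn demo() -> Option<u64> {
    let reported = Rc::new(RefCell::new(None));
    let sink = Rc::clone(&reported);
    let chunks = vec![b"hello".to_vec(), b"world".to_vec()];
    let mut body = FinishOnDrop::new(chunks.into_iter(), move |n| {
        *sink.borrow_mut() = Some(n);
    });
    body.next();
    body.next();
    drop(body);
    let result = *reported.borrow();
    result
}
```

Because the callback lives in Drop, it does not depend on the consumer polling through to the end of the stream, which is the property that matters when a content-length is present.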

omjadas commented 1 year ago

You should also be able to delete the content-length header from the response in your handler, which will cause hyper to stream the body to the client. In this case, hyper should exhaust the stream, and you wouldn't require the custom Drop implementation.

nlevitt commented 1 year ago

I haven't tested this, but I think that deleting the content-length header may cause problems with persistent connections (multiple requests on the same TCP connection). I'm happy with Drop for now. My impression is that using Drop is idiomatic Rust.