hyperium / hyper

An HTTP library for Rust
https://hyper.rs
MIT License

Service API makes request outcomes not observable to applications #2181

Open Matthias247 opened 4 years ago

Matthias247 commented 4 years ago

I did some exploration work with Hyper in the last weeks, and tried to model some typical application workflows with it.

Compared to other (web) frameworks, I run into a limitation that I would like to explain:

Hyper's request handler has the signature:

async fn handler(req: Request<Body>) -> Result<Response<Body>, Error>

A handler is invoked by the Hyper server when a new request is received, and contains the application code. However, not all application code runs inside the scope of this function:

If the handler wants to send a response body back to the user, it merely returns "instructions on how to send the body", encoded in the Body type. The actual transmission of the Body is performed inside Hyper, outside of the request handler. While that doesn't seem to matter at first, since Hyper will do a good job of driving the request to completion, it has some impact on the observability of the outcome of the request.

I will provide some examples here:

Logging the outcome of a request

A basic requirement that most services have is logging the outcome of a request - mostly in the form of a metric. This usually happens by doing something along the lines of:

async fn handler(req: Request<Body>) -> Result<Response<Body>, Error> {
    let result = do_all_the_business_logic().await;
    metrics.set("success", result.is_ok());
    result
}

or a try/catch/finally block for languages which support it:

Response handler(Request req) {
    try {
         Response response = do_all_the_business_logic();
         metrics.set("success", true);
         return response;
    } catch (Exception e) {
         metrics.set("success", false);
         throw e;
    }
}

This does not work out with Hyper's current Service functions, since the outcome of sending the response is only known to Hyper. For small requests it might be easy to assume that sending will mostly succeed, but for large bodies (GB sizes) there is actually a high probability that the connection breaks down before everything is transmitted.

There might also be clients which forcefully abort the request, or issues in the HTTP stack that lead to a preemption. In all cases we want to track the correct result.

Logging the timing of a request

Another common requirement for request handlers is to log the time they needed to execute, in order to check for any issues in their implementation (or downstream services).

One easy way to do this if the handler contains all request logic is:

Response handler(Request req) {
    var startTime = Clock.now();
    try {
        return do_all_the_business_logic();
    } finally {
        metrics.set("time", startTime.elapsed());
    }
}

This is not possible using the current Hyper service API, since the request's processing ends outside of the scope of the application's request handler. If we integrated timing like this, it would mostly measure the time-to-first-byte, but not the time-to-last-byte.
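Translated to Rust's current handler signature (a simplified, synchronous sketch with a placeholder do_all_the_business_logic standing in for real work), such a measurement stops as soon as the Response value is produced:

```rust
use std::time::Instant;

// Placeholder for real work; in an actual service this would be async
// and return a Response.
fn do_all_the_business_logic() -> Result<String, String> {
    Ok("response head".to_string())
}

fn handler() -> Result<String, String> {
    let start = Instant::now();
    let result = do_all_the_business_logic();
    // With the current API this measurement ends when the handler
    // returns, i.e. roughly at time-to-first-byte; the body is still
    // transmitted by Hyper afterwards, outside this scope.
    println!("time: {:?}", start.elapsed());
    result
}
```

Everything Hyper does with the Body afterwards - which for large responses is the bulk of the transfer - falls outside the measured interval.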

Logging the amount of transmitted bytes

Another common requirement is to be able to log how many bytes of a response body had been passed into the socket, in order to better diagnose what led a request to fail. Failures after lots of bytes have been transferred at high speed typically have different root causes than ones which occur immediately.

Since Hyper pulls the bytes outside of the handler, the application cannot observe the send status inside the handler.

Logging errors when sending the response body fails

Sending a response can fail for a variety of reasons, especially when big bodies are used.

We typically want to keep track of the error reasons in order to troubleshoot issues. This is however not possible using the current Hyper APIs, since the error is only visible to Hyper and not forwarded to the application.

Workarounds

Before jumping into improvements, I want to mention that there exists a workaround which at least partially addresses these concerns:

Custom response stream

I think there currently exists a workaround for this, which brings some visibility back to the application: it could implement Body as a custom Stream, and e.g. emit metrics when the Stream is dropped. At this point the application knows that no more bytes will get sent.

If this happens before the Stream was read to completion, the application is aware that the request was cancelled.

For the request outcome, the application will get an approximation of the outcome. It knows whether Hyper pulled all bytes, but not whether sending the last chunk succeeded. This might be a bad approximation for small responses, since the complete response fits into a single chunk.

For latency metrics, it will also be an approximation, since sending/flushing the last chunk of data would not be covered.

The exact transport error would however still be unknown to the application.
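A std-only sketch of this workaround, with metrics emitted from Drop. Assumptions: in real code the wrapper would implement futures::Stream (or hyper's HttpBody) and be handed to Body::wrap_stream; a plain Iterator stands in here so the sketch stays self-contained:

```rust
use std::time::Instant;

// Wrapper around a chunk source that records what was pulled and emits
// metrics when it is dropped - i.e. when no more bytes will be sent.
struct MeteredChunks<I> {
    inner: I,
    bytes_pulled: usize,
    exhausted: bool,
    start: Instant,
}

impl<I> MeteredChunks<I> {
    fn new(inner: I) -> Self {
        MeteredChunks {
            inner,
            bytes_pulled: 0,
            exhausted: false,
            start: Instant::now(),
        }
    }
}

impl<I: Iterator<Item = Vec<u8>>> Iterator for MeteredChunks<I> {
    type Item = Vec<u8>;

    fn next(&mut self) -> Option<Vec<u8>> {
        match self.inner.next() {
            Some(chunk) => {
                self.bytes_pulled += chunk.len();
                Some(chunk)
            }
            None => {
                // The consumer pulled everything we had to offer.
                self.exhausted = true;
                None
            }
        }
    }
}

impl<I> Drop for MeteredChunks<I> {
    fn drop(&mut self) {
        // The only available signals: whether every chunk was pulled,
        // how many bytes, and how long it took. Whether the last chunk
        // actually reached the socket remains unknown.
        println!(
            "bytes pulled: {}, pulled to end: {}, elapsed: {:?}",
            self.bytes_pulled, self.exhausted, self.start.elapsed()
        );
    }
}
```

If `exhausted` is false at drop time, the request was cancelled before the body was read to completion - which is exactly the approximation described above.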

Improvement ideas

I think the best way to address these issues is by having an API which makes sure that the complete handling of the request occurs within the request handler.

This can e.g. be achieved by passing a type into the handler which allows the handler to actively send the response, instead of returning the response. Go's HTTP Handler callback makes use of such a mechanism:

type Handler interface {
    ServeHTTP(ResponseWriter, *Request)
}

func (f HandlerFunc) ServeHTTP(w ResponseWriter, r *Request)

Here the handler directly calls w.Write(bodyChunk) until the request is finished. Since every write call returns an error (which can be derived from an error that a lower layer returned), the outcome of all writes will be observable to the handler.

And since the handler would not return until all data is sent, adding metrics around success and timing becomes easy in a central place again.

In Rust terminology, the type which needs to be passed to the handler would be a type which provides methods for sending the response head and body chunks, with each call returning a Result so that write errors are observable to the handler.
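A hypothetical Rust equivalent of such a response writer - all names here are invented for illustration, this is not an existing hyper API, and it is kept synchronous with an in-memory mock so the sketch stays self-contained:

```rust
use std::io;

// Hypothetical response-writer API. Every send returns a Result, so
// the outcome of each write is visible to the handler, as with Go's
// http.ResponseWriter.
trait ResponseWriter {
    fn send_headers(&mut self, status: u16) -> io::Result<()>;
    fn send_data(&mut self, chunk: &[u8]) -> io::Result<()>;
}

// In-memory writer standing in for a real connection.
struct MockWriter {
    status: Option<u16>,
    bytes: Vec<u8>,
}

impl ResponseWriter for MockWriter {
    fn send_headers(&mut self, status: u16) -> io::Result<()> {
        self.status = Some(status);
        Ok(())
    }

    fn send_data(&mut self, chunk: &[u8]) -> io::Result<()> {
        self.bytes.extend_from_slice(chunk);
        Ok(())
    }
}

// Because the handler itself drives the whole transmission, success,
// timing and byte counts can all be recorded in one place, after the
// last byte was handed to the writer.
fn handler<W: ResponseWriter>(w: &mut W) -> io::Result<usize> {
    w.send_headers(200)?;
    let mut sent = 0;
    for chunk in [b"hello ".as_slice(), b"world".as_slice()] {
        w.send_data(chunk)?;
        sent += chunk.len();
    }
    Ok(sent) // metrics could be emitted here, after the last write
}
```

In the real async case these methods would return futures, but the observability property is the same: every write outcome flows back into the handler's scope.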

Matthias247 commented 4 years ago

Additional patterns enabled through handling the whole request inside a handler

Besides observability, being able to run the whole request inside the handler also enables some other patterns. One of those is limiting the number of concurrent requests on a handler.

In order to do this, we can implement a handler as:

void abc_handler(Request req, ResponseWriter res) {
    if (!abc_semaphore.try_acquire(1)) {
        res.sendHeaders(status: 500, body: "Limit exceeded");
        return;
    }

    try {
        do_all_the_business_logic(req, res);
    } finally {
        abc_semaphore.release(1);
    }
}

In Rust this would of course use guards/destructors instead of finally blocks :-)

In order to implement the same functionality using the current handler signature, we would need to store the permit inside the Body - in order to make sure it only gets released when the request has run to completion.
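A std-only sketch of that guard-based approach with the current signature. Assumptions: a plain atomic counter stands in for a real async semaphore (such as tokio::sync::Semaphore), and GuardedBody stands in for a custom Body wrapper:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Stand-in for a real async semaphore, kept std-only so the sketch is
// self-contained.
struct Limiter {
    active: AtomicUsize,
    max: usize,
}

// Guard whose destructor releases the permit - the Rust equivalent of
// the `finally` block above.
struct Permit {
    limiter: Arc<Limiter>,
}

impl Limiter {
    fn new(max: usize) -> Arc<Limiter> {
        Arc::new(Limiter { active: AtomicUsize::new(0), max })
    }

    fn try_acquire(limiter: &Arc<Limiter>) -> Option<Permit> {
        // Optimistically increment, back out if over the limit.
        if limiter.active.fetch_add(1, Ordering::SeqCst) < limiter.max {
            Some(Permit { limiter: Arc::clone(limiter) })
        } else {
            limiter.active.fetch_sub(1, Ordering::SeqCst);
            None
        }
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.limiter.active.fetch_sub(1, Ordering::SeqCst);
    }
}

// With the current handler signature the permit has to live next to
// the body data, so it is only released once the whole response - not
// just the handler - is done.
struct GuardedBody {
    data: Vec<u8>,
    _permit: Permit,
}
```

The point of storing the permit in the body is that the handler returns long before transmission finishes; tying the permit's lifetime to the body ties the concurrency slot to the actual end of the request.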

malaire commented 3 years ago

I'd love to have this feature.

Another workaround that doesn't work is wrapping the TcpStream with a custom stream - it seems impossible to know which poll_write is the last one. Even after the body is dropped there can still be multiple successful poll_write calls before the response has been fully handled.

malaire commented 3 years ago

More testing shows this to be completely wrong:

I think currently there exists a workaround for this, which brings some visibility back to the application: It could implement Body as a custom Stream, and e.g. emit metrics when the Stream is destructed. At this point the application knows that no more bytes will get sent.

Even with a large 1 MB response body, the body is dropped before even a single byte is written to the TcpStream - all writes happen after the body is dropped.

So I consider this suggested feature to be essential for hyper, because currently there is no way to get statistics like time-to-last-byte.

malaire commented 3 years ago

Related problem: when the request handler is called, hyper has already read the request headers. This means the request handler doesn't see the real beginning of request handling, which complicates trying to get a bytes-in count for logging.

The API proposed here doesn't fix this problem.

gabotechs commented 1 year ago

I am facing this same problem currently, I managed to track the response stream with the following approach:

fn monitor_body(mut body: Body) -> Body {
    // Requires `use hyper::body::HttpBody;` for `body.data()` (hyper 0.14).
    let (mut sender, new_body) = Body::channel();

    tokio::spawn(async move {
        let mut count = 0;
        while let Some(chunk) = body.data().await {
            match chunk {
                Ok(data) => {
                    count += data.len();
                    if sender.send_data(data).await.is_err() {
                        return;
                    }
                    println!("{count} bytes transferred");
                }
                Err(_err) => return sender.abort(),
            }
        }
    });

    new_body
}

fn transform_response(res: Response<Body>) -> Response<Body> {
    let (parts, body) = res.into_parts();
    Response::from_parts(parts, monitor_body(body))
}

It replaces the Body with a new one created from a channel, that way I can stream the contents of the original Body to the new one myself, and keep track of the usage there.

This still has some drawbacks though:

For me, tests are passing, but I wonder if this is a viable solution in the long term.