net/http/httptrace: add ServerTrace hooks

joeshaw commented 7 years ago

Problem

A very common pattern in Go HTTP servers is to implement an http.ResponseWriter that wraps another http.ResponseWriter and captures the status code. This is often used for logging and metrics collection.

For example,

type statusCaptureWriter struct {
    http.ResponseWriter
    status int
}

func (scw *statusCaptureWriter) WriteHeader(status int) {
    scw.status = status
    scw.ResponseWriter.WriteHeader(status)
}

type loggedHandler struct {
    handler http.Handler
    logger SomeLogger
}

func (h *loggedHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    scw := &statusCaptureWriter{ResponseWriter: w}
    h.handler.ServeHTTP(scw, r)
    h.logger.Logf("status=%d ...", scw.status)
}

I've written something like this a bunch of times. You can find examples in nearly every Go web framework out there. One example is https://github.com/urfave/negroni/blob/master/response_writer.go#L13-L26.

There are some issues with this approach. For instance, my statusCaptureWriter doesn't implement other interfaces like http.Flusher, http.CloseNotifier or http.Pusher. I can't determine at compile time whether the underlying http.ResponseWriter implementation implements any of these interfaces, so if I choose to implement them I might lie to callers at higher levels of the stack and inadvertently break things. (This particularly a problem with CloseNotifier.)

Proposal (rejected, see below)

I'd like to propose an additional interface, http.Statuser (better name welcome) that exposes the status code within a http.ResponseWriter implementation. The internal http.(*response) implementation already tracks the status code written, so this can just be exposed and it will automatically implement this interface.

Software could avoid wrapping the http.ResponseWriter by instead type asserting it to http.Statuser and getting the status as needed there. (And it could optionally continue to wrap the ResponseWriter as needed until this is widely deployed.)

type loggedHandler struct {
    handler http.Handler
    logger SomeLogger
}

func (h *loggedHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    // Could check for http.Statuser implementation here and wrap http.ResponseWriter 
    // if necessary, but omitting for brevity

    h.handler.ServeHTTP(w, r)

    status := 0
    if s, ok := w.(http.Statuser); ok {
        status = s.Status()
    }

    h.logger.Logf("status=%d ...", status)
}

Alternative proposal

Implement an httptrace.ServerTrace struct that is analogous to the ClientTrace already there. See https://github.com/golang/go/issues/18997#issuecomment-279611928 for more info.

joeshaw commented 7 years ago

More examples:

https://github.com/gobuffalo/buffalo/blob/5b5e01469cfd064cad0a56b1eee4728a41d2ec46/response.go

https://github.com/gin-gonic/gin/blob/2dae550eb5392006a4582ce9c90016a9b5a74e8b/response_writer.go#L19-L48

https://github.com/goji/glogrus2/blob/master/writer_proxy.go

https://github.com/rikonor/go-status-code-tracker

bradfitz commented 7 years ago

These exist apparently, but why?

Code should know what it did previously.

joeshaw commented 7 years ago

@bradfitz The example I posted is a simplified instance: you create a wrapper that handles logging of all requests and then the individual handlers don't need to do or understand logging themselves. They are free to just call w.WriteHeader(http.StatusOK) and things get logged at a higher level. This plays nicely with middleware, too.

Other things that might get logged there are the URL/path, the duration of the call, etc. It would be a pain to have to do that at essentially every exit site of every middleware or handler. I think it's also a valid separation of concerns.

joeshaw commented 7 years ago

Another example beyond logging is collection of metrics. You need to know when you serve 5xx errors, for instance. Again, you don't necessarily want to be doing this everywhere it happens (and in the case of panics, I'm not sure offhand it's possible.)

bradfitz commented 7 years ago

Rather than a dozen optional methods, then, I'd prefer to see something like a State() method that returned a concrete struct type to which we could add fields over time.

gopherbot commented 7 years ago

CL https://golang.org/cl/36647 mentions this issue.

urandom commented 7 years ago

Given that the primary target of this feature is logging, I suggest that the length of the response body would be a good addition to the output of State()

bradfitz commented 7 years ago

What if we just provide one canonical ResponseWriter wrapper instead?

And if this is about logging, we should (re-)consider the HTTP logging issues/proposals.

joeshaw commented 7 years ago

@bradfitz the thing that worries me about providing a wrapper type is also wrapping the other interfaces that may optionally be supported by the underlying type. For instance, http.Pusher is something that's only supported by the http2 ResponseWriters. This wrapper type either has to eschew support for Pusher or it has to "lie" about supporting it.

tombergan commented 7 years ago

I've written a wrapper type like this as well, except instead of just logging the response code, my wrapper type also logged the number of response body bytes written, an md5sum of the response body, the time between the first and last Write call, and a few other miscellaneous things. Skimming through the examples in this comment, I see that I'm not alone -- many of the wrappers provide more than just the response code. I worry that adding types like http.Statuser is an endless game of whack-a-mole. Even with Brad's State() idea, I wouldn't expect the http library to compute an md5sum of the response body.

@joeshaw, have you considered designing a server-side version of httptrace? We considered this at one time, but never worked through the details. Also see #18095 and #3344.

These exist apparently, but why? Code should know what it did previously.

It is convenient to put all logging code in one place. One way to do this is by wrapping the http.ResponseWriter. This approach becomes more compelling when the logger tracks a bunch of things such as timing info or a hash of the response body, rather than just the response code (which is easy to remember by itself).

joeshaw commented 7 years ago

@tombergan i hadn't considered an httptrace-equivalent, but perhaps that is a better way to go in order to support different use cases like hashing response bodies.

gopherbot commented 7 years ago

CL https://golang.org/cl/37010 mentions this issue.

joeshaw commented 7 years ago

I just uploaded a CL that has the very start of this by introducing an httputil.ServerTrace struct. At the moment it only implements a WroteHeader hook. It'd be nice if people could take a look and give an opinion on the general approach.

I expect to add StartHandler and EndHandler hooks as well. Since it's implemented in the server, it could contain connection-level hooks as well.

One things that's a little strange about it is that the data isn't accessible from the handlers, which I imagine could be a problem if logging code wanted to include additional request-specific data. One possibility might be to pass the http.Request into the hooks, have handlers add values to the req.Context and have an EndHandler hook extract and log there?

tombergan commented 7 years ago

I think you're right that the server has both connection-level events and request-level events, and those may need to be handled separately. We could start by supporting request-level events only, since that seems to be what is wanted the most.

A nice thing about ClientTrace is that it's easy to create a new ClientTrace struct for each request. For example, here's how I use ClientTrace:

info := &reqInfo{}
req = req.WithContext(httptrace.WithClientTrace(req.Context(), &httptrace.ClientTrace{
  GotConn: info.GotConn,
  ...
}))
transport.RoundTrip(req)
...
log(info)

This lets me accumulate all info about a single request into a single struct (reqInfo). It would be ideal if I could a similar pattern for server traces. I think we could do this be changing your proposal slightly:

packge http
type Server struct {
  ...
  NewTrace func() *httptrace.ServerTrace   // optional callback to create a trace for a new request
}

One things that's a little strange about it is that the data isn't accessible from the handlers, which I imagine could be a problem if logging code wanted to include additional request-specific data. One possibility might be to pass the http.Request into the hooks, have handlers add values to the req.Context and have an EndHandler hook extract and log there?

That would work. Another idea is to add an opaquely-typed field to ServerTrace that can be set by NewTrace. Perhaps UserData interface{}. Then, the request handler could extract the trace info using httptrace.ContextServerTrace(req.Context()).

Merovius commented 7 years ago

@joeshaw It is theoretically possible to dynamically decide, whether to wrap pusher or not:

func wrap(wrap, with http.ResponseWriter) http.ResponseWriter {
    if p, ok := wrap.(http.Pusher); ok {
        return struct {
            http.ResponseWriter
            http.Pusher
        }{ with, p }
    }
    return with
}

The trouble is a) the combinatoric explosion. You have to provide a branch for every combination of interface that could exist. And b) that the wrapping code must know about all possible optional interfaces. You can get around these by providing generic wrapping-code, but that also only solves the issue for all the optional interfaces in net/http itself, not potential third-party packages using the same trick.

This is why I'm so troubled by "optional interfaces" and the fact that they are used in net/http so pervasively. It's a very leaky abstraction and sets a bad example.

joeshaw commented 7 years ago

@Merovius wow! that is a very interesting hack, and one i hadn't ever considered. I agree that it's not scalable, but thanks for pointing that out.

bradfitz commented 7 years ago

@tombergan,

Then, the request handler could extract the trace info using httptrace.ContextServerTrace(req.Context())

Putting things on the context might be interesting. I don't know whether that crosses the line of acceptable uses of context, though. I'm fine with it, but I think others might balk. I think it's better than more optional interfaces at least.

rsc commented 7 years ago

Context makes some sense for client tracing, in that the client causes it to trace all requests being made, but that's arguably OK. On the server, though, you want to scope it to one request, not all the requests being handled by the server. It's unclear that context can be made to work here.

It sounds like there's not a concrete proposal here at this point and that someone needs to write a design doc that incorporates the feedback and feature discussion above.

Or we can move this issue out of the proposal process and leave it open as a regular issue until someone does want to write a proposal.

joeshaw commented 7 years ago

I think it makes sense to move it out of the proposal process for now, since my original proposal wasn't well-liked.

Using httptrace sounds like a better way to go, but I think more prototyping needs to be done before it's ready to be a formal proposal.

tombergan commented 7 years ago

On the client, we can trace a single request like this:

req = req.WithContext(httptrace.WithClientTrace(req.Context(), &httptrace.ClientTrace{...}))
transport.RoundTrip(req)

On the server, we already create a context for every request. We could add a server hook into that context creation point so that callers can inject a trace object. Something like:

packge http
type Server struct {
  ...
  // A better name is needed.
  UpdateRequestContext func(ctx context.Context) context.Context
}

// server.go:972
ctx, cancelCtx := context.WithCancel(ctx)
if c.server.UpdateRequestContext != nil {
  ctx = c.server.UpdateRequestContext(ctx)
}

package main
...
&http.Server{
  UpdateRequestContext: func(ctx context.Context) context.Context {
    return httptrace.WithServerTrace(req.Context(), &httptrace.ServerTrace{...}))
  },
}

That feels like a direct mirror of how httptrace is used on the client. In practice we may need to move the context creation point into readRequest so that httptrace can capture errors reading the request, but that's an implementation detail.

As a side-effect, this change would allow callers to arbitrarily modify the ctx used for each request. I haven't thought deeply about whether or not this is a bad idea. In practice, I don't imagine it would be useful beyond httptrace, although I'm sure it could be abused.

I also haven't thought much about which methods we'd like on httptrace.ServerTrace -- that's a good question for @joeshaw and others who are hoping to use this feature.

bradfitz commented 7 years ago

As a side-effect, this change would allow callers to arbitrarily modify the ctx used for each request.

There was actually a very similar proposal to add such a set-the-initial-context hook earlier, but it didn't go in because we weren't quite sure what the use case would be. But this seems like a good use case.

joeshaw commented 7 years ago

Maybe we should limit the ability to modify the context by doing that behind the scenes in the server code, and have the hook on the http.Server simply return an httptrace.ServerTrace value.

type Server struct {
    // ...
    ServerTrace() *httptrace.ServerTrace
}

// server.go
ctx, cancelCtx := context.WithCancel(ctx)
if c.server.ServerTrace != nil {
    ctx = httptrace.WithServerTrace(ctx, c.server.ServerTrace()) // do we even need to export WithServerTrace?
}

And inside your handler you could still extract the trace from req.Context with a helper function in httptrace if you needed it.

Edit: oh, heh, I see that was your recommendation above, Tom. I think I like the idea of setting a trace more than setting a context value. It's more explicit and doesn't preclude the ability to set an initial context in a later revision.

tombergan commented 7 years ago

I suggested that earlier, but forgot that http cannot import httptrace, so that doesn't work.

tmthrgd commented 7 years ago

The canonical wrapper mentioned by @bradfitz in https://github.com/golang/go/issues/18997#issuecomment-279522713 is actually impossible to do reliably. Not only do you need to consider a host of different combinations, but you also need to check whether the ResponseWriter that you want to wrap already implements those methods. I've had to wrap ResponseWriter's five times so far and the logic can vary depending on the different cases, although there is certainly some overlap. See: 1, 2, 3, 4, and 5. @joeshaw if you restrict yourself to the standard library, the number of combinations isn't too bad; it does, somewhat, scale.

The idea of a httptrace.ServerTrace sounds like a really good solution to this. As for what fields it should capture, the only information I want, so far, is the status code and the response size.

joeshaw commented 7 years ago

How should we best track response size? I don't think we want to expose every write to a ServerTrace call, so maybe it should be gathered up inside the net/http machinery instead.

tombergan commented 7 years ago

I don't think we want to expose every write to a ServerTrace call

Why not? What if someone wants to scan the response body is some way other than computing the size, such as computing a hash of the response body or tee'ing the response body to a log? Hooking each Write call seems like the most straightforward way to achieve this although I'd be open to other ideas. There should be nearly zero overhead when a Write hook is not installed (just a branch) or very low overhead when a hook is installed (branch + indirect function call).

How about something like:

type ServerTrace struct {
  // Received a bad request (e.g., see errToLarge in net/http/server.go).
  // The ServeHTTP handler will not be called.
  // BadRequestInfo has the status code of the response (the current implementation
  // can return 431 or 400) and perhaps also the response body, which is an error string.
  // This addresses https://github.com/golang/go/issues/18095
  GotBadRequest(BadRequestInfo)

  // Called when receiving a request, just before calling the ServeHTTP handler.
  // RequestInfo would likely include the URL and Headers of the request (with caveats
  // about not mutating those values).
  // This would satisfy https://github.com/golang/go/issues/3344 -- see the linked camlistore code.
  GotRequest(RequestInfo)

  // Called when the handler calls WriteHeader.
  // WriteHeaderInfo includes the status and maybe also the headers (with caveats about
  // not mutating the headers). Or perhaps this is (status, headers) instead of WroteHeaderInfo.
  // This addresses the current bug.
  WroteHeader(WroteHeaderInfo)

  // Called each time the handler calls Write. This is the data fed to the ResponseWriter,
  // e.g., before any transfer encoding. Includes the return values of the Write call.
  // Caveats about mutating data.
  // This addresses the current bug.
  WroteBodyChunk(data []byte, n int, err error)

  // TODO: WroteTrailers?

  // Called when the ServeHTTP handler exits.
  HandlerDone(panicked bool /* maybe? or part of HandlerDoneInfo? */)
}

tombergan commented 7 years ago

I think we're converging on an actual proposal. https://github.com/golang/go/issues/18997#issuecomment-292039169 proposes how to create a ServeTrace object for each request received by the server, and https://github.com/golang/go/issues/18997#issuecomment-296722404 proposes a set of hooks to resolve this bug and two others.

@bradfitz, do you want a more formal proposal or are the above two comments sufficient? I think @joeshaw volunteered to help with the implementation.

bradfitz commented 7 years ago

SGTM at a high level. I might tweak certain names/docs/signatures, but not much. We can discuss that during the code review too.

nvartolomei commented 7 years ago

👍

https://github.com/felixge/httpsnoop/blob/master/wrap_generated_gteq_1.8.go https://github.com/prometheus/client_golang/blob/master/prometheus/promhttp/delegator.go

This needs to stop.

tombergan commented 7 years ago

@joeshaw Are you still interested in helping with the implementation? We could start reviewing the implementation at any time, although we shouldn't submit anything until the Go 1.10 window opens (in August I believe).

joeshaw commented 7 years ago

@tombergan I am at GopherCon this week and hoping to spend some time on this.

joeshaw commented 7 years ago

http cannot import httptrace, so that doesn't work.

Why not? Other than example_test.go, nothing in httptrace imports http so a circular dependency isn't an issue.

I could see us wanting to make the http.Request (perhaps a copy of it?) available in the GotRequest trace hook, though.

bradfitz commented 7 years ago

net/http already imports httptrace.

joeshaw commented 7 years ago

Here's a sketch of how you might use it in practice:

type requestInfo struct {
    start time.Time
    path string
    statusCode int
    h hash.Hash
}

s := &http.Server{
    UpdateRequestContext: traceRequest,
}

func traceRequest(ctx context.Context) context.Context {
    info := requestInfo{
        h: sha1.New()
    }

    trace := &httptrace.ServerTrace{
        GotRequest: func(gri httptrace.GotRequestInfo) {
            info.start = time.Now()
            info.path = gri.URL.Path
        },
        WroteHeader: func(whi httptrace.WroteHeaderInfo) {
            info.statusCode = whi.StatusCode
        },
        WroteBodyChunk: func(data []byte, n int, err error) {
            if err != nil {
                // ???
            }

            info.h.Write(data)
        },
        HandlerDone: func(hdi HandlerDoneInfo) {
            log.Printf(
                "path=%s status=%d duration=%s hash=%x",
                info.path,
                info.status,
                time.Since(info.start),
                info.h.Sum(nil),
            )
        },
    }

    return httptrace.WithServerTrace(ctx, trace)
}

I'm still not 100% convinced that tying this in with an UpdateRequestContext function is the right way to go, especially if we can't come up with other circumstances to change the request context. A ServerTrace-specific function on the Server seems cleaner to me.

nvartolomei commented 7 years ago

For my specific use-case(s), the trace would be added by a middleware. Actually, multiple tracing hooks by different middlewares.

Unfortunately this will leave outside GotBadRequest and GotRequest hooks proposed by @tombergan.

joeshaw commented 7 years ago

After discussing use cases last night, I've flip-flopped on using context for that. And that's because you might want to inject things into the context and use them both in the server trace hooks and middleware/handlers. Building on my example from earlier:

type requestInfo struct {
    start time.Time
    path string
    statusCode int
    h hash.Hash
    size int64
}

s := &http.Server{
    UpdateRequestContext: traceRequest,
}

func traceRequest(ctx context.Context) context.Context {
    reqID := generateRequestID()
    ctx = withRequestID(ctx, reqID)

    info := requestInfo{
        h: sha1.New()
    }

    trace := &httptrace.ServerTrace{
        GotRequest: func(gri httptrace.GotRequestInfo) {
            info.start = time.Now()
            info.path = gri.URL.Path
        },
        WroteHeader: func(whi httptrace.WroteHeaderInfo) {
            info.statusCode = whi.StatusCode
        },
        WroteBodyChunk: func(data []byte, n int, err error) {
            if err != nil {
                // ???
            }

            info.h.Write(data)
            info.size += n
        },
        HandlerDone: func(hdi httptrace.HandlerDoneInfo) {
            log.Printf(
                "req-id=%s path=%s status=%d duration=%s size=%d hash=%x",
                reqID,
                info.path,
                info.status,
                time.Since(info.start),
                info.size,
                info.h.Sum(nil),
            )
        },
    }

    return httptrace.WithServerTrace(ctx, trace)
}

This way the request ID is not only logged, but it's passed into the context so handlers can extract it and log it, pass it to other services, etc.

joeshaw commented 7 years ago

For my specific use-case(s), the trace would be added by a middleware. Actually, multiple tracing hooks by different middlewares.

Could you elaborate a little more on this? Are you thinking that the trace would be added in the execution of middleware? Or that setting up the middleware would also attach something to the server?

The reason why I ask is that this hook function is on the http.Server and called only once per request -- it won't be possible to attach to it in the middle of middleware execution. For things that the middleware might want to attach, using the provided http.Request WithContext() method is a better approach.

tombergan commented 7 years ago

http cannot import httptrace, so that doesn't work. Why not? Other than example_test.go, nothing in httptrace imports http so a circular dependency isn't an issue.

You're right. I have no idea what I was thinking. Looking forward to the CL!

pteichman commented 7 years ago

Now that we're looking at a partial implementation, I'm not sure that getting all three of data []byte, n int, err error is the right move for WroteBodyChunk. I think that getting both len(byte) and n is confusing in cases where they differ, and it makes it harder to correctly write the two use cases we're considering:

1) Compute a hash of the response body (would need to slice byte[:n]). 2) Calculate the response length (would be a mistake to look at byte).

So maybe what WroteBodyChunk should receive is the slice of bytes that was successfully written.

That made me wonder whether it also needs to get the write error, especially since that error is already being delivered to the application. I feel like it's really weird to get the same error twice. Countering that, @joeshaw suggests you might want to (as one example) do something different with the response hash if the connection was closed by the client.

joeshaw commented 7 years ago

That made me wonder whether it also needs to get the write error, especially since that error is already being delivered to the application. I feel like it's really weird to get the same error twice. Countering that, @joeshaw suggests you might want to (as one example) do something different with the response hash if the connection was closed by the client.

Just to clarify, I think checking for an error is helpful in that you could, for example, not log a confusing partial hash value. But I think the most value in getting the write error is to give one the ability to log it in the context of the rest of the request info.

pteichman commented 7 years ago

Panicked is also really appealing in HandlerDone but given that the handler recover() is scoped to the whole connection it doesn't really make sense unless we add a recover() to each ServeHTTP call.. And the application doesn't need any trace support to do that already.

tombergan commented 7 years ago

So maybe what WroteBodyChunk should receive is the slice of bytes that was successfully written.

This SGTM.

given that the handler recover() is scoped to the whole connection

This is not entirely true. In HTTP/2, the recover() wraps ServeHTTP: https://github.com/golang/net/blob/master/http2/server.go#L2037

In HTTP/1, recover() does wrap the connection, but only because of the nature of HTTP/1 -- it's impossible to reuse a connection in HTTP/1 unless the prior request/response was completed exactly, but after a panic, we don't know the state of the connection and must assume it cannot be reused. https://github.com/golang/go/blob/master/src/net/http/server.go#L1690

That said, the godoc for Handler actually says that a panic will hang up the connection, which is not what's implemented in HTTP/2. Perhaps a godoc update is in order? https://golang.org/pkg/net/http/#Handler

/cc @bradfitz

bradfitz commented 7 years ago

Sent you CL 48551 for doc update.

gopherbot commented 7 years ago

CL https://golang.org/cl/48551 mentions this issue.

rhysh commented 7 years ago

I have a counterproposal that might resolve some of the root issues.

It sounds like the necessary information can be obtained by wrapping the http.ResponseWriter .. the problem is that wrapping http.ResponseWriter implementations is impractical because new features are exposed by adding methods to the underlying value, so a particular http.ResponseWriter might additionally implement http.Flusher or http.Pusher. A wrapper would need to implement exactly the right set of methods at compile time, since the success or failure of the type assertion is often the only way to know if a particular feature (like flushing) is supported. Furthermore, these wrappers need to be kept up to date with changes to Go to support new features (like http/2 push).

My proposal is to add one last method, which if present will return a struct. The struct would have a field for each optional feature, which could be left nil or set to a function value.

package http // import "net/http"

type ResponseOptionser interface {
  ResponseOptions() *ResponseOptions
}

type ResponseOptions struct {
  CloseNotify func() <-chan bool
  Flush       func()
  Hijack      func() (net.Conn, *bufio.ReadWriter, error)
  Push        func(target string, opts *PushOptions) error
  WriterTo    io.WriterTo
}

This would not allow adding new behaviors outside of net/http, since each feature would need a field on this struct. Is it common to add behaviors in non-core packages?

An alternative that allows for that is to attach the new behaviors as Values on the http.Request's Context, in the same way that *http.Server is currently available:

func myHandler(w http.ResponseWriter, r *http.Request) {
  f, ok := r.Context().Value(http.FlusherContextKey).(http.Flusher)
  if ok {
    f.Flush()
  }
}

Experience report:

I maintain a closed-source package for distributed request tracing that works for Go http clients and servers. I've implemented it as a http.Handler wrapper that in turn wraps the http.ResponseWriter.

It reads some of the inbound http headers to get the request id and places that in the http.Request's Context. When calling the underlying http.Handler, it passes a copy of the *http.Request that has a Context including some additional Values. As soon as its ServeHTTP method is called, it checks the current time and emits an instrumentation event including the timestamp, request id, and portions of the url and headers. When an http.Handler panics, it prints the stack trace with the request id prefixed to each line. On a call to WriteHeader (or the first call to Write), it emits an instrumentation event including the status code. It also counts the number of bytes sent in the response (measured by the arguments to calls to Write).

It used to do a bunch more, before Go 1.7 included context and before Go 1.8 fixed the lifespan of the Context to match the request (vs matching the lifespan of the call to ServeHTTP).

joeshaw commented 7 years ago

@rhysh This is an interesting proposal. There are a few things I would tweak about it, like returning interface values for the individual items in the ResponseOptions struct. (ie, Flush func() would be Flusher http.Flusher, but in both cases the values would be nil if the interface weren't implemented.)

There are a few things I like about it:

It makes it explicit what interfaces the ResponseWriter implementation supports.
ResponseWriter wrappers need only to implement the ResponseWriter and ResponseOptionser interfaces.

The things I am not sure how I feel about:

It does not address the very widespread use of ResponseWriter wrappers for common problems like storing status codes, calculating response sizes, or handler durations. (More on this below.)
It effectively deprecates type asserting the ResponseWriter to other interfaces, except ResponseOptionser. In other words, in a handler it's no longer sufficient to do:

if p, ok := rw.(http.Pusher); ok {
    p.Push(...)
}

which is a very standard, idiomatic thing to do. (It would still succeed much of the time, since the underlying private ResponseWriter implementation would support it, but wrappers would not.)

Instead, you have to do:

if ro, ok := rw.(http.ResponseOptionser); ok {
    if ro.ResponseOptions().Pusher != nil {
        ro.ResponseOptions().Pusher.Push(...)
    }
}

which is a little longer and less natural.

Back to the proliferation of ResponseWriter wrappers. Is this a real problem?

It's likely impossible to eliminate the need for ResponseWriter wrappers entirely in all cases. But the vast majority of the duplication out there are to track status codes and, to a lesser extent, things like timings and response body sizes. Perhaps the standard library should provide these in a new State method and interface as @bradfitz first suggested?

This counterproposal doesn't address the proliferation of wrappers, whereas the ServerTrace one likely would.

Unless people are vehemently opposed to this, I may take a stab at it.

joeshaw commented 7 years ago

Back to the ServerTrace implementation, the separation of http and httptrace packages is really making the usability of ServerTrace ugly because of circular dependencies. GotRequest would really benefit from having access to http.Request, but instead most of the fields on it have to duplicated. Same for http.Header. It makes me worried about the long-term viability of doing it this way.

To see @pteichman's and my progress on ServerTrace thus far: https://github.com/pteichman/go/tree/servertrace

Merovius commented 7 years ago

An alternative would be, to enable reflect to dynamically create methods (via #15924 and #16522). That way, people who want to create wrappers with optional interfaces could do so, e.g. by

func wrap(res http.ResponseWriter, statusCallback func(int)) http.ResponseWriter {
    st := reflect.StructOf{
        reflect.StructField{
            Type: reflect.TypeOf(res),
            Anonymous: true,
        },
    }
    nt := reflect.NamedOf(st, "wrapped", []reflect.Method{
        Name: "WriteHeader",
        Type: reflect.TypeOf(func(interface{}, int) {}),
        Func: reflect.ValueOf(func(_ interface{}, code int) { statusCallback(code) }),
    }
    v := nt.New()
    v.Field(0).Set(reflect.ValueOf(st))
    return v.Interface().(http.ResponseWriter)
}

Not terribly ergonomic (and I papered over the detail of how NamedOf would work and what the argument of that function would be, given that the struct-type is dynamically created); but it would work.

tombergan commented 7 years ago

Stepping back a bit, the reason we originally added httptrace.ClientTrace was to expose info that could not be tracked by simply wrapping a RoundTripper. If we focus solely on better ways to wrap a ResponseWriter in this issue, then we lose sight of exposing info that is not available to ResponseWriter. For example, see #18095, which was dup'd into this issue. That issue is about logging bad requests, which we cannot do with ResponseWriter wrappers because bad requests are never forwarded to request handlers in the first place. Further, we might imagine adding HTTP/2 specific hooks in the future, or hooks to get the exact headers written on the wire, which can change slightly compared to what the caller passes to WriteHeader.

There is also #16220, which was not very well motivated in isolation, but would be fixed as a side-effect of adding UpdateRequestContext.

There is also #20956, which is yet another example of people trying to get at the underlying net.Conn (especially see this comment).

It used to do a bunch more, before Go 1.7 included context and before Go 1.8 fixed the lifespan of the Context to match the request (vs matching the lifespan of the call to ServeHTTP).

@rhysh, can you say more about this? It sounds like you're currently unable to do something that you'd like to do as a result of the Context change.

But the vast majority of the duplication out there are to track status codes and, to a lesser extent, things like timings and response body sizes. Perhaps the standard library should provide these in a new State method and interface as @bradfitz first suggested?

I am strongly opposed to this. There is no way the standard library will cover all kinds of stats that every user wants to track. Rather than getting ourselves into the business of tracking stats, we should provide APIs that allow users to track whatever stats they need.

rhysh commented 7 years ago

can you say more about this? It sounds like you're currently unable to do something that you'd like to do as a result of the Context change.

On the contrary: Many of the features that had to be implemented from scratch by that internal package are now done by the standard library. For instance it used to use http.CloseNotifier to learn that the client had disconnected prematurely; as of Go 1.8, the request-scoped Context now provides that signal.

As of Go 1.8, Context in net/http has been working well. The focus of the internal package has narrowed, thanks to improvements in the standard library.

httptrace provides a lot of information about the request lifecycle and very little of it overlaps with information available to http.RoundTripper implementations.

The partial overlap I've encountered is that my internal package would additionally wrap the request body to learn when the full request had been written. This forced chunked transfers, which unfortunately didn't work with HTTP/1.0 servers, but otherwise worked well enough. After the addition of httptrace, I was able to use the WroteRequest hook to get that signal.

Because there's so little overlap, what are the problem statements that this issue should address? Here are some possibilities:

It's not possible [for *http.Server / http.Handler code] to learn about invalid/malformed requests
It's not possible to measure how long it took to read the request headers, from first byte to full headers received
It's not possible to measure how long a client's connection was idle before it sent the first byte of headers, or to determine if the connection was reused from a previous (or concurrent http/2) request.
Measuring the status code returned by a handler is inconvenient
Measuring the wall clock time at which the status code was set is inconvenient
Writing code to measure the status code returned by a handler is likely to break features like http/2 push or sendfile (for http/1.1).

The first three seem like they'd be addressed well with a server-side equivalent to the current httptrace.ClientTrace hooks. The later three can be done by wrapping http.ResponseWriter values as middleware, but it's hard currently too hard to do well.

golang / go