Open joeshaw opened 7 years ago
That is a nice summary. I think all six of your listed use cases could be handled by httptrace.ServerTrace
. For 4 & 5, we'd pass through the status code (and possibly headers) from the ResponseWriter
to the ServerTrace
. 6 should "just work" since ServerTrace
doesn't wrap the ResponseWriter
.
I think all six of your listed use cases could be handled by
httptrace.ServerTrace
It could do any of them. Where should it focus? Where should the discussion focus?
httptrace.ClientTrace
doesn't allow getting the status code; that's trivial to do in a http.RoundTripper
. But it also doesn't allow getting the sizes of the request body or the response body—both possible, but inconvenient, to get from a http.RoundTripper
wrapper.
ClientTrace
focused on giving access to net/http internals that it alone could provide.
In the current implementation draft of ServerTrace
, the GotBadRequest
hook gives access to information that would not otherwise be available to http.Handler
authors. This looks like a good way to give access to that information.
The WroteHeader
and WroteBodyChunk
callbacks give access to data that http.Handler
authors already have. It's not very convenient access because of the pattern of adding secret bonus methods on http.ResponseWriter
s, but it's access.
Should ServerTrace
have a callback for when Flush
is called and for when http/2 push is used? If sendfile(2)
is activated, should a hook be called at the beginning/middle/end of that transfer?
I think it would be helpful to split the discussion into 1) information present in the internals of net/http that is not exposed to users, and 2) information that users have, but need better access to.
Use of the http.Server.UpdateRequestContext
field looks clunky. It's called before the http.Request
is available, meaning that users would need to coordinate between the code that runs during that hook and a top-level http.Handler
in order to match the ServerTrace
hooks to an http.Request
.
Since it's attached to the http.Server
, it can't be used by middlewares. It's set once at the top level by the owner of the service, making it inaccessible to the maintainers of middleware packages. They're likely to continue wrapping http.ResponseWriters
, since that would be the only way the code they control is able to access the data they need.
I have a proposal for an alternate design, if we split the functionality more or less into the two types of data I identified above:
1) A new field on http.Server
points to a struct that contains the GotBadRequest
hook (and maybe some others). Consider this logically as a structured extension of the http.Server.ErrorLog
field.
2) A function in net/http that takes the Context
value from a http.Request
(or a *http.Request
directly) and something like a httptrace.ServerTrace
struct, registering it for callbacks as the request is served. Any layer of http.Handler
middleware could call the function, just as any layer of http.RoundTripper
middleware can with ClientTrace
.
(The function wouldn't return a Context
like the ClientTrace
equivalent does; the http.Server
implementation would use a private context key to store and access a function of type func(httptrace.ServerTrace)
. The net/http package's function would extract and call it.)
[
UpdateRequestContext
is] called before the http.Request is available, meaning that users would need to coordinate between the code that runs during that hook and a top-level http.Handler in order to match the ServerTrace hooks to an http.Request.
Not sure I followed? The hooks will be available via http.Request.Context()
. A pattern like the following should work:
s := &http.Server{
UpdateRequestContext: func(ctx context.Context) context.Context {
sh := newServerHooks() // a user-defined struct
ctx = context.WithValue(ctx, key, sh)
return httptrace.WithServerTrace(ctx, &httptrace.ServerTrace{
GotRequest: sh.GotRequest,
...
})
}
The handler can then use ctx.Value(key)
to extract sh
.
Since [UpdateRequestContext is] attached to the http.Server, it can't be used by middlewares. It's set once at the top level by the owner of the service, making it inaccessible to the maintainers of middleware packages.
The middleware needs some entry point in order to apply its http.ResponseWriter
wrapper. Couldn't the middleware use a similar entry point to install its UpdateRequestContext
wrapper? I am admittedly not familiar with many such libraries (aside from the few I've written myself for internal use), so I am very interested in hearing about libraries where UpdateRequestContext
will be difficult to use.
A function in net/http that takes the Context value from a http.Request ...
Can you explain this with a pseudocode example? I did not follow.
A new field on http.Server points to a struct that contains the GotBadRequest hook ... I think it would be helpful to split the discussion into 1) information present in the internals of net/http that is not exposed to users, and 2) information that users have, but need better access to.
That is a useful distinction for this discussion, however, my main concern is API clutter. I would like to minimize the API surface and avoid cluttering http.Server with callback fields, if possible. I fear we are heading towards API clutter unless the new functions/callbacks/fields we add are sufficiently general.
I am promoting two APIs: (1) Wrapping http.ResponseWriter. This will be the canonical way to affect how the response is produced. (2) UpdateRequestContext. This will be the canonical way to get a read-only view of each request for logging. There may need to be a long list of read-only hooks, but at least those hooks will be hidden in net/http/httptrace rather than cluttered in net/http.
Considering the case where a user wants to log the status code of all responses, here are three possible approaches:
1) No changes to net/http
. Logging package authors wrap the http.ResponseWriter
, providing a function with signature func(inner http.Handler) http.Handler
. Application owners can call it any time they have a http.Handler
, sometime before setting their http.Server.Handler
field.
2) Add a field to http.Server
, UpdateRequestContext
. Logging package authors write two pieces of code: their usual middleware (to access the URI path from the http.Request
), and a function to add to the UpdateRequestContext
field (to access the status code). Application owners call one of the functions from their http.Handler
, and the other where they set up the http.Server
(which may be in a different file or package).
3) Add a function in http
designed to be called from an http.Handler
. Logging package authors convert their http.ResponseWriter
wrappers to use the new function with no change in the API they provide to users. Applications owners can call it any time they have a http.Handler
, sometime before setting their http.Server.Handler
field.
The shortcoming of (1) is that it blocks access to additional methods on the http.ResponseWriter
. The shortcoming of (2) is that it requires users to coordinate between two different parts of the net/http
API. Users who only have access to an http.Handler
cannot use it (and would be forced to use the existing option 1).
Code for each follows. Does example 2 match the API design you have in mind? Can you give a more concise demonstration of correct use of the new field?
1) No changes to net/http
:
type statusRecorder struct {
status int
w http.ResponseWriter
}
func (srw *statusRecorder) Header() Header {
return srw.w.Header()
}
func (srw *statusRecorder) Write(p []byte) (int, error) {
if srw.status == 0 {
srw.status = http.StatusOK
}
return srw.w.Write(p)
}
func (srw *statusRecorder) WriteHeader(status int) {
srw.status = status
srw.w.WriteHeader(status)
}
func logStatusMiddleware_ResponseWriter(inner http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
srw := &statusRecorder{w: w}
defer func() {
log.Printf("path=%q status=%d", r.URL.Path, srw.status)
}()
inner.ServeHTTP(srw, r)
})
}
2) Add a field to http.Server
, UpdateRequestContext
:
func logStatusMiddleware_AndServerUpdate(inner http.Handler, srv *http.Server) http.Handler {
ctxKey := new(int)
prevFn := srv.UpdateRequestContext
newFn := func(ctx context.Context) context.Context {
if prevFn != nil {
ctx = prevFn(ctx)
}
var (
mu sync.Mutex
info httptrace.WroteHeaderInfo
)
ctx = httptrace.WithServerTrace(ctx, &httptrace.ServerTrace{
WroteHeader: func(i httptrace.WroteHeaderInfo) {
mu.Lock()
defer mu.Unlock()
info = i
},
})
ctx = context.WithValue(ctx, ctxKey, func() httptrace.WroteHeaderInfo {
mu.Lock()
defer mu.Unlock()
return info
})
}
srv.UpdateRequestContext = newFn
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
defer func() {
info := r.Context().Value(ctxKey).(func() httptrace.WroteHeaderInfo)()
log.Printf("path=%q status=%d", r.URL.Path, info.StatusCode)
}()
inner.ServeHTTP(w, r)
})
}
3) Add a function in http
designed to be called from an http.Handler
:
func logStatusMiddleware(inner http.Handler) http.Handler {
return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
http.AddServerTrace(r.Context(), &httptrace.ServerTrace{
WroteHeader: func(info httptrace.WroteHeaderInfo) {
log.Printf("path=%q status=%d", r.URL.Path, info.StatusCode)
},
})
inner.ServeHTTP(w, r)
})
}
If I'm reading this correctly, ServerTrace proposal doesn't allow modification of the response body prior to writing. Therefore it will not allow things like compressing the response body prior to it being sent, for example.
Presumably, the ServerTrace struct will need to define another handler: WillWriteBody(body []byte, header http.Header) []byte
Such a handler would need to receive the body before it's being written, as well as the headers, if there's a need to modify them. And it should return the new body for actual writing. I assume that the request headers are already somehow available from within that handler already, since they also usually play a part here.
You make a good point @urandom , that packages like https://github.com/NYTimes/gziphandler are currently implemented by wrapping the ResponseWriter with special behavior for Write and Flush. If packages like that one continue to wrap the ResponseWriter, new features exposed as methods will continue to be shadowed.
That package also has special behavior for the Push method: it ensures that the synthetic request will have the "Accept-Encoding: gzip" header. And, the handler disables connection hijacking. It needs to intercept several methods in order to change their behavior.
Will the ServerTrace struct expand to allow modifying all additional ResponseWriter methods? When new behaviors are added to the ResponseWriter, will they be added to the ServerTrace struct as well? That seems like a lot of API duplication.
Looks like this didn't happen before the Go 1.10 freeze. Bumping to Go 1.11.
Another data point of a package which has to deal with the combinatorial explosion of optional interfaces: https://github.com/prometheus/client_golang/blob/1cdba8fdde81f444626530cce66e138d2f68ae1c/prometheus/promhttp/delegator.go#L100
I have an alternative proposal for avoiding the combinatorial explosion of optional interfaces. It is inspired by the causer interface from github.com/pkg/errors (interface{ Cause() error }
). The stdlib documents that if you wrap a http.ResponseWriter
you should also implement this interface
type Wrapper interface {
Wrap() http.ResponseWriter
}
Then for example, if we want to get at the http.Hijacker
we can use code like this:
func GetHijacker(w http.ResponseWriter) http.Hijacker {
for w != nil {
if h, ok := w.(http.Hijacker); ok {
return h
}
wrapper, ok := w.(http.Wrapper)
if !ok {
break
}
w = wrapper.Wrap()
}
return nil
}
This doesn't solve the problem of getting at additional server event data that ServerTrace
would, but would help with the original problem proposed in this issue and the issues with wrapping ResponseWriter
in general.
Problem
A very common pattern in Go HTTP servers is to implement an
http.ResponseWriter
that wraps anotherhttp.ResponseWriter
and captures the status code. This is often used for logging and metrics collection.For example,
I've written something like this a bunch of times. You can find examples in nearly every Go web framework out there. One example is https://github.com/urfave/negroni/blob/master/response_writer.go#L13-L26.
There are some issues with this approach. For instance, my
statusCaptureWriter
doesn't implement other interfaces likehttp.Flusher
,http.CloseNotifier
orhttp.Pusher
. I can't determine at compile time whether the underlyinghttp.ResponseWriter
implementation implements any of these interfaces, so if I choose to implement them I might lie to callers at higher levels of the stack and inadvertently break things. (This particularly a problem with CloseNotifier.)Proposal (rejected, see below)
I'd like to propose an additional interface,
http.Statuser
(better name welcome) that exposes the status code within ahttp.ResponseWriter
implementation. The internalhttp.(*response)
implementation already tracks the status code written, so this can just be exposed and it will automatically implement this interface.Software could avoid wrapping the
http.ResponseWriter
by instead type asserting it tohttp.Statuser
and getting the status as needed there. (And it could optionally continue to wrap the ResponseWriter as needed until this is widely deployed.)Alternative proposal
Implement an
httptrace.ServerTrace
struct that is analogous to theClientTrace
already there. See https://github.com/golang/go/issues/18997#issuecomment-279611928 for more info.