go-chi / chi

lightweight, idiomatic and composable router for building Go HTTP services
https://go-chi.io
MIT License
18.39k stars 983 forks source link

HEAD for GET routes #238

Closed alehano closed 7 years ago

alehano commented 7 years ago

Standard go http router allows to do HEAD requests to GET endpoints and return the same response but without body. Chi response 405 for GET routes with HEAD method.

pkieltyka commented 7 years ago

Make sure to specify .Get() and .Head() requests separately. 405 measures method is not allowed, which leads me to think you didnt define it.

alehano commented 7 years ago

Yes, I can set .Get() and .Head() method for the same route. But It's duplicate code. Maybe you can add .GetHead() method to set GET and HEAD requests for the same route or something like this? But for me it's better to respond for HEAD requests in .Get() directly. Web crawlers and robots may do HEAD requests, so I want always to respond, instead they can think my page is broken.

According to HTTP/1.1 RFC:

9.4 HEAD The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.

vektah commented 7 years ago

A way to automatically respond to a HEAD request using the GET handler if a HEAD handler hasn't been provided would be nice :+1:

pkieltyka commented 7 years ago

@alehano I'll give this some more thought. I could add .GetHead() but, if they both point to the same handler then it will be up to the handler to ensure there is no message body returned based on r.Method == "HEAD" in the handler.

I could set an EmptyBodyHandler set as the default for a HEAD in that situation, unless overridden with a .Head(). The changes to logic is why chi settles to do less and leaves it to the developer's choice.

My one thought is to offer chi.Options{} as an argument to NewRouter() to set such a behaviour, but it goes against my wishes to keep chi super lean and small.

alehano commented 7 years ago

I love that chi lean and small. Now I made a helper:

func GetHead(r chi.Router, pattern string, h http.HandlerFunc) {
    r.Get(pattern, h)
    r.Head(pattern, h)
}

And it works as expected. Go http decide return body or not, based on method. Bit it's doesn't looks nice, as well as adding .GetHead() to the chi. Why not just make .Get() method like this:

func (mx *Mux) Get(pattern string, handlerFn http.HandlerFunc) {
    mx.handle(mGET, pattern, handlerFn)
    mx.handle(mHEAD, pattern, handlerFn)
}

Maybe with checking if Head() not already added.

pkieltyka commented 7 years ago

@alehano it doesnt feel right to me to have GET and HEAD do the same thing as a default. But if others feel that is how it should work, I can make a concession for sure and add that line.

alehano commented 7 years ago

@pkieltyka Ok, I also want to hear other's thoughts. For me, I don't see useful cases to have separate .Head() method. Because it always must be the same as GET, but without body.

Btw this how Go http package handle this: https://github.com/golang/gofrontend/blob/master/libgo/go/net/http/transfer.go#L226 https://github.com/golang/gofrontend/blob/master/libgo/go/net/http/transfer.go#L105 https://github.com/golang/gofrontend/blob/master/libgo/go/net/http/transfer.go#L110

func noResponseBodyExpected(requestMethod string) bool {
    return requestMethod == "HEAD"
}

// ...
t.ResponseToHEAD = noResponseBodyExpected(t.Method)
// ...

if t.ResponseToHEAD {
    t.Body = nil
    if chunked(t.TransferEncoding) {
    t.ContentLength = -1
    }
} 
c2h5oh commented 7 years ago

@pkieltyka he has a point:

https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html

9.4 HEAD

The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response. The metainformation contained in the HTTP headers in response to a HEAD request SHOULD be identical to the information sent in response to a GET request. This method can be used for obtaining metainformation about the entity implied by the request without transferring the entity-body itself. This method is often used for testing hypertext links for validity, accessibility, and recent modification.

The response to a HEAD request MAY be cacheable in the sense that the information contained in the response MAY be used to update a previously cached entity from that resource. If the new field values indicate that the cached entity differs from the current entity (as would be indicated by a change in Content-Length, Content-MD5, ETag or Last-Modified), then the cache MUST treat the cache entry as stale.

emphasis mine.

DmitriyMV commented 7 years ago

IIRC currently neither gorilla/mux nor julienschmidt/httprouter do that, so the implicit conversion from GET to HEAD doesn't look THAT useful. I may be wrong about that and in no way state that this shouldn't be done. I'm just a bit concerned about additional work that app may need to do, and which is going to be wasted.

The standard mux analogy is interesting, but Go default mux have several pain points, so I wouldn't call it a great example.

alehano commented 7 years ago

One more reason of uselessness separate .Head(). For example, you want to add handler to .Head() for adding some Header values. But in this case it'll remove Content-Length header. So it's contradicts to RFC. Or you have to calculate Content-Length of correspondent GET method, which is unlikely somebody will do it.

DmitriyMV commented 7 years ago

@alehano Not exactly - https://stackoverflow.com/questions/3854842/content-length-header-with-head-requests

If our GET method result is dynamically generated - this clause can be averted.

DmitriyMV commented 7 years ago

Well - in case our GET request handler has to fetch data from remote sources (such as DB, storages, API) - our "implicit" HEAD handler which will be actually a GET handler and it will be forced to get and wait the required data. And then discard it.

This is also concerning when we have cache system with complex rules - I can imagine the scenario when the cache gets invalidated two times - on HEAD and then on GET.

VojtechVitek commented 7 years ago

If we do the automatic HEAD response, should we also think of having something similar for OPTIONS?

pkieltyka commented 7 years ago

@DmitriyMV if we do implicitly set Head() in Get(), I would still allow users to override the Head handler by another explicit call to .Head() after the .Get(), so that should take care of it.

vektah commented 7 years ago

Rather than get setting head, I was thinking more along the lines of a HEAD request falling through to get at request time:

Additionally when a HEAD request is made to a GET handler, immediately after the headers are sent the context could be cancelled, avoiding external work and database calls where its supported.

DmitriyMV commented 7 years ago

Please, NO implicit settings. If I want this behavior I will enable it explicitly. Adding new method is better than redefining behavior of the existing ones.

I would also like to point out, that HEAD lookup time will increase depending on the size of existing GET routes.

vektah commented 7 years ago

Please, NO implicit settings. If I want this behavior I will enable it explicitly.

What about something like

router.MethodNotAllowedHandler = chi.PromoteHeadToGet

I would also like to point out, that HEAD lookup time will increase depending on the size of existing GET routes.

I'm not sure that's true, last I looked the same tree node had all methods on it. Admittedly its changed a bit since regexs were introduced.

DmitriyMV commented 7 years ago

I think it will be router.MethodNotAllowed(chi.PromoteHeadToGet).

pkieltyka commented 7 years ago

I think the approach that makes sense is:

  1. Create new chi pkg-level emptyBodyHandler which just executes w.Write(nil)
  2. Implicitly set the HEAD method in the Mux#Get method as so..
    func (mx *Mux) Get(pattern string, handlerFn http.HandlerFunc) {
    mx.handle(mGET, pattern, handlerFn)
    mx.handle(mHEAD, pattern, emptyBodyHandler)
    }

Users can still explicitly set the HEAD for a route with Mux#Head

thoughts?

DmitriyMV commented 7 years ago

That solution will not set the headers which GET handlerFn could set. The problem as I see it currently is that we want the GET handler headers but we do not want GET handler body.

alehano commented 7 years ago
router.MethodNotAllowed(chi.PromoteHeadToGet)

Looks good. If it possible to implement that way. Standard go http (on which chi built), already get rid of body in case HEAD request. So, we dont need to invent bycicle, just need a way to set the same handler for two methods.

pkieltyka commented 7 years ago

router.MethodNotAllowed(chi.PromoteHeadToGet) is clever way to enable this option but it might be too slick. I'll give this some more thought considering everyone's feedback. Thanks all.

VojtechVitek commented 7 years ago

Another approach:

r := chi.NewRouter()
r.Use(HeadForceEmptyBody)
r.GetHead("/handler", MyHandler) // both HEAD and GET
func HeadForceEmptyBody(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
                if r.Method == "HEAD" {
                    w = middleware.NewWrapResponseWriter(ioutil.Discard, 1)
                }
        next.ServeHTTP(w, r)
    })
}
vektah commented 7 years ago

Yeah that would work, and it's pretty straightforward. The only thing I dont think like about it is it leaves the solo Get broken from a http perspective. We can catch it with a linter pass, but ideally the default is safe after configuring the router.

DmitriyMV commented 7 years ago

No changing of the default behavior, please.

DmitriyMV commented 7 years ago

The middleware approach looks interesting, but it implies that we didn't define any proper HEAD handlers. The MethodNotFound approach is "pay later" rather than "pay upfront". Personally, I like to "pay later".

alehano commented 7 years ago

w = middleware.NewWrapResponseWriter(ioutil.Discard, 1)

This code doesn't work (wrong type).

VojtechVitek commented 7 years ago

The code was just off top of my head, sorry. This should work, tho:


type discardResponseWriter struct {
    http.ResponseWriter
    discard io.Writer
}

func (w *discardResponseWriter) Write(p []byte) (int, error) { return w.discard.Write(p) }

func NewDiscardResponseWriter(w http.ResponseWriter) http.ResponseWriter {
    drw := &discardResponseWriter{
        ResponseWriter: w,
        discard:        ioutil.Discard,
    }
    return middleware.NewWrapResponseWriter(drw, 1)
}

func HeadForceEmptyBody(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        if r.Method == "HEAD" {
            next.ServeHTTP(NewDiscardResponseWriter(w), r)
            return
        }
        next.ServeHTTP(w, r)
    })
}
alehano commented 7 years ago

The code was just off top of my head, sorry. This should work, tho:

It doesn't work by default either. You still have to have defined .Head() route. But if you have it, Go's standard library do all necessary work for trimming Body.

VojtechVitek commented 7 years ago

It doesn't work by default either. You still have to have defined .Head() route.

Of course, that's why I used r.GetHead(), as a prototype.

But if you have it, Go's standard library do all necessary work for trimming Body.

Really? That's great, I didn't know that.

lwc commented 7 years ago

The emptyBodyHandler in https://github.com/go-chi/chi/issues/238#issuecomment-319984775 isn't needed, looks like ResponseWriter.Write does nothing for HEAD requests right in the standard lib.

Also, IMO it'd be simpler to update the HEAD request codepath to check for the existence of a GET handler before returning method not allowed.

pkieltyka commented 7 years ago

cool. I will do something of this sort and send a PR

pkieltyka commented 7 years ago

let me know what you guys think -- https://github.com/go-chi/chi/pull/247

pkieltyka commented 7 years ago

fixed in 1b51a16ac893c99cc5978fca77249701129eedaf

DmitriyMV commented 7 years ago

@pkieltyka Sigh. You just broke the backward compatibility. Can we please have any sort of switch to turn this off or on?

Also, please don't do things like that with future minor versions. Reserve them for major ones. I like chi, but with changes like this one, I will be forced to move my company projects away from it. Sorry.

pkieltyka commented 7 years ago

@DmitriyMV are you referring to the FindRoute() definition change? or the fact that an undefined HEAD route will match its GET route? I believe by the RFC a HEAD should exist for GET routes: https://github.com/go-chi/chi/issues/238#issuecomment-319682229

You of course can still explicitly define a Head handler for a route, etc.

DmitriyMV commented 7 years ago

@pkieltyka I mean that I currently have custom defined NotFoundHandler handler and you just broke my code.

pkieltyka commented 7 years ago

I have just released the change under https://github.com/go-chi/chi/tree/v3.2.0

You can use https://github.com/go-chi/chi/tree/v3.1.5 for compatibility with your app.

pkieltyka commented 7 years ago

@DmitriyMV custom not found handlers still work as before, perhaps you had some weird logic handling HEAD requests in a custom not found handler? if so, perhaps setup a middleware looking for r.Method == "HEAD" and respond that way, which is a better approach (if you're doing some caching stuff)

again, we have version control / dependency management for a reason

DmitriyMV commented 7 years ago

@pkieltyka

This code:

package main

import (
    "net/http"

    "github.com/go-chi/chi"
)

func main() {
    r := chi.NewMux()
    r.Get("/hello", func(w http.ResponseWriter, r *http.Request) {
        println("I was called with METHOD", r.Method)
        w.Write([]byte("Greetings!"))
    })
    r.MethodNotAllowed(func(writer http.ResponseWriter, request *http.Request) {
        if request.Method == "HEAD" {
            println("I'm HEAD request! Hurray!")
        } else {
            println("I'm " + request.Method + " method request")
        }
    })
    http.ListenAndServe(":3333", r)
}

Behaves differently after commit 1b51a16ac893c99cc5978fca77249701129eedaf. Now there is no way to define such handler at all.

I agree about VCS point. But I also think, that inviting people to discuss breaking change, only to accept it unconditionally after two hours and no discussion is not a good way to have any sort of discussion. Given the fact, that better solutions were proposed without breaking existing code.

Regardless I thank you for your time and effort and I will show myself out.

pkieltyka commented 7 years ago

indeed, that will not work as you've outlined there. But, Im not sure what you're trying to solve. Can you explain the logic you have where you must have a MethodNotAllowed handler that checks for HEAD requests to execute some logic? can you describe your use case.

DmitriyMV commented 7 years ago

CORS requests. I know you can use "/" - tho not every case. I also know that your new behavior actually contradicts RFC because we are still returning headers which we do not guarantee to not change between requests.

Anyway - the deed is done. I see no point in continuing this discussion. I wish you all the luck, but from now I will stay away from chi and advice my colleagues to do the same.

pkieltyka commented 7 years ago

an alternate approach: https://github.com/go-chi/chi/pull/248 - I do prefer its explicitness in setting this behaviour, as well better than MethodNotAllowed which feels like abusing the API.

pkieltyka commented 7 years ago

one more approach: https://github.com/go-chi/chi/pull/249

pkieltyka commented 7 years ago

@DmitriyMV / others - chi is a project Im obviously passionate about and I've spent a number of years designing and iterating with great care. I do my best to be responsive and take my time with implementations because I care about the architectural integrity of the project. However, like everyone else in OSS, I have tons of other responsibilities outside of this project and for the most part it's maintained by just myself - sometimes, I will make quick judgements and they cannot all be perfect. I should have waited for the feedback of others before merging/making a release, but I was excited to come to a simple solution so I hit the big green "merge" button. But guess what, we can iterate forward easily - I've already offered two alternate solutions, and you can lock onto a previous tag until one lands.

In after 2 years of working on chi, no one has ever told me they will recommend against it because of a minor commit - thats unfortunate. Sure, recommend against it because theres a better option or a fundamental flaw, but not because you didn't have your way. This is my first OSS project with larger adoption, and its interesting to see the sentiment after a small difference in preference. I can only imagine how hard the cryptocurrency space must have it.

btw, side note: depending on what you're building.. I highly recommend checking out gRPC and as well if you're targeting the Web/browsers, check out https://github.com/improbable-eng/grpc-web. gRPC in my opinion is the future for writing APIs and services - and its getting better everyday.

DmitriyMV commented 7 years ago

Look, I get it. I got angry too because I spent last 20 days tracking this issue, and searching for the better solution (I actually did look into RFCs and "headers problems"), even notifying you in your merge request about explicit switch variable, so you could think about it. Only to find it merged without any sort of answer, tho you yourself invited other people to discuss the proposed solution. So - yes, I got mad.

The actual proper solution should remove the set of headers, but the problem is - we cannot change ResponseWriter.Headers after we had called Write or WriteHeader. This is problematic because we should not return Content-Length for HEAD requests if the response will be dynamic. So at this moment, I really don't have a better solution other than proxying ResponseWriter which will get into GET handler for the HEAD request and modify it afterward in our "middleware". That, in turn, results in another problem - that ResponseWriter can contain a growing set of interfaces (more here github.com/felixge/httpsnoop). So atm - I really don't have any proper fix.

And this is the main reason why I got angry - how easy it was for you to break the existing behavior, even though the solution doesn't introduce proper fix. I know that breaking changes are sometimes necessary - you can't compromise your clients' security, for example. You may also want to change API because it can be simplified or made more robust. Or just because you think it will be for the better.

But you should always remember, that in OSS, other people depend on your code. And the more project matures, the more people use it. They are willing to discuss your ideas. They are willing to help you. They are even willing to migrate if your changes are actually good, and you provided them a "transition period". They, essentially, trust you. And when you break this trust, bad things happen...

I cannot monitor every commit and change individually - because I too have a lot of responsibilities and routine on my day to day job. This is why we have semver - so we collect the things we want to fix and only when we absolutely sure, we break the existing code. I pinned my code to version 3.1.5, but that does not mean others did it too. What is more important - that pin removes my ability to receive code updates - including security ones, if you find and fix a bug in your later versions.

Again - I'm not against breaking changes. I know that software evolves. But I'm against breaking it sporadically without really serious considerations or giving people a transition period. This kills any sort of trust, because other people's code has "just stopped working" even if they had used it incorrectly from the semantic point of view.

pkieltyka commented 7 years ago

All good man. If you can look at https://github.com/go-chi/chi/pull/248 and let me know what you think - I am leaning towards that solution since it provides most flexibility to users through middleware. It adds a few APIs to deal with the routing during execution. It adds slightly more overhead for HEAD requests that use this middleware, but its negligible in a real-world scenario.

Also, check out: https://stackoverflow.com/questions/3854842/content-length-header-with-head-requests

DmitriyMV commented 7 years ago

I did - I actually was the one who linked this answer in this issue 😉. There is the even direct answer to the question:

So the real answer is, you don't need to include Content-Length, but if you do, you should give the correct value.

So even if your new solution is no longer breaks existing code, it still is incorrect. I only glanced at the code - do you really need to re-assign Routes field variable every time?

I will look further into it, including the ability to change headers later this or next week. Currently, I tend towards this https://github.com/felixge/httpsnoop/blob/master/wrap_generated_gteq_1.8.go#L66 even tho it looks horrid.

P.S. Yes - we did look into GRPC - most of our new services are using github.com/grpc-ecosystem/grpc-gateway. Not all of them tho.

pkieltyka commented 7 years ago

I'll look into what is the correct behaviour of the HEAD later, that can be solved in the middleware impl. But regarding the PR, I will have to re-assign Routes each time to make sure the context state is valid before its routed, it its also just a pointer value so there isn't much overhead. I benchmarked the execution and it has negligible impact.

VojtechVitek commented 7 years ago

I like both #248 and #249 solutions. But imho, I'm leaning towards the Middleware approach.

248 - Middleware approach

FindHandler() is something that might be very powerful for lot more use cases, where you can change the route dynamically based on request and/or context. However, it can introduce endless routing loop if used incorrectly. Do you guys have any more use cases where this could be useful?

func FindHandler(rctx *Context, method, path string) http.Handler

Do we need to pass context into the function? And why?

I'm thinking of another use case that could be useful in tests and/or docgen. Take a look at this signature:

func Find(method, route string) (http.Handler, ...func(http.Handler) http.Handler)

it would be more consistent with chi.Walk() and you could for example use it in your tests to get list of middlewares for a specific endpoint, ie. to check if you applied some ACL middleware. That could be very handy.

249 - Config approach

The Config approach is very simple and elegant in this context. However, once we introduce Config, we'd probably end up adding more and more setting fields into it, effectively making the API more complex with each such release. I like the simplicity of chi at this moment and I'm asking myself a question - do we need Config in the long run - or can we live without it and keep making strict decisions?

How would Config work with sub-routers, by the way? Would we need to pass the same Config when initializing the sub-routers? Or is the root Config propagated down the chain somehow automatically?