nytimes / gziphandler

Go middleware to gzip HTTP responses
https://godoc.org/github.com/NYTimes/gziphandler
Apache License 2.0
868 stars 129 forks source link

Fix double compression issue #47

Closed olalonde closed 7 years ago

olalonde commented 7 years ago

This is a small patch which will ensure that gzip compression is disabled if the response has a Content-Encoding header which was not set by gziphandler.

This is intended to fix the following issue in traefik:

https://github.com/containous/traefik/issues/1511

Disclaimer: it's my first time writing Go code :/

jprobinson commented 7 years ago

Hi there. Others have attempted to make this same change for the same reason and I decided against it. My reasoning is here: https://github.com/NYTimes/gziphandler/pull/29#issuecomment-275178075

Basically, I believe making that change here is a patch for Traefik's poor design.

olalonde commented 7 years ago

I'm of course a bit biased but I disagree with your reasoning. Middlewares should attempt to play nice with each other and the Content-Encoding header being already set indicates that some middleware outside of this one has already encoded the response. I can't really think of a scenario where you would both want to set your own Content-Encoding header and expect gziphandler to gzip the response.

In Node.js land, this is also how the compress middleware behaves: https://github.com/expressjs/compression/blob/master/index.js#L164

At worse, couldn't we enable this behavior through a flag? e.g.:

gziphandler.GzipHandler(next, true /* skips gzip if Content-Encoding header is set */)
emilevauge commented 7 years ago

Traefik's poor design.

@jprobinson Maybe you could eventually consider we have different needs before making any jugement...

jprobinson commented 7 years ago

Hi there. I understand Traefik's use case, I just don't understand why a blanket 'compress' configuration for all backends is considered the right way to go. You know which backends you want compress and which you do not so the configuration should be more granular. It's a simple concept, you add this middleware when you want compression. If you don't need compression, don't add the middleware.

On top of all this, the only people asking for this feature are Traefik users, which is another reason I believe this is Traefik's problem.

emilevauge commented 7 years ago

@jprobinson

On top of all this, the only people asking for this feature are Traefik users, which is another reason I believe this is Traefik's problem.

It does not seem like a fair argument...

It's a simple concept, you add this middleware when you want compression. If you don't need compression, don't add the middleware.

I understand your point. But in a reverse proxy like Traefik, you may want to enable compression on all connections. This is a valid use case.

You know which backends you want compress and which you do not

In this use case, we don't.

Besides, I'm not sure it is safe to assume that only this gzip middleware is supposed to set Content-Encoding. In most cases, yes, but it does not covers every use cases.

As @olalonde said, it would be better to enable this (non default) behavior with a flag.

jprobinson commented 7 years ago

Stepping back, I do feel the phrase "poor design" was not the right wording and I'm sorry if it offending anyone here. We just disagree on where this one feature should live and I should not use this instance to reflect on your project as a whole.

I'm pretty firm on my decision so I'm afraid we're going to have to agree to disagree on this one, sorry.

olalonde commented 7 years ago

I just don't understand why a blanket 'compress' configuration for all backends is considered the right way to go

AFAIK, it's what both Cloudflare and Nginx do by default (and I guess most reverse proxies), so it's not a terribly controversial design choice. Anyways, I won't insist any further :)

adammck commented 7 years ago

I agree with @jprobinson that it's better for this middleware to simply gzip responses which flow through it, rather than trying to guess whether or not they should be compressed. That seems easy in the case that the payload is compressed and the encoding header is set correctly, but I could imagine a lot of strange and surprising ways that could fail, so let's not.

However, there's clearly a need to also do something. Could we perhaps move this switching functionality to a separate middleware? It could wrap gziphandler (or anything else, for that matter), but buffer headers until the first Write, at which time it calls a func to decide whether to use the wrapped middleware or not. I'd imagine it looking something like:

withoutGz := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
  w.Header().Set("Content-Type", "text/plain")
  io.WriteString(w, "Hello, World")
})

withGz := gziphandler.GzipHandler(withoutGz)

isGzipped := func(h http.Header) {
  return h.Get("Content-Encoding") == "gzip"
}

http.Handle("/", MaybeDoThis(withGz, !isGzipped))
//               ^^^^^^^^^^^ <- my imaginary handler

What do you think, @jprobinson and @olalonde? We could reasonably keep this in a separate repo/package, and provide a nudge in that direction for those who need it without resorting to magic or feature flags in the main package.