nytimes / gziphandler

Go middleware to gzip HTTP responses
https://godoc.org/github.com/NYTimes/gziphandler
Apache License 2.0
868 stars 129 forks source link

add NewGzipLevelHandler for configurable compression level #14

Closed cgilling closed 8 years ago

cgilling commented 8 years ago

Compression was taking up way too much of my cpu, so I wanted to reduce that load some, so I implemented this.

I would appreciate recommendations of how to test this if anyone can think of one

adammck commented 8 years ago

Thanks for the patch! I agree that we should allow the compression level to be set, but I don't think it's very good to be setting it after the handler has been created, especially not by messing around with globals.

Perhaps we could add a GzipHandlerLevel func (to match the NewWriterLevel func in compress/gzip), and modify the existing GzipHandler to call that with the default level? We'd need to move the pool inside the handler, so it can spawn gzippers with the right level, but otherwise I think the change would be quite small.

cgilling commented 8 years ago

With respect to GzipHandlerLevel this is something that I could definitely do but there is one drawback in general and one specific to my use case:

  1. having the sync.Pool belong the the GzipHandlerLevel func would mean that there is no easy way to share the sync.Pool across multiple handlers. This could be mitigated by having something like GzipHandlerLevelMiddleware what returns a func(h http.Handler) http.Handler that you could use to wrap multiple endpoints, but that does get a bit more complicated. Perhaps not sharing the sync.Pool between endpoints is acceptable?
  2. Currently to integrate this with the current framework I am using I use a closure tho wrap each call so that I can pass variables through the the wrapped handler. This would mean that I would now be creating a new sync.Pool on every call

Due to these two things, my preference would just be to force that the compression level can only be set before any usage. I understand that might not be the direction you'd like to take this library so if you'd still prefer something like GzipHandlerLevel I could implement something like that and then I could probably start using it once go 1.7 comes out with native support for context

cgilling commented 8 years ago

@adammck, do you have a preference on what I said in my previous comment? I'd like to try to finish this up in an acceptable way but I want to make sure I heading in the right direction first. Thanks

adammck commented 8 years ago

Hello! Sorry for the delay getting back to you.

I didn't consider the case of wrapping multiple handlers. I usually attach all of my handlers to a router (e.g. gorilla/mux), and call GzipHandler once to wrap that. If you're wrapping routes individually, you're quite right that we should avoid creating one pool for each.

So, we can't have handlers with different compression levels in the same pool, but I see no reason that we couldn't have separate pools for each level (since that'd be the only difference between the gzippers which they contain), and share them between every handler which wants that level. I'm imagining something like:

var gzipWriterPools = map[int]sync.Pool{}
var mut = sync.Mutex{}

// --

func GzipHandlerLevel(h http.Handler, level int) http.Handler {
  // ...
  if acceptsGzip(r) {
    gzw := getWriter(level)
    defer putWriter(level, gzw)
  // ...
}

//--

func getWriter(level int) *gzip.Writer {
  mut.Lock()
  def mut.Unlock()

  p, ok := gzipWriterPools[level]
  if !ok {
    p = makePoolLevel(int)
    gzipWriterPools[level] = p
  }

  gzw := p.Get().(*gzip.Writer)
  gzw.Reset()
  return gzw
}

func makePoolLevel(level int) sync.Pool {
  return sync.Pool{
    // ...
  }
}

Does this make sense? I think it mitigates the problem of setting the compression level globally, but doesn't preclude your use-case of many handlers.

elithrar commented 8 years ago

The multiple-pool approach is also what we considered (and I have to find time to implement) in gorilla/handlers. The map[int]sync.Pool approach is a nice way to do it.

cgilling commented 8 years ago

@adammck, I implemented something very similar to what you suggested, but made the pools for each level up front which avoided the need for a mutex, but does mean that we create the pools even if they aren't needed, I assumed that the pools wouldn't take up much memory so this should probably be okay, if you think thats too much overhead I can change it so that they are created on use.

Also because there is a potential of passing an invalid level I created NewGzipLevelHandler which returns a wrapper and an error. The are other potential ways of handling this, but this is the one I am most used to so I implemented it this way. The other two I could think of to avoid having to deal with the error would be:

  1. panic if an invalid level is given (generally not a good thing for a library to do)
  2. use gzip.DefaultCompression for all invalid levels

Lemme know what you think

adammck commented 8 years ago

It's fine for the NewGzipLevelHandler to return an error; that's what NewWriterLevel does, and the caller can always ignore it if they're sure that the level is valid. (It does make the interface a little awkward-- perhaps we should add a MustNewGzipLevelHandler which panics on error, to simplify the interface for those users who are sure?)

I agree that it's not a huge deal to spawn the pools in advance, although it bugs me that they'll be unused in almost every case. I'm okay with merging this implementation, but might switch it out for a mutex later.

cgilling commented 8 years ago

@adammck, sorry for taking so long to finish up this patch, its been pretty busy lately. I changed the map to an array and also implemented MustNewGzipLevelHandler as well.

cgilling commented 8 years ago

@adammck ping

adammck commented 8 years ago

Looks good to me. Thanks very much for the patch, @cgilling!