Open ankon opened 4 months ago
Initialize and cleanup are sort-of internal. Could you point me to the changes regarding gzip, so I can investigate the source of the performance improvements? I would assume that performance is impacted by JNI round trips. If that is true, we could think of a way to reuse or refurbish instances.
So for gzip a lot of places [1], [2] suggest this approach (pseudo-golang here):
    writerPool := &sync.Pool{
        New: func() any {
            writer, _ := gzip.NewWriterLevel(nil, g.Level)
            return writer
        },
    }

    // later when using it to wrap an existing output writer
    var w io.Writer
    // Grab a possibly pooled gzip writer
    writer := writerPool.Get().(*gzip.Writer)
    // Reset its structures and point it to now write to w
    writer.Reset(w)
Some benchmarks we ran (see [1] for an idea) showed that the number of allocations went down quite a bit, and I think that is because internal caches and buffers get reused.
See also the documentation for gzip.Writer's Reset method.
We saw good performance improvements in our Go program when using a sync.Pool to cache and reset gzip writers rather than creating them anew each time.
I was wondering whether the same would be true for the brotli writer, too, but that one doesn't seem to have a trivial "reset this" method. On the underlying C level it looks like an encoder state can be "created"/"destroyed", but there are functions to "initialize" and "cleanup" as well.
Before I try to hack this together: Do you think that pooling/reusing of the encoder state could have a measurable/visible performance impact?