mostynb / go-grpc-compression

go gRPC encoding wrappers for some useful compression algorithms that are not available in google.golang.org/grpc
Apache License 2.0
21 stars 10 forks source link

github.com/klauspost/compress for snappy #10

Open jzelinskie opened 2 years ago

jzelinskie commented 2 years ago

Hey there!

I noticed that this library doesn't use the snappy library from klauspost/compress, but it does import this library for zstd. A cursory look makes it appear to be faster. Is there a reason why you aren't using it? Is it worth a PR?

mostynb commented 2 years ago

Hi, I'm not as familiar with snappy compared to lz4 and zstd. I'd be happy to switch if you benchmarked and found @klauspost's implementation faster though.

klauspost commented 2 years ago

It can either be faster or compresses more and sometimes both.

https://github.com/klauspost/compress/tree/master/s2#snappy-compatibility

Replace snappylib "github.com/golang/snappy" -> github.com/klauspost/compress/s2. If you don't want options snappylib github.com/klauspost/compress/snappy.

    c.poolCompressor.New = func() interface{} {
        w := s2.NewWriter(ioutil.Discard, s2.WriterSnappyCompat())
        return &writer{Writer: w, pool: &c.poolCompressor}
    }

Options to consider: WriterBetterCompression gives better compression. WriterConcurrency(1) if you want to disable multithreaded compression.

    if !inPool {
        newR := s2.NewReader(r, s2.ReaderMaxBlockSize(64<<10))
        return &reader{Reader: newR, pool: &c.poolDecompressor}, nil
    }
mostynb commented 2 years ago

If we were to add s2, I think that should be registered with the grpc framework under the name "s2" instead of "snappy" for compatibility reasons.

klauspost commented 2 years ago

Yeah, adding it as a separate option would be nice (and trivial)

FWIW here is a comparison of different types: https://twitter.com/sh0dan/status/1532298056082804736

mostynb commented 2 years ago

@klauspost: re s2, if I use WriterConcurrency(1) can I safely put s2.Writer's in a sync.Pool without leaking memory?

klauspost commented 2 years ago

@mostynb You can do that either way. It does not keep goroutines active.

mostynb commented 2 years ago

OK, it's just that the s2.NewWriter docs say that resources might not be released if Close isn't called (as would happen with sync.Pool):

Users must call Close to guarantee all data has been forwarded to the underlying io.Writer and that resources are released. They may also call Flush zero or more times before calling Close.

mostynb commented 2 years ago

@jzelinskie: I created a PR which you can experiment with if you like: #12.