IBM / sarama

Sarama is a Go library for Apache Kafka.
MIT License

Zstd encoder limit and raise encoder cache size #2979

Open rtreffer opened 2 months ago

rtreffer commented 2 months ago

This PR addresses #2965 with a different approach.

This PR addresses 2 issues with the current implementation:

  1. The number of in-use zstd encoders can exceed GOMAXPROCS if a large number of goroutines are used
  2. The number of cached encoders is too low for highly parallel sarama use, leading to repeated encoder creation and thus low throughput

The PR preserves the following property

The memory behavior of applications can change slightly. Before applying the patch:

After applying the patch:

This should not change the worst case for the great majority of users, but it might be relevant in cases where applications were alternating between high sarama use and other uses.

There are 2 new benchmarks and a testing flag (zstdTestingDisableConcurrencyLimit) to verify the concurrency limiting. I've also added some more information to the tests (such as setting the number of bytes processed so that throughput can be measured). Here is a sample output from my machine (AMD Framework 13):

# go test -benchmem -run=^$ -test.v -bench ^BenchmarkZstdMemory github.com/IBM/sarama
goos: linux
goarch: amd64
pkg: github.com/IBM/sarama
cpu: AMD Ryzen 7 7840U w/ Radeon  780M Graphics     
BenchmarkZstdMemoryConsumption
BenchmarkZstdMemoryConsumption-16                             16          68034969 ns/op        2959.16 MB/s            96.00 (gomaxprocs)               1.000 (goroutines)     21974595 B/op           815 allocs/op
BenchmarkZstdMemoryConsumptionConcurrency
BenchmarkZstdMemoryConsumptionConcurrency-16                  39          30498097 ns/op        8801.71 MB/s             4.000 (gomaxprocs)            256.0 (goroutines)       86327669 B/op          1479 allocs/op
BenchmarkZstdMemoryNoConcurrencyLimit
BenchmarkZstdMemoryNoConcurrencyLimit-16                      21          52053651 ns/op        5156.90 MB/s             4.000 (gomaxprocs)            256.0 (goroutines)       437548737 B/op         2196 allocs/op
PASS
ok      github.com/IBM/sarama   3.566s