grafana / mimir

Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.
https://grafana.com/oss/mimir/
GNU Affero General Public License v3.0

zstd compression support between distributors and ingesters #8522

Open aallawala opened 2 months ago

aallawala commented 2 months ago

Is your feature request related to a problem? Please describe.

Are there plans for zstd support between distributors and ingesters using the grpc_compression flag?

Enabling zstd compression would significantly reduce the cross-AZ traffic our org pays for when running Mimir. gzip gets some wins, but at the expense of CPU.

Now that there is a pure Go zstd implementation, are there any major blockers for introducing it to Mimir?

Describe the solution you'd like

Support for zstd as a value for the grpc_compression setting.


aknuds1 commented 2 months ago

Are you specifically asking for the -ingester.client.grpc-compression flag to also support zstd?

aallawala commented 2 months ago

> Are you specifically asking for the -ingester.client.grpc-compression flag to also support zstd?

I'm specifically looking for compression support on the ingester-client, yes. That's our most expensive client.

Though, judging by how the config is laid out, I suspect it is a shared compressor for all the gRPC clients?

aknuds1 commented 2 months ago

> Though, judging by how the config is laid out, I suspect it is a shared compressor for all the gRPC clients?

In effect, yes. I just needed more clarification, as your description was relatively vague. I'm trying to find out whether the team has any opinion on supporting zstd.
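For context on why it's shared: a Go gRPC client selects a registered compressor by name, set once as a default call option when the connection is created, so one setting covers every RPC that client makes. A minimal sketch of the mechanism (simplified; dialIngester is a hypothetical helper, not Mimir's actual wiring):

```go
package main

import (
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	// Importing a compressor package registers it with gRPC by name.
	// gzip ships with grpc-go; snappy/zstd need third-party codecs.
	_ "google.golang.org/grpc/encoding/gzip"
)

// dialIngester (hypothetical) shows how one compressor name ends up
// applied to all RPCs: it is attached as a default call option.
func dialIngester(addr, compression string) (*grpc.ClientConn, error) {
	opts := []grpc.DialOption{
		grpc.WithTransportCredentials(insecure.NewCredentials()),
	}
	if compression != "" {
		opts = append(opts, grpc.WithDefaultCallOptions(grpc.UseCompressor(compression)))
	}
	return grpc.Dial(addr, opts...)
}
```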

aknuds1 commented 2 months ago

So, I found out that the Mimir team has already considered zstd support, but the zstd implementation we considered is apparently not very performant.

Is that the same zstd implementation you're referring to?

aallawala commented 2 months ago

> So, I found out that the Mimir team has already considered zstd support, but the zstd implementation we considered is apparently not very performant.
>
> Is that the same zstd implementation you're referring to?

Thanks for checking in. I'm looking at https://github.com/klauspost/compress/tree/master/zstd as a possible zstd implementation. It looks like what you linked is a wrapper on top of klauspost/compress.
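For reference, a tiny round-trip with klauspost/compress/zstd's stateless API, just to show the shape of the library (a sketch; how a gRPC codec would integrate it is a separate question):

```go
package main

import (
	"bytes"
	"fmt"

	"github.com/klauspost/compress/zstd"
)

func main() {
	src := bytes.Repeat([]byte("sample timeseries payload "), 1024)

	// Encoder/Decoder are designed to be created once and reused;
	// allocating them per message is exactly the kind of overhead
	// that hurts in a hot gRPC path.
	enc, _ := zstd.NewWriter(nil)
	dec, _ := zstd.NewReader(nil)

	compressed := enc.EncodeAll(src, nil)
	decompressed, err := dec.DecodeAll(compressed, nil)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d -> %d bytes, roundtrip ok: %v\n",
		len(src), len(compressed), bytes.Equal(src, decompressed))
}
```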

What were the reasons it was not performant? I'm surprised to hear it didn't measure up against the other offering, gzip. Surely it lands somewhere between gzip and snappy in speed, but with a much higher compression ratio.

aallawala commented 1 month ago

@aknuds1, checking in - are you able to share more on the reasons why that implementation was not performant?

aknuds1 commented 1 month ago

I just know that @bboreham considered the aforementioned zstd implementation, and found performance problems with it.

When we internally discussed trying another compression algorithm to cut cross-AZ traffic costs, the other point made was that it wasn't worth all the testing that comes with it.

aknuds1 commented 1 month ago

I have some new context to share that hopefully sheds enough light on why the Mimir team doesn't think investing in a new compression algorithm for distributor -> ingester traffic is worthwhile. You may already be familiar with this, but Mimir is moving towards a fundamentally revamped architecture where distributors no longer write directly to ingesters; instead they write to a Kafka-compatible back-end, which ingesters then read from. Please see [FEATURE] Experimental Kafka-based ingest storage in the changelog for reference.

bboreham commented 1 month ago

I remarked on the number of memory allocations it performs. It looks like this was improved recently, with the addition of a pool in https://github.com/mostynb/go-grpc-compression/commit/629c44d3acb9624993cc7de629f47d72109e2ce5.

Someone else commented on this in https://github.com/mostynb/go-grpc-compression/issues/25.
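The technique in that commit, roughly: return encoder state to a sync.Pool when a message is finished instead of allocating it fresh each time. A sketch of the approach (illustrative names, not the actual commit):

```go
package zstdpool

import (
	"io"
	"sync"

	"github.com/klauspost/compress/zstd"
)

// Reusable encoders: the expensive internal state is allocated once
// per pooled encoder rather than once per gRPC message.
var encoders = sync.Pool{
	New: func() any {
		e, _ := zstd.NewWriter(nil)
		return e
	},
}

// pooledWriter returns its encoder to the pool once the message is done.
type pooledWriter struct {
	*zstd.Encoder
}

func (p *pooledWriter) Close() error {
	err := p.Encoder.Close()
	encoders.Put(p.Encoder)
	return err
}

// Compress mirrors the grpc encoding.Compressor method: wrap the
// outgoing stream with a reused zstd encoder.
func Compress(w io.Writer) (io.WriteCloser, error) {
	enc := encoders.Get().(*zstd.Encoder)
	enc.Reset(w) // rebind the pooled encoder to this message's writer
	return &pooledWriter{Encoder: enc}, nil
}
```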

deniszh commented 14 hours ago

@aallawala @aknuds1 @bboreham

I decided to experiment with zstd and S2 gRPC compression from https://github.com/mostynb/go-grpc-compression. So, I created a patched dskit with zstd and S2 compression, and then built Mimir 2.13.0 with that patched dskit - nothing fancy, just Mimir's dskit dependency pointed at the patched version.
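(To be concrete, "pointed at the patched version" is just a replace directive in Mimir's go.mod; the path here is illustrative:)

```
// In Mimir's go.mod: swap the dskit dependency for the patched copy.
// A fork URL plus version works the same way as a local path.
replace github.com/grafana/dskit => ../dskit
```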

Then I ran this patched version in our test cluster, first with zstd compression, then S2. Please note that before the test, the cluster was already running Mimir 2.13.0 with Snappy gRPC compression enabled, so the comparison below is against Snappy and NOT against uncompressed traffic:

Results:

zstd results compared to Snappy:

[screenshots: zstd vs. Snappy comparison]

So, the results are good, but memory usage is tremendous. I checked the heap; the majority of memory is consumed by zstd.EnsureBlock:

[screenshot: heap profile dominated by zstd.EnsureBlock]

But I have good news too: I then tried switching to S2 compression, which is an improved version of Snappy.

S2 results compared to Snappy:

[screenshot: left is Snappy, middle is zstd, right is S2]

So, after that I decided to drop zstd (at first I wanted to test the Datadog wrapper or another one, but then I realized that Mimir is built without CGO), and S2 looks really promising.
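For anyone unfamiliar with S2 (also from klauspost/compress): it keeps Snappy's simple block API, so swapping codecs is mechanical. A quick round-trip sketch:

```go
package main

import (
	"bytes"
	"fmt"

	"github.com/klauspost/compress/s2"
)

func main() {
	src := bytes.Repeat([]byte("mimir timeseries sample "), 1024)

	// Same call shape as the snappy package, typically with a better
	// ratio at comparable speed (s2.EncodeBetter trades CPU for ratio).
	compressed := s2.Encode(nil, src)
	decompressed, err := s2.Decode(nil, compressed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d -> %d bytes, roundtrip ok: %v\n",
		len(src), len(compressed), bytes.Equal(src, decompressed))
}
```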

I know the Mimir architecture is being revamped and will migrate to the Kafka-based ingest storage, but it will take some time to implement that properly - and the S2 patch can shave ~50% off cross-AZ data costs right now.

Also, I don't like running a patched version and would prefer to port my changes upstream. I can clean up and submit a PR for experimental S2 support in dskit. Should I, or does that make no sense?

aallawala commented 14 hours ago

@deniszh, thanks so much for trying it out and posting your results. The memory increase seems to match @bboreham's earlier findings too.

Thanks for also trying out S2. The results seem much more favorable, and it's something I can help drive to get ported upstream. Do you have a PR available on the dskit side for S2?

deniszh commented 13 hours ago

@aallawala : https://github.com/grafana/dskit/pull/582

But please note that the latest Mimir is not yet compatible with the latest dskit. So, if you want to test, you can build Mimir from this branch: https://github.com/deniszh/mimir/tree/2.13.0-grpc-s2

deniszh commented 13 hours ago

Tried S2 on a bigger cluster - it's less impressive there, but I see 70 MB/s instead of 105 MB/s, a ~33% decrease.