neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
https://neon.tech
Apache License 2.0

Epic: pageserver image layer compression #5431

Closed: jcsp closed this issue 4 days ago

jcsp commented 11 months ago

Background

We may substantially decrease the capacity & bandwidth footprint of tenants by compressing data in their image layers.

There are many possible implementations, from compressing whole layer files as streams, to introducing a chunked format and decompressing one chunk at a time, to simply compressing individual pages.
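To make the trade-off between these options concrete, here is a rough sketch comparing whole-stream compression with per-page compression. It is only an illustration: `zlib` from the Python standard library stands in for whatever algorithm would actually be chosen, the 8 KiB page size is the Postgres default, and the helper names and synthetic page contents are made up for this example rather than taken from the pageserver code.

```python
import zlib

PAGE_SIZE = 8192  # Postgres default page size (8 KiB)


def make_pages(n):
    """Build n synthetic, highly compressible pages (illustrative data only)."""
    pages = []
    for i in range(n):
        row = f"row-{i:06d}|".encode()
        body = row * (PAGE_SIZE // len(row) + 1)
        pages.append(body[:PAGE_SIZE])
    return pages


def compress_whole(pages):
    """Option 1: compress the whole layer as a single stream."""
    return zlib.compress(b"".join(pages))


def compress_per_page(pages):
    """Option 3: compress every page independently."""
    return [zlib.compress(p) for p in pages]


def read_page_whole(blob, index):
    """Reading one page requires decompressing the entire stream first."""
    data = zlib.decompress(blob)
    return data[index * PAGE_SIZE:(index + 1) * PAGE_SIZE]


def read_page_per_page(blobs, index):
    """Reading one page only touches that page's compressed bytes."""
    return zlib.decompress(blobs[index])


if __name__ == "__main__":
    pages = make_pages(16)
    whole = compress_whole(pages)
    per_page = compress_per_page(pages)
    print("raw bytes:      ", len(pages) * PAGE_SIZE)
    print("whole stream:   ", len(whole))
    print("per-page total: ", sum(len(b) for b in per_page))
```

Whole-stream compression can exploit redundancy across pages, but a single page read pays for decompressing everything before it. Per-page compression gives up some ratio and keeps random reads cheap, which is why it is the simplest option to fit into an existing read path. A chunked format sits between the two.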

Compressing individual pages in image layers is by far the simplest thing to do, and should have a high payoff as:

Compressing deltas is a harder problem (individual deltas are likely too small to usefully compress), and is left as a possible future change.
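The point about small deltas can be checked with a quick experiment. The sketch below is an assumption-laden illustration, not the pageserver implementation: it uses `zlib` as a stand-in compressor, a hypothetical `encode`/`decode` pair that keeps the raw bytes whenever compression does not shrink them, and a made-up 24-byte payload to play the role of a small delta.

```python
import zlib

PAGE_SIZE = 8192  # Postgres default page size (8 KiB)


def encode(payload):
    """Return (is_compressed, stored_bytes).

    Hypothetical helper: fall back to the raw bytes when compression
    does not make the payload smaller.
    """
    compressed = zlib.compress(payload)
    if len(compressed) < len(payload):
        return True, compressed
    return False, payload


def decode(is_compressed, stored):
    """Inverse of encode()."""
    return zlib.decompress(stored) if is_compressed else stored


if __name__ == "__main__":
    page = bytes(PAGE_SIZE)         # an all-zero page, trivially compressible
    small_delta = bytes(range(24))  # stand-in for a tiny delta record

    for name, payload in (("page", page), ("small delta", small_delta)):
        is_compressed, stored = encode(payload)
        print(name, len(payload), "->", len(stored), "compressed:", is_compressed)
```

For the tiny payload, the fixed framing overhead of the compressed format outweighs any savings, so the fallback path stores it raw. A full page has enough content for the compressor to find redundancy, which is the reason image layers are the easier target.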

Implementation

There is a preliminary version here: https://github.com/neondatabase/neon/pull/7091, which demonstrates that per-page compression in image layers may be added as a relatively lightweight code change.

To get this ready for production, there is more work to do:

PRs/issues

Rollout

problame commented 3 months ago

Last week:

This week:

koivunej commented 2 months ago

This week:

arpad-m commented 1 month ago

Week Jul 1-5:

Week Jul 8-12:

koivunej commented 1 week ago

From @Bodobolero's benchmarks: add lz4 support for comparison.

arpad-m commented 1 week ago

We talked about this in the call and agreed that, unless further investigation identifies compression as the culprit, we will not spend developer time on this.

arpad-m commented 4 days ago

I think this can be closed now.