neondatabase / neon


logical size limit is broken during PS restart #5963

Open · problame opened 9 months ago

problame commented 9 months ago

Problem

Logical size is part of PageserverFeedback, which is sent from PS to SK so that SK can enforce the project's logical size limit:

https://github.com/neondatabase/neon/blob/d8c21ec70d60f5e4a4675a16bc596cbf60eefc8f/pageserver/src/tenant/timeline/walreceiver/walreceiver_connection.rs#L398-L404

Logical size is calculated lazily. Until that calculation finishes, the value we return is just the logical size delta accumulated since PS startup; if that delta is negative, we currently round it up to 0.
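A minimal sketch of the reporting behavior described above (type and field names are illustrative, not the actual Pageserver API):

```rust
/// Logical size as tracked by a timeline (illustrative sketch).
enum LogicalSize {
    /// Exact value: the lazily computed initial size plus the delta
    /// ingested since then.
    Exact(u64),
    /// Initial size not yet computed; only the delta since PS startup
    /// is known.
    Approximate { delta_since_startup: i64 },
}

impl LogicalSize {
    /// Value reported to the safekeeper in PageserverFeedback.
    fn reported(&self) -> u64 {
        match self {
            LogicalSize::Exact(size) => *size,
            // A negative delta (e.g. data deleted since startup) is
            // rounded up to 0 -- this is the problem: right after a
            // restart we report ~0 regardless of the true logical size.
            LogicalSize::Approximate { delta_since_startup } => {
                (*delta_since_startup).max(0) as u64
            }
        }
    }
}
```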

The (quite common) worst case: whenever we restart the PS, there is a window in which we report a logical size far below the actual value, likely near 0. This allows a project to go over its logical size limit. Once the calculation finishes, we report the correct value, but by then the user may already be over the limit, i.e., using more logical size than they are allowed (and paying for?).

Fixing This

We should not start walreceiver connections to SKs until we have an accurate logical size.

The challenge is that the logical size needs to be available quickly because walreceiver connection establishment is on the user-visible path, i.e., it's a latency-bound task.
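One way to reconcile "wait for an accurate size" with the latency-bound connection path is a bounded wait: block walreceiver establishment on the initial size calculation, but only up to a deadline. This is a sketch under that assumption (the `LogicalSizeGate` type and its methods are hypothetical, not Neon's actual code):

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical gate: walreceiver connection establishment waits for the
/// initial logical size, but only up to a deadline, since connecting is
/// on the user-visible (latency-bound) path.
struct LogicalSizeGate {
    size: Mutex<Option<u64>>,
    ready: Condvar,
}

impl LogicalSizeGate {
    fn new() -> Self {
        Self { size: Mutex::new(None), ready: Condvar::new() }
    }

    /// Called by the background calculation task once the exact size
    /// is known.
    fn set(&self, size: u64) {
        *self.size.lock().unwrap() = Some(size);
        self.ready.notify_all();
    }

    /// Called before starting a walreceiver connection: wait for the
    /// exact size, or give up after `deadline` (returning `None`, i.e.
    /// falling back to whatever approximate value is available).
    fn wait_for_exact(&self, deadline: Duration) -> Option<u64> {
        let guard = self.size.lock().unwrap();
        let (guard, _timed_out) = self
            .ready
            .wait_timeout_while(guard, deadline, |s| s.is_none())
            .unwrap();
        *guard
    }
}
```

The deadline bounds the latency hit on connection establishment; the trade-off is that a timeout reopens the under-reporting window, so the deadline should comfortably exceed the typical initial-size calculation time.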

Design Idea 1

(As a follow-up, also think about how this change impacts synthetic logical size calculations)

Design Idea 2

### Tasks
- [ ] https://github.com/neondatabase/cloud/pull/8317
- [ ] https://github.com/neondatabase/neon/pull/5982
- [ ] https://github.com/neondatabase/neon/pull/5995
- [ ] https://github.com/neondatabase/neon/pull/5999
- [ ] https://github.com/neondatabase/neon/pull/6018
- [ ] ship metrics & observe
- [ ] https://github.com/neondatabase/neon/pull/5994
- [ ] https://github.com/neondatabase/neon/pull/5955
- [ ] https://github.com/neondatabase/neon/pull/6000
- [ ] ship concurrency limit & observe
- [ ] https://github.com/neondatabase/neon/pull/6010
- [ ] ship fix & observe
- [ ] clean up after https://github.com/neondatabase/neon/pull/6018
hlinnaka commented 9 months ago

Idea 3: Store the logical size persistently as a separate key-value pair in the storage.

Whenever a relation is extended or truncated, update the logical size key-value pair too, in WAL ingestion.

That makes it fast to access the logical size, at any point in time, with no special caching required. The downside is that it adds work to the WAL ingestion codepath instead. Don't know how significant that is, but given how much trouble the logical size calculations are causing us, it might be the right tradeoff.
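A toy sketch of Idea 3, keeping the logical size as an ordinary key-value pair that WAL ingestion updates inline on every relation resize (the key name, `HashMap` store, and method names are illustrative only; the real storage is Neon's layered key space):

```rust
use std::collections::HashMap;

const LOGICAL_SIZE_KEY: &str = "__logical_size"; // illustrative key
const BLOCK_SIZE: u64 = 8192; // Postgres block size

struct Store {
    kv: HashMap<String, u64>,
}

impl Store {
    /// Reading the logical size is now a plain point lookup: fast at
    /// any point in time, no special caching required.
    fn logical_size(&self) -> u64 {
        *self.kv.get(LOGICAL_SIZE_KEY).unwrap_or(&0)
    }

    /// WAL ingestion path: a relation was extended or truncated from
    /// `old_blocks` to `new_blocks`. This is the extra per-record work
    /// the idea adds to ingestion.
    fn on_relation_resize(&mut self, old_blocks: u64, new_blocks: u64) {
        let size = self.logical_size() as i64
            + (new_blocks as i64 - old_blocks as i64) * BLOCK_SIZE as i64;
        self.kv.insert(LOGICAL_SIZE_KEY.to_string(), size.max(0) as u64);
    }
}
```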

problame commented 8 months ago

Meeting notes today:

jcsp commented 3 months ago

We anticipate persisting snapshots of timeline logical sizes to remote storage in the near future to enable hibernated timelines (#8088 ), which should also enable us to ensure that we always have a logical size for a timeline. This may lag ingest a little bit after restart, but it will eliminate the 0 logical size phase.
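The snapshot approach can be sketched as follows: at timeline attach, seed the logical size from the persisted snapshot and let ingestion apply the delta from the snapshot's LSN onward. (All names here are assumptions for illustration; the snapshot format and storage location are not specified in this thread.)

```rust
/// Hypothetical persisted snapshot of a timeline's logical size.
struct SizeSnapshot {
    at_lsn: u64,       // LSN at which the size was captured
    logical_size: u64, // logical size at that LSN
}

/// Seed the reported logical size at startup: snapshot value plus the
/// delta ingested since the snapshot's LSN. With a snapshot we always
/// have a (briefly stale, then caught-up) size instead of reporting ~0
/// while the lazy calculation runs.
fn seed_logical_size(
    snapshot: Option<SizeSnapshot>,
    delta_since_snapshot: i64,
) -> Option<u64> {
    snapshot.map(|s| ((s.logical_size as i64) + delta_since_snapshot).max(0) as u64)
}
```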