neondatabase / neon

Neon: Serverless Postgres. We separated storage and compute to offer autoscaling, code-like database branching, and scale to zero.
https://neon.tech
Apache License 2.0
15.03k stars 439 forks source link

pageserver: LSN lease edge cases around restarts/migrations #8817

Closed jcsp closed 1 month ago

jcsp commented 2 months ago

Some cases come up when we do pageserver restarts/migrations while LSN leases are in play, and the pageserver's gc_cutoff has advanced past the lease.

  1. Validation of lease LSN vs. gc_cutoff: this makes sense for first request but not renewals. We cannot know the difference from the server side, so: page_service lease requests should not do the validation (these are used for renewals), but http API requests should do the validation (only used for initial grant of lease).
  2. getpage requests are validated against latest_gc_cutoff, with a special case for requests at leases. However, after pageserver restart/migration, there is a period where the compute may send getpage requests before it has sent its lease requests. To narrow this window, we should make compute_ctl send a lease renewal as soon as it sees a /configure API calls that updates pageserver_connstr.

On point 2, the suggested change doesn't eliminate the case of bad requests, but it limits to:

We must somehow document that this is a legitimate case where we might see "client requested an LSN that has been GC'd", so that we don't get too worried if we ever see this

ololobus commented 2 months ago

I do not fully understand the motivation of validation removal:

  1. In 1. it's proposed that we do not do validation in libpq protocol, but only do it in HTTP.
  2. Yet, in 2. it's mentioned that if the pageserver is restarted, for a short period, we do not have any lease, so once compute tries to renew it, it will be kinda 'acquire', not 'renew', right?

(the race explanation part looks valid, though)

A bit of a context. From the compute point of view, it's valuable to know that lease renewal failed permanently, and it doesn't make much sense to retry. Then we can stop retrying, set an internal error state; and in theory, we can teach cplane to notice that and shut down the compute. So I'd still prefer to have some permanent error in both APIs. Discussed that with @yliang412 a bit