Closed jcsp closed 1 month ago
I do not fully understand the motivation for removing the validation:
(the race explanation part looks valid, though)
A bit of context. From the compute point of view, it's valuable to know that lease renewal has failed permanently and that it doesn't make sense to retry. Then we can stop retrying and set an internal error state, and in theory we can teach cplane to notice that and shut down the compute. So I'd still prefer to have some permanent error in both APIs. Discussed that with @yliang412 a bit.
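To make the compute-side argument concrete, here is a minimal sketch of a renewal loop that treats permanent and transient failures differently. Everything in it is hypothetical: `PermanentLeaseError`, `TransientLeaseError`, `LeaseState`, and `renew_with_retries` are illustrative names, not the actual compute or pageserver API, and backoff between attempts is omitted for brevity.

```python
from enum import Enum, auto
from typing import Callable


class LeaseError(Exception):
    """Hypothetical base class for lease renewal failures."""


class TransientLeaseError(LeaseError):
    """Assumed retryable, e.g. the pageserver is temporarily unavailable."""


class PermanentLeaseError(LeaseError):
    """Assumed non-retryable, e.g. the requested LSN has already been GC'd."""


class LeaseState(Enum):
    OK = auto()
    FAILED_PERMANENTLY = auto()   # internal error state; cplane could act on it
    RETRIES_EXHAUSTED = auto()


def renew_with_retries(renew: Callable[[], None], max_attempts: int = 3) -> LeaseState:
    """Call `renew` until it succeeds, fails permanently, or attempts run out."""
    for _ in range(max_attempts):
        try:
            renew()
            return LeaseState.OK
        except PermanentLeaseError:
            # Retrying cannot help, so stop immediately and record the state.
            return LeaseState.FAILED_PERMANENTLY
        except TransientLeaseError:
            # Worth another attempt (a real loop would back off here).
            continue
    return LeaseState.RETRIES_EXHAUSTED
```

Without a distinct permanent error, the loop above could only ever end in `RETRIES_EXHAUSTED`, which gives cplane nothing to distinguish "give up and shut down" from "pageserver is flaky".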
Some cases come up when we do pageserver restarts/migrations while LSN leases are in play, and the pageserver's gc_cutoff has advanced past the lease.
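The restart/migration scenario can be sketched as follows. This is an illustration under stated assumptions, not pageserver code: `lease_request_outcome` is a hypothetical helper, LSNs are modelled as plain integers, and the rule "a lease below `gc_cutoff` cannot be granted" is assumed from the discussion here.

```python
def lease_request_outcome(lease_lsn: int, gc_cutoff: int) -> str:
    """Hypothetical check: a lease can only be granted at or above gc_cutoff."""
    if lease_lsn < gc_cutoff:
        # The data at this LSN may already be gone; retrying will not bring it back.
        return "permanent_error"
    return "granted"


# Before the restart: the lease at LSN 150 is above the cutoff and is granted.
before_restart = lease_request_outcome(lease_lsn=150, gc_cutoff=100)

# After the restart/migration (assumed here to lose the lease), GC advances the
# cutoff to 200, so renewing the same lease at LSN 150 can no longer succeed.
after_restart = lease_request_outcome(lease_lsn=150, gc_cutoff=200)
```

In this model the client did nothing wrong, yet its renewal still lands below the cutoff, which is why the outcome is better treated as an expected case than as a bad request.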
On point 2, the suggested change doesn't eliminate the case of bad requests, but it limits them to:
We must somehow document that this is a legitimate case where we might see "client requested an LSN that has been GC'd", so that we don't get too worried if we ever see it.