Open mkeeter opened 4 months ago
So, I'm not sure I see the value here in this, but let me explain some of what I'm thinking and I want to hear more about your idea as well.
I think many (all?) of the states in DsState
are about the upstairs policy, and in most cases
the actual downstairs themselves don't know what their state is in the upstairs. We have to
keep state about each downstairs somewhere in the Upstairs, so I'm curious to hear why the
new way is better than what we have now.
True that the Replay
state is not bearing much load any longer as we don't know when a
downstairs has "caught up", and, given the code changes with the refactor, it's probably vestigial
at this point. Though, we do need to know if a read IO is a replay to know if we can ignore
checksum mismatches between it an other IOs that could have different data.
For the collapse of Faulted
and Offline
, sometimes we know that a downstairs is bad right away
(Like, it has returned a write error, or failed a repair command). In those cases there is no reason to
hold any jobs for that downstairs, and it must come back through LiveRepair
, so we can skip them
and free any resources without having to wait for the downstairs to return to do so. Also, if a downstairs
has gone Offline
and we have reached the 1Gib/10K jobs limit, we won't keep holding IOs for that
downstairs any longer, right? And new IOs that arrive after this point, they are "skipped" as well?
In your proposal, you are moving the logic of how to handle the difference between Faulted
and Offline
into another part of the upstairs?
Another piece that I do like with the current model is the observability we get with the state transitions. I do find these useful when debugging.
The documentation for
enum DsState
includes this lovely diagram:It took me a while to wrap my head around the difference between
Offline
andFaulted
:Offline
downstairs comes back online through replay (i.e. sending it every job that it has missed). This means that if we have any offline downstairs in our set, we can't retire jobs which they have not yet seen.Faulted
downstairs comes back online through live-repair (i.e. getting dirty extents from a different downstairs). We can retire jobs that have not been seen by a faulted downstairs.Offline
versusFaulted
is not a property of the actual Downstairs, which is doing its own thing on the other side of the network boundary. Why do we need this distinction at all?Faulted
state because we can't buffer arbitrary numbers of jobs forever in the UpstairsIn other words, the
Offline
→Faulted
transition implements policy choices about how many jobs the Upstairs is allowed to keep around. Framed this way, it's not obvious why this logic should be attached to theDsState
.Instead, I'd like to propose attaching that logic to the
Upstairs
itself:DsState::Faulted
andDsState::Offline
(and deleteDsState::Replay
; it's not doing anything)This would cut through a bunch of gnarly code (e.g. the coupling between backpressure and transitions to
DsState::Faulted
), and seems logically consistent.