Closed kolesnikovae closed 2 weeks ago
Thank you so much for the review, Aleks!
The only part I am not sure about is how clear some of the new concepts are. In particular, we have the
raft_node.Leader
andraft_node.Follower
types. Their main purpose is ensuring read consistency, but their names can be misleading because they are active on both leader and follower nodes.
Interestingly, I actually find these types intuitive and quite like them: the types encapsulate behavior rather than role and satisfy the corresponding interfaces:
type RaftLeader interface {
ReadIndex() (uint64, error)
}
type RaftFollower interface {
WaitLeaderCommitIndexAppliedLocally(ctx context.Context) (applied uint64, err error)
ConsistentRead(ctx context.Context, read func(*bbolt.Tx)) error
}
For example, I really like that follower.ConsistentRead
might be called on the leader (keep in mind that followers may be ahead of the leader in terms of applied changes, so the leader must also perform this check). This makes the Follower Reads pattern quite obvious.
Similarly, leader.ReadIndex
can be called on any node: if the node is not elected by the quorum, the method fails; otherwise, it succeeds.
There is also a strong connection between the two, with the follower being the only consumer of what the leader provides (
ReadIndex
). I wonder if merging them under aStateReader
(orStateProvider
) is possible and could make their purpose more clear.
You've clearly described the two components, how they interact, and the principle behind their connection: they indeed have a strong leader-follower connection and distinct responsibilities. I wouldn't want to obscure this from the user; in the current implementation, each component has a single responsibility.
But I hear you – I'll try to find a better approach. I'm not entirely sure about stateReader.ReadIndex
(vs. leader.ReadIndex
) or stateProvider.ConsistentRead
(vs. follower.ConsistentRead
), but I'll give it a try.
I decided to split https://github.com/grafana/pyroscope/pull/3652 into multiple PRs to avoid bad merge conflicts. In this PR, I refactor the FSM module and gRPC services. The core idea: strict raft command handler interface with storage/FSM injection (
*bbolt.Tx
for now). This required quite a few adjustments, but now we always perform all the operations in a single transaction and the state is explicitly provided to the command handler.I tried to preserve as much as possible but decided to not re-write compactor tests as the implementation is changed significantly in https://github.com/grafana/pyroscope/pull/3652.
I'm going to test the PR more extensively (there almost no functional changes), but it is ready for review.