rabbitmq / ra

A Raft implementation for Erlang and Elixir that strives to be efficient and make it easier to use multiple Raft clusters in a single system.

Add `condition` option to local queries #450

Closed dumbbell closed 1 week ago

dumbbell commented 2 weeks ago

Why

This option makes it possible to wait for a condition to become true before the query is executed.

The condition we need right now is to wait for an index to be applied locally (or on the leader). It is useful when the caller wants to be sure that the result of the previous command is "visible" to the next query.

By default, this is not guaranteed because a command is considered successfully applied as soon as a quorum of Ra servers has applied it. That quorum may not include the local node, for instance.

How

If the condition option is specified with an {applied, {Index, Term}} tuple, the query is evaluated right away if that index is already applied. Otherwise, it is added to a list of pending queries.

Pending queries are evaluated after each applied batch of commands by the local node. If a pending query's target index was reached or passed, it is evaluated. If a pending query's target term ended, an error is returned.
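The evaluation pass described above can be sketched as a pure function. This is only an illustration, not the code from the pull request: the module name, the {From, QueryFun, {Index, Term}} representation of a pending query, the term_ended error atom, and the choice to check the index before the term are all assumptions made for the example.

```erlang
-module(pending_queries_sketch).
-export([evaluate/4]).

%% Hypothetical sketch. Each pending query is assumed to be stored as
%% {From, QueryFun, {TargetIndex, TargetTerm}}.
%%
%% Returns {Replies, StillPending}:
%%   Replies      - [{From, Reply}] to send to callers now
%%   StillPending - queries whose condition is not met yet
evaluate(Pending, LastApplied, CurrentTerm, MacState) ->
    lists:foldr(
      fun({From, QueryFun, {Index, Term}} = Query, {Replies, Waiting}) ->
              if
                  %% Target index reached or passed: evaluate the query.
                  Index =< LastApplied ->
                      Reply = {ok, QueryFun(MacState)},
                      {[{From, Reply} | Replies], Waiting};
                  %% Target term ended before the index was applied
                  %% (error atom is an assumption for this sketch).
                  Term < CurrentTerm ->
                      {[{From, {error, term_ended}} | Replies], Waiting};
                  %% Condition not met yet: keep waiting.
                  true ->
                      {Replies, [Query | Waiting]}
              end
      end, {[], []}, Pending).
```

In a server, a function like this would run after each applied batch, with the replies sent out and StillPending stored back in the server state.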

Note that pending queries that timed out from the caller's point of view will still be evaluated once their associated condition becomes true. However, the reply will be discarded by Erlang because the process alias will be inactive at that point.
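The alias behavior mentioned above can be observed in isolation. The following is a minimal sketch assuming Erlang/OTP 24 or later, where process aliases were introduced. It is unrelated to Ra's internals, and the module name is made up for the example.

```erlang
-module(alias_demo).
-export([run/0]).

%% Shows that a message sent to a deactivated alias is silently dropped.
run() ->
    Alias = alias(),
    %% While the alias is active, messages reach the calling process.
    Alias ! first,
    Got = receive first -> received after 100 -> timeout end,
    %% Deactivate the alias, as happens when a call times out.
    true = unalias(Alias),
    %% This message is dropped by the runtime: no crash, no delivery.
    Alias ! second,
    Late = receive second -> received after 100 -> dropped end,
    {Got, Late}.
```

This is why a late reply to a timed-out query does not end up in the caller's mailbox.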

Here is an example:

ra:local_query(ServerId, QueryFun, #{condition => {applied, {Index, Term}}}).

The local_query tuple sent to the Ra server changes format. The old one was:

{local_query, QueryFun}

The new one is:

{local_query, QueryFun, Options}

If the remote Ra server that receives the query runs a version of Ra older than the one that introduced this change, and thus doesn't understand the new tuple, it will ignore and drop the query. This will lead to a timeout of the query, or to an indefinitely hanging call if the timeout was set to infinity.

Note that in the opposite situation, i.e. if a Ra server that knows the new query tuple receives an old tuple, it will evaluate the query as if the options were an empty map.
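The backward-compatible handling of the old tuple can be pictured as a normalization step. This is a hypothetical sketch: the module and function names are invented for illustration and do not come from the pull request.

```erlang
-module(local_query_compat_sketch).
-export([normalize/1]).

%% Old format: no options, treated as an empty map.
normalize({local_query, QueryFun}) ->
    {local_query, QueryFun, #{}};
%% New format: options are passed through unchanged.
normalize({local_query, QueryFun, Options}) when is_map(Options) ->
    {local_query, QueryFun, Options}.
```

An older server has no equivalent clause for the three-element tuple, which is why it drops the message instead of replying.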

V2: Rename the option from limit to wait_for_index, which is more explicit.

V3: Rename the option back to limit. This makes it possible to pass other types of conditions in the future. Also change the place where pending queries are evaluated, which removes the need for the applied_to effect.

V4: Rename the option to condition to make its purpose more intuitive. The value was changed to {applied, {Index, Term}} to give more meaning to what the condition does. While here, the ra_idxterm() type is aliased to idxterm() and exported as ra:idxterm().

dumbbell commented 2 weeks ago

After submitting the pull request, I admit the parameter name limit isn't really self-explanatory.

Here is what the call looks like:

ra:local_query(ServerId, QueryFun, #{limit => {Index, Term}}).

What about wait_for, or even wait_for_index, instead? Example:

ra:local_query(ServerId, QueryFun, #{wait_for_index => {Index, Term}}).

Any opinion?

dumbbell commented 1 week ago

I renamed the option from limit to wait_for_index.