basho / riak_repl

Riak DC Replication
Apache License 2.0

fullsyncs_completed should not be held in FSM state #629

Open binarytemple opened 10 years ago

binarytemple commented 10 years ago

fullsyncs_completed is not held in a global location; it lives only in the process state record:

{fullsyncs_completed, State#state.fullsyncs_completed},

If the riak_repl2_fscoordinator crashes (as it often does), the metric is lost.

This is causing customer confusion, as the metric keeps being reset to zero.

https://github.com/basho/riak_repl/blob/develop/src/riak_repl2_fscoordinator.erl#L242,L272

It would be better if it were stored in some global state, an ETS table or similar.
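As a rough illustration of the ETS idea, here is a minimal sketch. The module name `fs_counter_sketch`, its function names, and the per-cluster key shape are all hypothetical and not part of riak_repl. It assumes the table is created by some process that outlives the coordinator (for example a supervisor), since an ETS table is deleted when its owner exits.

```erlang
-module(fs_counter_sketch).
-export([init_table/0, incr/1, read/1]).

%% Hypothetical table name; not an existing riak_repl table.
-define(TABLE, fs_counter_sketch).

%% Must be called from a long-lived process, NOT from the coordinator:
%% the table is destroyed when the process that created it exits.
init_table() ->
    ?TABLE = ets:new(?TABLE, [named_table, public, set]),
    ok.

%% Increment the completed-fullsync counter for one sink cluster and
%% return the new value. insert_new/2 is a no-op if the key already exists.
incr(Cluster) ->
    Key = {fullsyncs_completed, Cluster},
    ets:insert_new(?TABLE, {Key, 0}),
    ets:update_counter(?TABLE, Key, 1).

%% Read the current value; 0 if nothing has been recorded yet.
read(Cluster) ->
    case ets:lookup(?TABLE, {fullsyncs_completed, Cluster}) of
        [{_Key, Count}] -> Count;
        [] -> 0
    end.
```

With something like this, the coordinator would call `fs_counter_sketch:incr(Cluster)` when a fullsync finishes and the status code would call `fs_counter_sketch:read(Cluster)` instead of reading the state record, so a coordinator crash and restart on the same node would not reset the value.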

engelsanchez commented 10 years ago

An ETS table will only help until the leader changes and the coordinator processes start on another node. That could work as a temporary patch, but it would be nice to store things like this in cluster metadata.

binarytemple commented 10 years ago

Good point