I find that you use many `dassert`s to validate the update action, and I have commented on some of them. My suggestion is that updating the replica count is not on a key read-write path, so its failure will not have a real-time impact on the business. You just need to return an error saying the update has not succeeded, rather than bringing down the machine.
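For example, instead of crashing on a bad value, the validation could report the error back to the caller. A minimal sketch; the response type and function name here are hypothetical, not the actual meta server code:

```cpp
#include <string>

// Hypothetical response type standing in for the meta server's RPC response.
struct update_replica_count_response {
    bool ok = false;
    std::string hint_message;
};

// Before: dassert(new_max_replica_count > 0, "invalid max_replica_count");
// would bring the meta server down on bad input.
//
// After: the bad input is reported back to the caller and the server keeps running.
void validate_new_replica_count(int new_max_replica_count,
                                update_replica_count_response &resp)
{
    if (new_max_replica_count <= 0) {
        resp.ok = false;
        resp.hint_message = "max_replica_count must be positive";
        return;
    }
    resp.ok = true;
}
```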
I think it is strange to define two functions, `update_partition_max_replica_count` and `update_next_partition_max_replica_count`.
You could update each partition's max replica count in a loop instead, for example:
```cpp
for (auto i = 0; i < partition_count; i++) {
    update_partition_max_replica_count(i);
}
```
Besides, I notice that #1072 supports updating the app's max_replica_count, and in this PR, if updating a partition's max_replica_count fails, the app's value should be reverted. I suggest restructuring the code like this:
```cpp
void update_app_max_replica_count(int new_replica_count)
{
    // 1. Mark the app as "updating replica count" to avoid repeated requests
    //    (this flag should be stored on ZooKeeper; you can add a new field in envs).
    // 2. Update each partition's replica count.
    // 3. After all partitions have been updated successfully, update the app's
    //    replica count and clear the "updating replica count" flag.
    // The functions server_state::do_app_drop and server_state::process_one_partition
    // may be helpful.
}
```
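In other words, the per-partition updates are driven from the app-level update, the app-level value is only written after every partition succeeds, and a flag in the app envs guards against repeated requests. A minimal sketch of that flow under these assumptions; the helper names and `status` type are made up, and the real code would use dsn::error_code and the meta server's remote-storage utilities to persist changes on ZooKeeper:

```cpp
// Hypothetical status type and stubbed helpers; in the real implementation each
// helper would persist its change on ZooKeeper.
enum class status { ok, failed };

// Persist the "updating replica count" flag in the app envs (stubbed here).
status mark_replica_count_updating(bool updating) { (void)updating; return status::ok; }
// Update one partition's max_replica_count (stubbed here).
status update_partition_max_replica_count(int pidx, int n) { (void)pidx; (void)n; return status::ok; }
// Update the app-level max_replica_count (stubbed here).
status commit_app_max_replica_count(int n) { (void)n; return status::ok; }

status update_app_max_replica_count(int partition_count, int new_replica_count)
{
    // 1. Mark the app as "updating" so repeated requests can be rejected.
    if (mark_replica_count_updating(true) != status::ok)
        return status::failed;

    // 2. Update every partition; stop on the first failure so the app-level value
    //    never gets ahead of the partition-level values.
    for (int i = 0; i < partition_count; ++i) {
        if (update_partition_max_replica_count(i, new_replica_count) != status::ok) {
            mark_replica_count_updating(false); // app-level value stays unchanged, nothing to revert
            return status::failed;
        }
    }

    // 3. Only after all partitions succeed, update the app-level value and clear the flag.
    const status s = commit_app_max_replica_count(new_replica_count);
    mark_replica_count_updating(false);
    return s;
}
```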
https://github.com/apache/incubator-pegasus/issues/865