xline-kv / Xline

A geo-distributed KV store for metadata management
https://xline.cloud
Apache License 2.0
596 stars 74 forks source link

[Bug]: The calculation methods of quorums in CURP are wrong. #635

Closed Phoenix500526 closed 8 months ago

Phoenix500526 commented 8 months ago

Description about the bug

The calculation methods of quorum and superquorum are as follows:

/// Xline/curp/src/client/unary.rs
/// Calculate the super quorum
fn super_quorum(size: usize) -> usize {
    let fault_tolerance = size.wrapping_div(2);
    fault_tolerance
        .wrapping_add(fault_tolerance.wrapping_add(1).wrapping_div(2))
        .wrapping_add(1)
}

/// Calculate the quorum
fn quorum(size: usize) -> usize {
    size.wrapping_div(2).wrapping_add(1)
}

Actually, for any distributed cluster with a node count of N, the sum of its fault tolerance and quorum should always be equal to N. But the current calculation method for fault tolerance breaks this assumption. This assumption holds true when and only when the N is odd.

For any even number N (N = 2K), the calculation using the above method yields:

Obviously, N = fault tolerance + quorum = 2K + 1 contradicts N being even. The super quorum, derived from the incorrect fault tolerance, also cannot be correct.

The correct calculation methods are as follow:

Version

0.6.0 (Default)

Relevant log output

No response

Code of Conduct