MrCroxx / RunKV

[WIP] An experimental cloud-native distributed KV engine for OLTP workload.
MIT License

wait apply notify latency is extremely high #126

Closed · MrCroxx closed this issue 2 years ago

MrCroxx commented 2 years ago

The idle time is extremely long, while the apply duration looks normal. Consider another way to notify apply.
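
For reference, one common way to push the apply notification to waiters instead of polling for it (a sketch only; `Waiters` and the `u64` proposal ids are hypothetical stand-ins, not RunKV's actual types): register a oneshot sender when a proposal is submitted and fire it from the apply path as soon as the entry is applied.

    use std::collections::HashMap;
    use std::sync::{Arc, Mutex};
    use tokio::sync::oneshot;

    /// Hypothetical registry of apply waiters, keyed by proposal id.
    #[derive(Default, Clone)]
    struct Waiters {
        inner: Arc<Mutex<HashMap<u64, oneshot::Sender<()>>>>,
    }

    impl Waiters {
        /// Propose path: register a waiter and return the receiver to await on.
        fn register(&self, id: u64) -> oneshot::Receiver<()> {
            let (tx, rx) = oneshot::channel();
            self.inner.lock().unwrap().insert(id, tx);
            rx
        }

        /// Apply path: wake the waiter as soon as the entry with `id` is applied.
        fn notify(&self, id: u64) {
            if let Some(tx) = self.inner.lock().unwrap().remove(&id) {
                let _ = tx.send(());
            }
        }
    }

    #[tokio::main]
    async fn main() {
        let waiters = Waiters::default();
        let rx = waiters.register(42);

        // Pretend the raft worker applies entry 42 shortly afterwards.
        let w = waiters.clone();
        tokio::spawn(async move { w.notify(42) });

        rx.await.expect("apply notification");
        println!("entry 42 applied");
    }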

MrCroxx commented 2 years ago

The duration between propose and append_log_entries may be the main cause of the high latency.

Consider this request:

The request was proposed at approximately 14:25:00.379 but not appended until 14:25:09.171; that gap accounts for almost the entire request latency.


MrCroxx commented 2 years ago

The problem may be the biased strategy used when selecting among the proposal/message/ready futures: as long as there is any proposal or message pending in the channels, ready never gets handled.

        // [`Interval`] with the default [`MissedTickBehavior::Burst`].
        let mut ticker = tokio::time::interval(RAFT_HEARTBEAT_TICK_DURATION);

        loop {
            // Record whether a ready is pending before selecting, so we can detect
            // the case where it gets skipped by the biased select below.
            let mut handled = !self.raft.has_ready().await;

            tokio::select! {
                // With `biased`, branches are polled top-down: tick first, then
                // messages, then proposals, and ready only if none of the above
                // is pending.
                biased;

                _ = ticker.tick() => {
                    self.tick().await;
                }

                Some(msg) = self.message_rx.recv() => {
                    self.step(msg).await?;
                }

                Some(proposal) = self.proposal_rx.recv() => {
                    self.propose(proposal).await?;
                }

                true = self.raft.has_ready() => {
                    self.handle_ready().await?;
                    handled = true;
                }
            }

            if !handled {
                tracing::debug!("has ready but skipped by select");
            }
        }
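
One direction for a fix (a sketch only; `Worker`, the `u64` payloads, and the `ready` flag are stand-ins, not RunKV's actual raft node): handle ready unconditionally after every select iteration instead of letting it compete with the biased message/proposal branches, so a busy channel can no longer starve it.

    use std::time::Duration;
    use tokio::sync::mpsc;

    struct Worker {
        message_rx: mpsc::Receiver<u64>,
        proposal_rx: mpsc::Receiver<u64>,
        ready: bool, // stand-in for `self.raft.has_ready().await`
    }

    impl Worker {
        async fn run(&mut self) {
            let mut ticker = tokio::time::interval(Duration::from_millis(100));
            loop {
                tokio::select! {
                    biased;
                    _ = ticker.tick() => { /* self.tick().await */ }
                    Some(_msg) = self.message_rx.recv() => { self.ready = true; /* self.step(msg).await */ }
                    Some(_proposal) = self.proposal_rx.recv() => { self.ready = true; /* self.propose(proposal).await */ }
                }
                // Ready is drained once per iteration, no matter which branch fired,
                // so pending messages/proposals cannot postpone it indefinitely.
                if self.ready {
                    self.ready = false; // self.handle_ready().await
                }
            }
        }
    }

    #[tokio::main]
    async fn main() {
        let (_msg_tx, message_rx) = mpsc::channel(16);
        let (_prop_tx, proposal_rx) = mpsc::channel(16);
        let mut worker = Worker { message_rx, proposal_rx, ready: false };
        // Spin the loop briefly; in the real node this runs until shutdown.
        let _ = tokio::time::timeout(Duration::from_millis(300), worker.run()).await;
    }

A lighter-weight alternative is simply dropping `biased;`, so tokio polls the branches in random order and ready cannot be deterministically starved, though the explicit drain above makes the ordering easier to reason about.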
MrCroxx commented 2 years ago

The latencies of the first several requests don't look that bad.

[2022-05-03T12:09:00Z INFO  runkv_wheel::service] txn inner takes: 214.048858ms
[2022-05-03T12:09:00Z INFO  runkv_wheel::service] txn inner takes: 214.313888ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 496.622797ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 496.767884ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 496.874619ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 496.962605ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 497.024285ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 497.095174ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 497.2154ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 497.304463ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 726.521743ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 726.764192ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 726.911805ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 727.030536ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 722.18851ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 722.245253ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 722.355362ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 722.433439ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 722.474536ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 722.509246ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 722.512452ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 722.52525ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 722.675ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 716.390985ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 716.392ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 716.459951ms
[2022-05-03T12:09:01Z INFO  runkv_wheel::service] txn inner takes: 716.573696ms

And then the latencies increased to:

[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.778585874s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.776506203s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.77658502s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.775064749s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.769043578s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.766367544s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.765441751s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.761491207s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.753628482s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.753706547s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.751520124s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.748699473s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.748728649s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.745165191s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.745235667s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.742317884s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.737841398s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.734233232s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.731525069s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.731588724s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.72597238s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.724493535s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.724580558s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.722938509s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.71310062s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.708483158s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.708503405s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.706522122s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.616059401s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.577987709s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.348751476s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.344788489s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.324087938s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.324066748s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.306472587s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.302453369s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.289047649s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.289096015s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.287332311s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.283898354s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.506929926s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.482430406s
[2022-05-03T12:09:27Z INFO  runkv_wheel::service] txn inner takes: 3.37454705s
MrCroxx commented 2 years ago

As noted in #131, append_log_entries may be the biggest contributor to the high latency.

In theory, the raft latency is: lat = leader I/O + network RTT + follower I/O + network RTT + fsm apply. The metrics show that the service latency basically matches this formula, and the append_log_entries latency is the largest term in it.
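
As a rough sanity check on that formula (illustrative numbers only: the fsync cost is taken from the fio results below, the RTT and apply costs are assumptions, and none of them are measured RunKV figures):

    // Plug rough component costs into
    // lat = leader I/O + network RTT + follower I/O + network RTT + fsm apply.
    fn main() {
        let leader_io_ms = 1.33;   // ~avg fsync latency from the fio results below
        let follower_io_ms = 1.33; // assume the follower pays a similar fsync
        let rtt_ms = 0.2;          // assumed LAN round trip
        let fsm_apply_ms = 0.5;    // assumed in-memory apply cost
        let lat_ms = leader_io_ms + rtt_ms + follower_io_ms + rtt_ms + fsm_apply_ms;
        // Prints ~3.56 ms: what the formula gives if append_log_entries costs a
        // single fsync, far below the multi-second latencies observed above,
        // which is consistent with append_log_entries being the outlier.
        println!("theoretical latency ~= {lat_ms:.2} ms");
    }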

MrCroxx commented 2 years ago

Current bench results:

  1. 10 groups, 100 concurrency.
  2. 10 groups, 1 concurrency.
  3. 1 group, 1 concurrency.
MrCroxx commented 2 years ago

On Ubuntu Server 22.04 LTS, i7-10700, KIOXIA KBG40ZNS256G 256 GB NVMe.

fio fsync latency:

  fsync/fdatasync/sync_file_range:
    sync (usec): min=894, max=36601, avg=1330.50, stdev=872.29
    sync percentiles (usec):
     |  1.00th=[  988],  5.00th=[  996], 10.00th=[ 1037], 20.00th=[ 1057],
     | 30.00th=[ 1090], 40.00th=[ 1106], 50.00th=[ 1139], 60.00th=[ 1156],
     | 70.00th=[ 1172], 80.00th=[ 1221], 90.00th=[ 2638], 95.00th=[ 2737],
     | 99.00th=[ 2868], 99.50th=[ 2966], 99.90th=[ 3523], 99.95th=[19792],
     | 99.99th=[35914]
MrCroxx commented 2 years ago

Close.