qinzuoyan opened 8 years ago
what is the original number (local rocksdb)?
@imzhenyu, the original numbers are provided as above.
_Deployment_
_Machine_ c3-hadoop-tst-pegasus-ssd4-st01.bj (24 core * 2.10GHz, 64GB memory, 4 ssd)
_Build_ $ ./run.sh build -t release
_Test (run in rocksdb/ dir)_
$ ./run.sh start_onebox
$ ./dsn.ddlclient config.ini create_app -name test -type rrdb -pc 3 -rc 3
$ ./run.sh bench --app_name test --key_size 16 --value_size 1000 -n 100000
slog_dir = /home/work/pegasus/rocksdb/onebox
data_dirs = /home/work/pegasus/rocksdb/onebox
[rrdb set] result
fillseq_rrdb : 333.439 micros/op 2999 ops/sec; 2.9 MB/s
Microseconds per op:
Count: 100000 Average: 333.4382 StdDev: 70.28
Min: 257.0000 Median: 329.1232 Max: 11683.0000
Percentiles: P50: 329.12 P75: 345.92 P99: 425.10 P99.9: 784.00 P99.99: 3250.00
[rrdb get] result
readrandom_rrdb : 121.927 micros/op 8201 ops/sec; 7.9 MB/s (100000 of 100000 found)
Microseconds per op:
Count: 100000 Average: 121.9269 StdDev: 20.91
Min: 85.0000 Median: 116.2685 Max: 1107.0000
Percentiles: P50: 116.27 P75: 132.59 P99: 191.23 P99.9: 243.52 P99.99: 350.00
slog_dir = /home/work/ssd1/pegasus
data_dirs = /home/work/ssd1/pegasus
[rrdb set] result
fillseq_rrdb : 328.305 micros/op 3045 ops/sec; 3.0 MB/s
Microseconds per op:
Count: 100000 Average: 328.3046 StdDev: 94.50
Min: 259.0000 Median: 326.1517 Max: 18048.0000
Percentiles: P50: 326.15 P75: 342.50 P99: 398.96 P99.9: 695.45 P99.99: 2666.67
[rrdb get] result
readrandom_rrdb : 120.011 micros/op 8332 ops/sec; 8.1 MB/s (100000 of 100000 found)
Microseconds per op:
Count: 100000 Average: 120.0117 StdDev: 20.70
Min: 84.0000 Median: 114.6198 Max: 1138.0000
Percentiles: P50: 114.62 P75: 129.21 P99: 187.43 P99.9: 241.73 P99.99: 620.00
slog_dir = /home/work/ssd1/pegasus
data_dirs = /home/work/ssd2/pegasus,/home/work/ssd3/pegasus,/home/work/ssd4/pegasus
[rrdb set] result
fillseq_rrdb : 327.245 micros/op 3055 ops/sec; 3.0 MB/s
Microseconds per op:
Count: 100000 Average: 327.2444 StdDev: 131.86
Min: 260.0000 Median: 325.3862 Max: 26098.0000
Percentiles: P50: 325.39 P75: 341.53 P99: 398.33 P99.9: 562.86 P99.99: 3000.00
[rrdb get] result
readrandom_rrdb : 123.428 micros/op 8101 ops/sec; 7.9 MB/s (100000 of 100000 found)
Microseconds per op:
Count: 100000 Average: 123.4279 StdDev: 34.85
Min: 81.0000 Median: 116.9138 Max: 8099.0000
Percentiles: P50: 116.91 P75: 134.37 P99: 195.20 P99.9: 248.49 P99.99: 760.00
Device | Set (P99 in us) | Get (P99 in us) |
---|---|---|
same hdd | 425.10 | 191.23 |
same ssd | 398.96 | 187.43 |
diff ssd | 398.33 | 195.20 |
It seems that we do not gain better performance from SSD. But I think the reason is that the QPS is too low (IO is not the bottleneck), and the data scale is also too small (the data may be served from memory for reads). We'd better increase the QPS and the data scale for another test.
_Deployment_
slog_dir = /home/work/ssd1/pegasus
data_dirs = /home/work/ssd2/pegasus, /home/work/ssd3/pegasus, /home/work/ssd4/pegasus, /home/work/ssd5/pegasus, /home/work/ssd6/pegasus, /home/work/ssd7/pegasus, /home/work/ssd8/pegasus
_Machine_ Basic:
Network:
SSD I/O:
_Build_ $ ./run.sh build -t release
_Result_ [rrdb set]
./run.sh bench --app_name perftest --key_size 18 --value_size 1000 --thread_num 200 -t fillseq_rrdb -n 100000
fillseq_rrdb : 29.120 micros/op 34340 ops/sec; 33.3 MB/s
Microseconds per op:
Count: 20000000 Average: 5807.4305 StdDev: 5330.17
Min: 478.0000 Median: 4871.1415 Max: 558289.0000
Percentiles: P50: 4871.14 P75: 6998.56 P99: 21620.03 P99.9: 75095.64 P99.99: 116688.93
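As a sanity check (not part of the original report; numbers copied from the output above), the headline throughput and the latency histogram are consistent with each other: in a closed-loop benchmark with 200 client threads, throughput should be roughly threads / average latency, and the headline "29.120 micros/op" is wall-clock time per op aggregated across all threads:

```python
# Sanity check on the fillseq_rrdb numbers reported above.
threads = 200
avg_latency_us = 5807.4305       # "Average" from the histogram
reported_ops_per_sec = 34340     # headline ops/sec

# Closed-loop model: each thread completes one op per avg_latency.
expected_ops_per_sec = threads / (avg_latency_us / 1_000_000)
print(f"expected: {expected_ops_per_sec:.0f} ops/sec")   # ~34439, close to 34340

# The headline micros/op is aggregate wall-clock time per op, not per-thread latency.
wall_clock_us_per_op = 1_000_000 / reported_ops_per_sec
print(f"wall-clock per op: {wall_clock_us_per_op:.3f} us")  # ~29.121
```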
CPU Usage (typical value on primary-replica-server):
$ top
top - 13:03:14 up 12 days, 13:56, 1 user, load average: 16.19, 6.02, 2.26
Tasks: 379 total, 1 running, 378 sleeping, 0 stopped, 0 zombie
%Cpu0 : 30.0 us, 17.9 sy, 0.0 ni, 25.3 id, 0.4 wa, 0.0 hi, 26.5 si, 0.0 st
%Cpu1 : 26.9 us, 20.3 sy, 0.0 ni, 39.9 id, 10.8 wa, 0.0 hi, 2.1 si, 0.0 st
%Cpu2 : 34.5 us, 16.6 sy, 0.0 ni, 48.3 id, 0.7 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 28.3 us, 18.9 sy, 0.0 ni, 40.6 id, 10.8 wa, 0.0 hi, 1.4 si, 0.0 st
%Cpu4 : 33.8 us, 16.9 sy, 0.0 ni, 48.6 id, 0.7 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 28.7 us, 18.2 sy, 0.0 ni, 40.6 id, 10.8 wa, 0.0 hi, 1.7 si, 0.0 st
%Cpu6 : 34.0 us, 16.2 sy, 0.0 ni, 49.1 id, 0.7 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 26.9 us, 18.7 sy, 0.0 ni, 41.7 id, 11.0 wa, 0.0 hi, 1.8 si, 0.0 st
%Cpu8 : 33.8 us, 15.2 sy, 0.0 ni, 49.7 id, 1.0 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu9 : 29.4 us, 18.3 sy, 0.0 ni, 39.4 id, 11.1 wa, 0.0 hi, 1.7 si, 0.0 st
%Cpu10 : 33.4 us, 15.7 sy, 0.0 ni, 49.5 id, 1.4 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu11 : 28.4 us, 18.2 sy, 0.0 ni, 40.0 id, 11.6 wa, 0.0 hi, 1.8 si, 0.0 st
%Cpu12 : 27.6 us, 13.8 sy, 0.0 ni, 57.9 id, 0.7 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu13 : 25.2 us, 13.1 sy, 0.0 ni, 55.9 id, 5.2 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu14 : 28.5 us, 14.1 sy, 0.0 ni, 57.0 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu15 : 24.2 us, 14.5 sy, 0.0 ni, 55.0 id, 5.9 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu16 : 28.4 us, 13.5 sy, 0.0 ni, 57.8 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu17 : 26.9 us, 12.9 sy, 0.0 ni, 54.4 id, 5.1 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu18 : 29.1 us, 13.0 sy, 0.0 ni, 57.5 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu19 : 24.7 us, 13.2 sy, 0.0 ni, 56.2 id, 5.2 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu20 : 27.8 us, 13.4 sy, 0.0 ni, 58.4 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu21 : 25.1 us, 13.9 sy, 0.0 ni, 54.6 id, 5.8 wa, 0.0 hi, 0.7 si, 0.0 st
%Cpu22 : 28.1 us, 13.6 sy, 0.0 ni, 57.6 id, 0.3 wa, 0.0 hi, 0.3 si, 0.0 st
%Cpu23 : 26.0 us, 12.1 sy, 0.0 ni, 56.4 id, 4.8 wa, 0.0 hi, 0.7 si, 0.0 st
KiB Mem: 65760540 total, 28245740 used, 37514800 free, 1553888 buffers
KiB Swap: 12582908 total, 0 used, 12582908 free. 20458832 cached Mem
Because set() runs in THREAD_POOL_REPLICATION and we set the work_count of THREAD_POOL_REPLICATION to 23, the CPU usage is spread roughly uniformly across cores.
Network Usage (typical value on primary-replica-server):
$ sar -n DEV 1 100
Linux 3.10.0-123.el7.x86_64 (c3-hadoop-tst-pegasus-ssd8-st06.bj) 01/30/2016 _x86_64_ (24 CPU)
01:04:37 PM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
01:04:38 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:04:38 PM eth1 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:04:38 PM eth2 59696.00 97563.00 64540.71 113804.75 0.00 0.00 0.00
01:04:38 PM eth3 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:04:38 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Because a primary replica receives each request from the client and then sends two prepare requests to the secondaries, the sending bandwidth is about twice the receiving bandwidth.
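A quick check of that ratio (not part of the original report; kB/s values copied from the eth2 row of the sar output above) — the measured ratio is a bit below 2x, plausibly because of response and acknowledgement traffic on top of the prepare messages:

```python
# tx/rx ratio on eth2 during the set benchmark (from the sar output above).
rx_kbps = 64540.71
tx_kbps = 113804.75
print(f"tx/rx = {tx_kbps / rx_kbps:.2f}")  # ~1.76, roughly the expected 2x
```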
SSD Usage (typical value on primary-replica-server):
$ iostat -x 1 -m
Linux 3.10.0-123.el7.x86_64 (c3-hadoop-tst-pegasus-ssd8-st06.bj) 01/30/2016 _x86_64_ (24 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
30.05 0.00 18.42 3.89 0.00 47.64
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdh 0.00 4923.00 0.00 3725.00 0.00 39.53 21.73 0.22 0.06 0.00 0.06 0.06 21.40
sdi 0.00 4773.00 0.00 3604.00 0.00 38.27 21.75 0.24 0.07 0.00 0.07 0.06 23.20
sdk 0.00 5944.00 0.00 4499.00 0.00 47.88 21.80 0.29 0.06 0.00 0.06 0.06 27.20
sdl 0.00 5906.00 0.00 4471.00 0.00 47.55 21.78 0.27 0.06 0.00 0.06 0.06 26.70
sdm 0.00 5956.00 0.00 4523.00 0.00 48.02 21.74 0.27 0.06 0.00 0.06 0.06 25.20
sdg 0.00 5988.00 0.00 4494.00 0.00 48.05 21.90 0.29 0.06 0.00 0.06 0.06 27.30
sdj 0.00 4711.00 0.00 3562.00 0.00 37.85 21.76 0.23 0.07 0.00 0.07 0.06 22.10
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
[rrdb get]
./run.sh bench --app_name perftest --key_size 18 --value_size 1000 --thread_num 200 -t readrandom_rrdb -n 100000
readrandom_rrdb : 6.873 micros/op 145488 ops/sec; 141.2 MB/s (100000 of 100000 found)
Microseconds per op:
Count: 20000000 Average: 1372.2929 StdDev: 569.42
Min: 126.0000 Median: 1307.0105 Max: 58490.0000
Percentiles: P50: 1307.01 P75: 1458.11 P99: 3237.55 P99.9: 6266.89 P99.99: 13596.15
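The same closed-loop sanity check applies to the read numbers (not part of the original report; values copied from the histogram above):

```python
# Sanity check on the readrandom_rrdb numbers reported above.
threads = 200
avg_latency_us = 1372.2929       # "Average" from the histogram

# Closed-loop model: throughput ~ threads / average latency.
expected_ops_per_sec = threads / (avg_latency_us / 1_000_000)
print(f"expected: {expected_ops_per_sec:.0f} ops/sec")   # ~145742, close to the reported 145488
```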
CPU Usage (typical value on primary-replica-server):
$ top
top - 13:21:36 up 12 days, 14:14, 1 user, load average: 5.55, 12.18, 10.54
Tasks: 380 total, 1 running, 379 sleeping, 0 stopped, 0 zombie
%Cpu0 : 28.8 us, 15.6 sy, 0.0 ni, 52.4 id, 0.0 wa, 0.0 hi, 3.1 si, 0.0 st
%Cpu1 : 52.0 us, 7.4 sy, 0.0 ni, 40.3 id, 0.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu2 : 11.2 us, 12.6 sy, 0.0 ni, 76.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 51.4 us, 8.4 sy, 0.0 ni, 40.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 10.2 us, 10.2 sy, 0.0 ni, 79.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 9.8 us, 9.1 sy, 0.0 ni, 81.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu8 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu9 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu10 : 2.3 us, 0.7 sy, 0.0 ni, 97.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu11 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu12 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu13 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu14 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu15 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu16 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu17 : 0.7 us, 0.0 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu18 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu19 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu20 : 52.2 us, 7.4 sy, 0.0 ni, 40.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu21 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu22 : 48.5 us, 6.7 sy, 0.0 ni, 44.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu23 : 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 65760540 total, 19556020 used, 46204520 free, 1556540 buffers
KiB Swap: 12582908 total, 0 used, 12582908 free. 11450888 cached Mem
Because get() runs in THREAD_POOL_LOCAL_APP and we set the work_count of THREAD_POOL_LOCAL_APP to 4, we can see 4 CPUs under high load.
Network Usage (typical value on primary-replica-server):
$ sar -n DEV 1 100
Linux 3.10.0-123.el7.x86_64 (c3-hadoop-tst-pegasus-ssd8-st06.bj) 01/30/2016 _x86_64_ (24 CPU)
01:32:09 PM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
01:32:10 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:32:10 PM eth1 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:32:10 PM eth2 15211.00 116880.00 26055.45 167684.33 0.00 0.00 0.00
01:32:10 PM eth3 0.00 0.00 0.00 0.00 0.00 0.00 0.00
01:32:10 PM lo 0.00 0.00 0.00 0.00 0.00 0.00 0.00
SSD Usage (typical value on primary-replica-server):
avg-cpu: %user %nice %system %iowait %steal %idle
10.83 0.00 3.33 0.00 0.00 85.84
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdc 0.00 1.00 0.00 2.00 0.00 0.01 12.00 0.00 0.00 0.00 0.00 0.00 0.00
sde 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdh 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdi 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdk 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdl 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdm 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdg 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdj 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
There are no reads from SSD; I think the reason is cache hits, which also explains why the QPS is so high.
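A rough working-set estimate supports the cache-hit explanation (my own back-of-the-envelope calculation, not from the original report; it ignores rocksdb storage overhead and compression). The histogram count above shows 20M ops were written, with 18-byte keys and 1000-byte values, and top reports about 64 GB of RAM:

```python
# Rough estimate: does the dataset fit in RAM / page cache?
rows = 20_000_000                # histogram "Count" from the fillseq run above
bytes_per_row = 18 + 1000        # --key_size 18 --value_size 1000
dataset_gib = rows * bytes_per_row / 2**30

ram_gib = 65760540 * 1024 / 2**30   # "KiB Mem: 65760540 total" from top above
print(f"dataset ~ {dataset_gib:.1f} GiB, RAM ~ {ram_gib:.1f} GiB")  # ~19.0 vs ~62.7
```

Since the whole dataset (~19 GiB) comfortably fits in ~63 GiB of RAM, random reads can be served entirely from memory.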
We have adapted rocksdb's db_bench to rrdb_bench (refer to https://github.com/imzhenyu/rocksdb/pull/8), and ran some tests on our servers.
server: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz * 24, 128G memory, with hard disk
network: GbE, ping 0.03~0.05ms
rocksdb test
[rocksdb set] command
critical config: disable WAL, disable compression, sync=false
[rocksdb set] result
[rocksdb get] command
[rocksdb get] result
rrdb test
table: 1 table, 1 partition, 3 replicas
deploy: 3 replica-servers in total (1 primary and 2 secondaries), deployed on 3 servers
client: 2 rrdb_bench clients run the "fillseq_rrdb" benchmark concurrently, while at the same time 1 rrdb_bench client runs the "readrandom_rrdb" benchmark
key-size: 64 bytes
value-size: 1000 bytes
server-config: release mode (-O2), tool=nativerun, logging_level=DEBUG
[rrdb set] result
[rrdb get] result