dragonflydb / dragonfly

A modern replacement for Redis and Memcached
https://www.dragonflydb.io/
Other
24.76k stars 892 forks source link

The CPU consumption of dragonfly seems to be high without any opetation. #139

Closed Ansore closed 2 years ago

Ansore commented 2 years ago

The CPU consumption of dragonfly seems to be high without any opetation.

operater system: archlinux kernel version: 5.18.3-arch1-1 build:

./helio/blaze.sh
cd build-dbg
ninja dragonfly
./dragonfly --alsologtostderr

htop: Screenshot_20220612_154641

top of dragonfly process:

top - 16:39:04 up  7:48,  1 user,  load average: 1.79, 1.76, 2.10
Threads:  13 total,   0 running,  13 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.7 us,  0.8 sy,  0.0 ni, 96.0 id,  0.0 wa,  0.4 hi,  0.1 si,  0.0 st
MiB Mem :  31769.7 total,  10458.6 free,   9525.9 used,  11785.2 buff/cache
MiB Swap:   4096.0 total,   4096.0 free,      0.0 used.  17303.0 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                
 101470 ansore    20   0 1376820  47540  25124 S   3.0   0.1   1:16.19 Proactor0                                                              
 101473 ansore    20   0 1376820  47540  25124 S   3.0   0.1   1:12.42 Proactor3                                                              
 101474 ansore    20   0 1376820  47540  25124 S   3.0   0.1   1:12.61 Proactor4                                                              
 101476 ansore    20   0 1376820  47540  25124 S   3.0   0.1   1:15.83 Proactor6                                                              
 101478 ansore    20   0 1376820  47540  25124 S   3.0   0.1   1:13.12 Proactor8                                                              
 101471 ansore    20   0 1376820  47540  25124 S   2.7   0.1   1:21.63 Proactor1                                                              
 101472 ansore    20   0 1376820  47540  25124 S   2.7   0.1   1:12.25 Proactor2                                                              
 101475 ansore    20   0 1376820  47540  25124 S   2.7   0.1   1:12.32 Proactor5                                                              
 101477 ansore    20   0 1376820  47540  25124 S   2.7   0.1   1:16.11 Proactor7                                                              
 101479 ansore    20   0 1376820  47540  25124 S   2.7   0.1   1:12.71 Proactor9                                                              
 101480 ansore    20   0 1376820  47540  25124 S   2.3   0.1   1:12.95 Proactor10                                                             
 101469 ansore    20   0 1376820  47540  25124 S   0.0   0.1   0:00.00 dragonfly                                                              
 101481 ansore    20   0 1376820  47540  25124 S   0.0   0.1   0:00.00 Proactor11

the dragonfly process's backtrace:

(gdb) info thread
  Id   Target Id                                       Frame 
* 1    Thread 0x7f5dc4aff480 (LWP 101469) "dragonfly"  0x00007f5dc3889119 in ?? () from /usr/lib/libc.so.6
  2    Thread 0x7f5dc4681640 (LWP 101470) "Proactor0"  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
  3    Thread 0x7f5dc43ff640 (LWP 101471) "Proactor1"  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
  4    Thread 0x7f5dc43ae640 (LWP 101472) "Proactor2"  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
  5    Thread 0x7f5dc4339640 (LWP 101473) "Proactor3"  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
  6    Thread 0x7f5dc42b6640 (LWP 101474) "Proactor4"  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
  7    Thread 0x7f5dc3be5640 (LWP 101475) "Proactor5"  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
  8    Thread 0x7f5dc3b62640 (LWP 101476) "Proactor6"  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
  9    Thread 0x7f5dc3acc640 (LWP 101477) "Proactor7"  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
  10   Thread 0x7f5dc3a50640 (LWP 101478) "Proactor8"  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
  11   Thread 0x7f5dc37c4640 (LWP 101479) "Proactor9"  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
  12   Thread 0x7f5dc3762640 (LWP 101480) "Proactor10" 0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
  13   Thread 0x7f5dc3707640 (LWP 101481) "Proactor11" 0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
(gdb) thread 2
[Switching to thread 2 (Thread 0x7f5dc4681640 (LWP 101470))]
#0  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007f5dc390b30d in syscall () from /usr/lib/libc.so.6
#1  0x000055d9749f01de in __sys_io_uring_enter2 (fd=<optimized out>, to_submit=<optimized out>, min_complete=<optimized out>, 
    flags=<optimized out>, sig=<optimized out>, sz=<optimized out>) at syscall.c:64
#2  0x000055d9749efd32 in _io_uring_get_cqe (ring=ring@entry=0x55d976a79338, cqe_ptr=cqe_ptr@entry=0x7f5dc4678428, 
    data=data@entry=0x7f5dc46783b0) at queue.c:122
#3  0x000055d9749effe0 in __io_uring_get_cqe (sigmask=0x0, wait_nr=1, submit=<optimized out>, cqe_ptr=0x7f5dc4678428, ring=0x55d976a79338)
    at queue.c:150
#4  io_uring_wait_cqes (ring=0x55d976a79338, cqe_ptr=0x7f5dc4678428, wait_nr=1, ts=<optimized out>, sigmask=0x0) at queue.c:308
#5  0x000055d9749bc558 in util::uring::(anonymous namespace)::wait_for_cqe (ring=0x55d976a79338, wait_nr=1, sig=0x0)
    at /home/ansore/CLionProjects/dragonfly/helio/util/uring/proactor.cc:57
#6  0x000055d9749c46e1 in util::uring::Proactor::Run (this=0x55d976a79160)
    at /home/ansore/CLionProjects/dragonfly/helio/util/uring/proactor.cc:354
#7  0x000055d974c67908 in operator() (__closure=0x55d976a778b0) at /home/ansore/CLionProjects/dragonfly/helio/util/proactor_pool.cc:160
#8  0x000055d974c6af53 in std::__invoke_impl<void, util::ProactorPool::SetupProactors()::<lambda()>&>(std::__invoke_other, struct {...} &) (
    __f=...) at /usr/include/c++/12.1.0/bits/invoke.h:61
#9  0x000055d974c6aa29 in std::__invoke_r<void, util::ProactorPool::SetupProactors()::<lambda()>&>(struct {...} &) (__fn=...)
    at /usr/include/c++/12.1.0/bits/invoke.h:111
#10 0x000055d974c6a2dd in std::_Function_handler<void(), util::ProactorPool::SetupProactors()::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/12.1.0/bits/std_function.h:290
#11 0x000055d974953f77 in std::function<void ()>::operator()() const (this=0x55d976a778b0) at /usr/include/c++/12.1.0/bits/std_function.h:591
#12 0x000055d974cdc3ad in base::start_cpp_function (arg=0x55d976a778b0)
    at /home/ansore/CLionProjects/dragonfly/helio/base/pthread_utils.cc:14
#13 0x00007f5dc388c54d in ?? () from /usr/lib/libc.so.6
#14 0x00007f5dc3911874 in clone () from /usr/lib/libc.so.6
(gdb) f 5
#5  0x000055d9749bc558 in util::uring::(anonymous namespace)::wait_for_cqe (ring=0x55d976a79338, wait_nr=1, sig=0x0)
    at /home/ansore/CLionProjects/dragonfly/helio/util/uring/proactor.cc:57
57        int res = io_uring_wait_cqes(ring, &cqe_ptr, wait_nr, NULL, sig);

Is this cased by Proactor idling that loads to execessive CPU usage?

romange commented 2 years ago

can you please paste here your /proc/cpuinfo? what happens if you run it with --hz=10? does CPU usage get lower?

Ansore commented 2 years ago

thank you for your reply.

here is cat /proc/cpuinfo:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 165
model name      : Intel(R) Core(TM) i5-10400 CPU @ 2.90GHz
stepping        : 3
microcode       : 0xf0
cpu MHz         : 4300.000
cache size      : 12288 KB
physical id     : 0
siblings        : 12
core id         : 0
cpu cores       : 6
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts md_clear flush_l1d arch_capabilities
vmx flags       : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple shadow_vmcs pml ept_mode_based_exec
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs itlb_multihit
bogomips        : 5802.42
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 165
model name      : Intel(R) Core(TM) i5-10400 CPU @ 2.90GHz
stepping        : 3
microcode       : 0xf0
cpu MHz         : 4300.000
cache size      : 12288 KB
physical id     : 0
siblings        : 12
core id         : 1
cpu cores       : 6
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts md_clear flush_l1d arch_capabilities
vmx flags       : vnmi preemption_timer posted_intr invvpid ept_x_only ept_ad ept_1gb flexpriority apicv tsc_offset vtpr mtf vapic ept vpid unrestricted_guest vapic_reg vid ple shadow_vmcs pml ept_mode_based_exec
bugs            : spectre_v1 spectre_v2 spec_store_bypass swapgs itlb_multihit
bogomips        : 5802.42
clflush size    : 64
cache_alignment : 64
address sizes   : 39 bits physical, 48 bits virtual
power management:

# 12 processor

if I run dragonfly with --hz=10, CPU usage wile be negligible.

top - 22:56:02 up 14:05,  1 user,  load average: 1.05, 1.42, 1.49
Threads:  13 total,   0 running,  13 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  0.2 sy,  0.0 ni, 99.1 id,  0.1 wa,  0.0 hi,  0.0 si,  0.0 st
top - 22:58:03 up 14:07,  1 user,  load average: 0.91, 1.21, 1.40
Threads:  13 total,   0 running,  13 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.2 us,  0.2 sy,  0.0 ni, 98.4 id,  0.1 wa,  0.1 hi,  0.0 si,  0.0 st
MiB Mem :  31769.7 total,   4604.4 free,  11940.3 used,  15225.0 buff/cache
MiB Swap:   4096.0 total,   4096.0 free,      0.0 used.  14235.5 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                
 137699 ansore    20   0 1368624  37088  24896 S   0.3   0.1   0:00.15 Proactor2                                                              
 137702 ansore    20   0 1368624  37088  24896 S   0.3   0.1   0:00.15 Proactor5                                                              
 137703 ansore    20   0 1368624  37088  24896 S   0.3   0.1   0:00.16 Proactor6                                                              
 137707 ansore    20   0 1368624  37088  24896 S   0.3   0.1   0:00.15 Proactor10                                                             
 137696 ansore    20   0 1368624  37088  24896 S   0.0   0.1   0:00.01 dragonfly                                                              
 137697 ansore    20   0 1368624  37088  24896 S   0.0   0.1   0:00.16 Proactor0                                                              
 137698 ansore    20   0 1368624  37088  24896 S   0.0   0.1   0:00.16 Proactor1                                                              
 137700 ansore    20   0 1368624  37088  24896 S   0.0   0.1   0:00.14 Proactor3                                                              
 137701 ansore    20   0 1368624  37088  24896 S   0.0   0.1   0:00.14 Proactor4                                                              
 137704 ansore    20   0 1368624  37088  24896 S   0.0   0.1   0:00.15 Proactor7                                                              
 137705 ansore    20   0 1368624  37088  24896 S   0.0   0.1   0:00.15 Proactor8                                                              
 137706 ansore    20   0 1368624  37088  24896 S   0.0   0.1   0:00.15 Proactor9                                                              
 137708 ansore    20   0 1368624  37088  24896 S   0.0   0.1   0:00.00 Proactor11 

So we need to set the appropriate clock frequency?

romange commented 2 years ago

DF runs a periodic task handler, a cron job 1000 times a second on each thread. This cron job is pretty lightweight but its high frequency causes CPU usage to be somewhat high. The fact that you run it in debug mode amplifies the problem as well. It's not important for production workloads. For localdev environments you can pass --hz=10 or --hz=100 to alleviate the problem.

romange commented 2 years ago

Duplicate of https://github.com/dragonflydb/dragonfly/issues/66

Ansore commented 2 years ago

Thank you.