dragonflydb / dragonfly

A modern replacement for Redis and Memcached
https://www.dragonflydb.io/

High Cpu Utilization in Docker #191

Closed jbylund closed 2 years ago

jbylund commented 2 years ago

Describe the bug

Dragonflydb running in Docker (managed by Yacht) has higher than expected CPU utilization even when no clients are using the service. I'm almost sure that I'm not giving the Docker container the correct permissions or something, but I'm not sure what the correct debugging steps are.

To Reproduce

I think just start under docker and wait?

Expected behavior

Close to 0 cpu utilization when there are no clients using service.

Screenshots

Snippet from stracing one of the processes:

io_uring_enter(28, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8) = 0
io_uring_enter(28, 0, 0, 0, NULL, 8)    = 0
io_uring_enter(28, 1, 0, 0, NULL, 8)    = 1
io_uring_enter(28, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8) = 0
io_uring_enter(28, 0, 0, 0, NULL, 8)    = 0
io_uring_enter(28, 1, 0, 0, NULL, 8)    = 1
io_uring_enter(28, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8) = 0
io_uring_enter(28, 0, 0, 0, NULL, 8)    = 0
io_uring_enter(28, 1, 0, 0, NULL, 8)    = 1
io_uring_enter(28, 0, 1, IORING_ENTER_GETEVENTS, NULL, 8) = 0
io_uring_enter(28, 0, 0, 0, NULL, 8)    = 0
io_uring_enter(28, 1, 0, 0, NULL, 8)    = 1


romange commented 2 years ago

can you attach the top screenshot showing high CPU utilization?

romange commented 2 years ago

and report total number of cpus

jbylund commented 2 years ago

6 core / 12 thread i5-10400

Top output:

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
2890639 root      20   0 1655636  11032   6348 S  17.0   0.1   4:15.70 /usr/local/bin/dragonfly --logtostderr  --dbnum

It's not that 17% of one core is super high, it's just always at that level even if not doing anything.

romange commented 2 years ago

It's expected. Dragonfly runs periodic timers that update its expiry data structures 1000 times a second on each thread. This is controlled by the hz flag, which defaults to 1000. If you want ms precision for expiry, you should not change the value; otherwise you can reduce it to 100.
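To make the cost of the default concrete, here is a rough sketch of how many timer wakeups per second the hz setting implies. The function names are hypothetical, and the thread count of 12 is an assumption (one Dragonfly thread per hardware thread on the i5-10400 mentioned above); the real thread count may differ.

```python
def timer_wakeups_per_second(hz: int, threads: int) -> int:
    """Total periodic timer firings per second across all threads."""
    return hz * threads


def tick_interval_ms(hz: int) -> float:
    """Time between two timer firings on a single thread, in milliseconds."""
    return 1000.0 / hz


if __name__ == "__main__":
    threads = 12  # assumption: one thread per hardware thread
    for hz in (1000, 100):
        print(
            f"hz={hz}: {timer_wakeups_per_second(hz, threads)} wakeups/s, "
            f"one tick every {tick_interval_ms(hz)} ms per thread"
        )
```

Under that assumption, dropping hz from 1000 to 100 cuts the idle wakeups by a factor of ten, at the price of a 10 ms rather than 1 ms timer tick.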

jbylund commented 2 years ago

Woah, that's a surprising default behavior, but after revising hz down (significantly) I feel ok leaving it running for a while.

romange commented 2 years ago

I agree it's not ideal. However, I would like to understand why it bothered you in the first place. 17% / 6 is about 3% overhead per CPU, and you do not pay additional cloud costs for CPU time. So what bothered you? @jbylund
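The per-CPU figure is simple division and can be checked with a short sketch. The helper name is hypothetical; the 17% comes from the top output above, and the 6 cores and 12 threads from the reported i5-10400.

```python
def per_unit_overhead(total_cpu_percent: float, units: int) -> float:
    """Spread a process's total CPU percentage evenly over a number of cores or threads."""
    return total_cpu_percent / units


total = 17.0  # %CPU reported by top for the dragonfly process

per_core = per_unit_overhead(total, 6)     # 6 physical cores
per_thread = per_unit_overhead(total, 12)  # 12 hardware threads

print(f"per core: {per_core:.2f}%")      # per core: 2.83%
print(f"per thread: {per_thread:.2f}%")  # per thread: 1.42%
```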

jbylund commented 2 years ago

I'm running locally, so there's some chance that increased cpu means higher power draw, or more acoustic noise. But I think the mental noise it caused for me in system monitoring tooling was the biggest factor.