First, let me make sure I understand correctly this issue: you do not experience worse performance (like throughput / latency), but the process seems to be waiting for I/O more than other deployments. Is that correct?
Is WA% always high in this deployment, or is it just during writes to disk? (I see that you're saving RDB every 1 minute).
When you say "Other workloads that use the persistent disk don't show this behaviour" - what are the differences between this deployment and the others? Do they use different disks?
And finally, a few unrelated questions:
* Why do you use `--hz=5`?
* Similarly, why disable Dragonfly's snapshot format (via `--df_snapshot_format=false`)?
* May I ask how you use Dragonfly? With what load, for which purpose, etc.?

Thanks!
Duplicate of #2181. @fe-ax it's a kernel change in how CPU time is attributed in the io_uring API. Unfortunately, there is not much we can do about it, but it does not affect anything. It's completely harmless; it's just that an idle CPU waiting for a networking packet is now attributed as IOWAIT. The io_uring kernel folks decided at some point that it's better to attribute a CPU blocked on any I/O (even networking) as IOWAIT.
I am surprised that it appeared in kernel 5.10, but 5.10 is an LTS kernel version, so maybe they backported the change there. AFAIK, it first appeared in 6.x kernel versions.
I googled the kernel discussions about this again and learned that they decided to revert the decision because it has been confusing to many users. See here: https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/commit/queue-6.4/io_uring-gate-iowait-schedule-on-having-pending-requests.patch?id=2b8c242ac869eae3d96b712fdb9940e9cd1e0d69
The MariaDB/MySQL folks are also complaining about this here: https://bugzilla.kernel.org/show_bug.cgi?id=217699
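For anyone who wants to double-check that this is only an accounting quirk and not real disk pressure, one suggestion (using standard Linux tools, nothing Dragonfly-specific) is to watch the CPU wait column while the server is idle and confirm that the disks themselves are quiet:

    vmstat 1       # "wa" stays high while "bi"/"bo" (blocks in/out) stay near zero
    iostat -x 1    # per-device view; %util should be ~0 while Dragonfly is idle

If wa is high but there is essentially no block I/O, the time is just idle io_uring waits being labeled as IOWAIT.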
working as intended
I ran across this issue in the past few days on my home lab k8s cluster, where I started getting nagging NodeCPUHighUsage alerts from Prometheus.
After hours of triage (because all the other available Linux tools didn't show any high CPU usage), I was able to determine that the alert was reporting CPU time spent in iowait, and I narrowed it down to Dragonfly.
In my case, I'm running a super trivial workload on my home lab, so I temporarily forced Dragonfly to use epoll with --force_epoll. Let me be clear that this works in my case where, as I stated, the workload is trivial, and at least I'm no longer getting the Prometheus alerts.
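For reference, a minimal sketch of that workaround, simply appending the flag to the command line from the issue description below (the other flags are the reporter's and are not required for the workaround):

    dragonfly --logtostderr --maxmemory=4gb --save_schedule=*:* --hz=5 --dbfilename dump.rdb --df_snapshot_format=false --force_epoll

Note that this switches the I/O backend from io_uring to epoll, so it is probably only advisable when the workload is light, as mentioned above.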
Also observing ~100% IOWAIT on Linux 6.5.0.
@crishoj The parent issue on liburing mentions it will be fixed in kernel 6.10. No idea whether the patch will be backported. https://github.com/axboe/liburing/issues/943
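If you want to check whether a given node already runs a kernel with the fix, comparing the running kernel version against 6.10 is a rough first check (rough because, as noted above, such changes are sometimes backported to LTS trees):

    uname -r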
Describe the bug
A high WA% (waiting for I/O) time while nothing is happening on the DB. CPU usage is nearly 0%.
To Reproduce
Steps to reproduce the behavior:
dragonfly --logtostderr --maxmemory=4gb --save_schedule=*:* --hz=5 --dbfilename dump.rdb --df_snapshot_format=false
Expected behavior
Lower WA% when no workload is present.
Screenshots
Environment (please complete the following information):
Linux ip-10-117-39-51.eu-central-1.compute.internal 5.10.198-187.748.amzn2.x86_64 #1 SMP Tue Oct 24 19:49:54 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes
Reproducible Code Snippet
N/A
Additional context