MusicPlayerDaemon / mpdscribble

a MPD client which submits information about tracks being played to a scrobbler (e.g. last.fm)
GNU General Public License v2.0
117 stars 15 forks source link

busy `epoll_wait` 100% CPU usage #60

Closed wookietreiber closed 2 months ago

wookietreiber commented 2 months ago
$ mpdscribble -V | head -1
mpdscribble version 0.25

$ pgrep mpdscribble
222341

$ timeout 5s strace -c -p 222341
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    0.718114           1    363839           epoll_wait
------ ----------- ----------- --------- --------- ----------------
100.00    0.718114           1    363839           total

$ sudo gdb -nh -nx -batch -ex bt -p 222341
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
0x00007002dd1288b2 in epoll_wait () from target:/usr/lib/libc.so.6
#0  0x00007002dd1288b2 in epoll_wait () from target:/usr/lib/libc.so.6
#1  0x000058f25205f1b6 in EpollFD::Wait (this=0x7ffe5913e058, events=0x7ffe5913dea0, maxevents=16, timeout=<optimized out>) at ../src/system/EpollFD.hxx:29
#2  EpollBackend::ReadEvents (this=0x7ffe5913e058, timeout_ms=<optimized out>) at ../src/event/EpollBackend.hxx:41
#3  EventLoop::Wait (this=0x7ffe5913e058, timeout=...) at ../src/event/Loop.cxx:246
#4  EventLoop::Run (this=0x7ffe5913e058) at ../src/event/Loop.cxx:336
#5  Instance::Run (this=0x7ffe5913e050) at ../src/Instance.hxx:34
#6  main (argc=<optimized out>, argv=<optimized out>) at ../src/Main.cxx:188
[Inferior 1 (process 222341) detached]

screenshot-alacritty-2024-07-30-09-37-50

MaxKellermann commented 2 months ago

Can you post a brief strace of mpdscribble while it's busy-looping? (just a few dozen lines is enough, it's repeating all the time anyway) try: strace -p $(pidof mpdscribble) -o /tmp/trace and press Ctrl-C after a second or so. And the output of:

ls -l /proc/`pidof mpdscribble`/fd
cat /proc/`pidof mpdscribble`/fdinfo/3
wookietreiber commented 2 months ago
epoll_wait(3, [{events=EPOLLIN, data={u32=3379228280, u64=110710456249976}}], 16, 44748) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3379228280, u64=110710456249976}}], 16, 44748) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3379228280, u64=110710456249976}}], 16, 44748) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3379228280, u64=110710456249976}}], 16, 44748) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3379228280, u64=110710456249976}}], 16, 44748) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3379228280, u64=110710456249976}}], 16, 44748) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3379228280, u64=110710456249976}}], 16, 44748) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3379228280, u64=110710456249976}}], 16, 44748) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3379228280, u64=110710456249976}}], 16, 44748) = 1
$ ls -l /proc/$(pidof mpdscribble)/fd
lr-x------ - umcdev 30 Jul 11:36 0 -> /dev/null
lrwx------ - umcdev 30 Jul 11:36 1 -> socket:[913675]
lrwx------ - umcdev 30 Jul 11:36 2 -> socket:[913675]
lrwx------ - umcdev 30 Jul 11:36 3 -> anon_inode:[eventpoll]
lr-x------ - umcdev 30 Jul 11:36 4 -> pipe:[910678]
l-wx------ - umcdev 30 Jul 16:26 5 -> pipe:[910678]
lrwx------ - umcdev 30 Jul 16:26 6 -> anon_inode:[signalfd]
lrwx------ - umcdev 30 Jul 16:26 7 -> socket:[910686]
lrwx------ - umcdev 30 Jul 16:26 8 -> socket:[910689]
lrwx------ - umcdev 30 Jul 11:36 9 -> socket:[1017357]
lrwx------ - umcdev 30 Jul 16:26 10 -> socket:[1017359]

$ cat /proc/$(pidof mpdscribble)/fdinfo/3
pos:    0
flags:  02000002
mnt_id: 17
ino:    1080
tfd:        8 events:       19 data:     64b0c96aee78  pos:0 ino:de561 sdev:9
tfd:        7 events:       19 data:     7ffc4493f610  pos:0 ino:de55e sdev:9
tfd:        6 events:       19 data:     64b0be6874a8  pos:0 ino:438 sdev:10
MaxKellermann commented 2 months ago

This is file descriptor 8 (110710456249976 = 0x64b0c96aee78). Can you see with ss what socket this is, where it's connected?

MaxKellermann commented 2 months ago

I wonder what all these sockets are. fd1/2 is the systemd-journald socket; but what is 7, 8, 9, 10? I suppose mpdscribble should have at most 1 socket - the HTTP socket for sending API requests. Strangely, fd9+10 are not registered in epoll.

wookietreiber commented 2 months ago

Had to stop the service because it's eating my battery, so here is stuff again :sweat_smile:

$ strace -p $(pidof mpdscribble) |& head
strace: Process 305737 attached
epoll_wait(3, [{events=EPOLLIN, data={u32=3946381400, u64=107193445187672}}], 16, 13376) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3946381400, u64=107193445187672}}], 16, 13376) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3946381400, u64=107193445187672}}], 16, 13376) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3946381400, u64=107193445187672}}], 16, 13376) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3946381400, u64=107193445187672}}], 16, 13376) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3946381400, u64=107193445187672}}], 16, 13376) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3946381400, u64=107193445187672}}], 16, 13375) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3946381400, u64=107193445187672}}], 16, 13375) = 1
epoll_wait(3, [{events=EPOLLIN, data={u32=3946381400, u64=107193445187672}}], 16, 13375) = 1

$ ls -l /proc/$(pidof mpdscribble)/fd
lr-x------ - umcdev 30 Jul 17:53 0 -> /dev/null
lrwx------ - umcdev 30 Jul 17:53 1 -> socket:[1036884]
lrwx------ - umcdev 30 Jul 17:53 2 -> socket:[1036884]
lrwx------ - umcdev 30 Jul 17:53 3 -> anon_inode:[eventpoll]
lr-x------ - umcdev 30 Jul 17:53 4 -> pipe:[1043526]
l-wx------ - umcdev 30 Jul 18:04 5 -> pipe:[1043526]
lrwx------ - umcdev 30 Jul 18:04 6 -> anon_inode:[signalfd]
lrwx------ - umcdev 30 Jul 18:04 7 -> socket:[1034093]
lrwx------ - umcdev 30 Jul 18:04 8 -> socket:[1034096]
lrwx------ - umcdev 30 Jul 17:53 9 -> socket:[1047983]
lrwx------ - umcdev 30 Jul 18:04 10 -> socket:[1048979]

$ cat /proc/$(pidof mpdscribble)/fdinfo/3
pos:    0
flags:  02000002
mnt_id: 17
ino:    1080
tfd:        6 events:       19 data:     617dd9a604a8  pos:0 ino:438 sdev:10
tfd:        8 events:       19 data:     617deb390058  pos:0 ino:fc770 sdev:9
tfd:        7 events:       19 data:     7ffdb34ccad0  pos:0 ino:fc76d sdev:9

lsof -p $(pidof mpdscribble) | grep -v -e REG -e CHR -e DIR
COMMAND      PID   USER  FD      TYPE             DEVICE SIZE/OFF     NODE NAME
mpdscribb 305737 umcdev   1u     unix 0x0000000048d1f060      0t0  1036884 type=STREAM (CONNECTED)
mpdscribb 305737 umcdev   2u     unix 0x0000000048d1f060      0t0  1036884 type=STREAM (CONNECTED)
mpdscribb 305737 umcdev   3u  a_inode               0,16        0     1080 [eventpoll:6,7,8]
mpdscribb 305737 umcdev   4r     FIFO               0,15      0t0  1043526 pipe
mpdscribb 305737 umcdev   5w     FIFO               0,15      0t0  1043526 pipe
mpdscribb 305737 umcdev   6u  a_inode               0,16        0     1080 [signalfd]
mpdscribb 305737 umcdev   7u     IPv4            1034093      0t0      TCP localhost:45306->localhost:mshvlm (ESTABLISHED)
mpdscribb 305737 umcdev   8u     IPv4            1034096      0t0      TCP aimob.box:51972->189.19.211.130.bc.googleusercontent.com:https (CLOSE_WAIT)
mpdscribb 305737 umcdev   9u     IPv4            1047983      0t0      TCP aimob.box:51592->189.19.211.130.bc.googleusercontent.com:http (ESTABLISHED)
mpdscribb 305737 umcdev  10u     IPv4            1048979      0t0      TCP aimob.box:51594->189.19.211.130.bc.googleusercontent.com:http (ESTABLISHED)

Why is mpdscribble talking to google?

BTW this is my config:

$ grep -v -e '^#' -e '^$' -e '^password' ~/.config/mpdscribble.conf
verbose = 1
[last.fm]
url = https://post.audioscrobbler.com/
username = wookietreiber
journal = /home/umcdev/.cache/mpdscribble/lastfm.journal

One more info: the busy wait stuff doesn't start immediately, only after a few scrobbles.

MaxKellermann commented 2 months ago

Why is mpdscribble talking to google?

That's post.audioscrobbler.com. 392 IN A 130.211.19.189 which is apparently hosted by Google.

MaxKellermann commented 2 months ago

Which libcurl version is that? This looks like the server has closed the TCP connection, and your kernel keeps telling mpdscribble about it - and mpdscribble then tells libcurl, but libcurl doesn't close the socket and does nothing. This could be a libcurl bug.

wookietreiber commented 2 months ago

Which libcurl version is that?

Archlinux curl 8.9.0-1, just updated

MaxKellermann commented 2 months ago

I tried curl 8.9.0 but cannot reproduce your problem. When the server closes a socket, mpdscribble calls curl, curl reads from the socket, finds it was closed by the server, and unregisters the socket from epoll.

Can you find out with a debugger whether mpdscribble calls curl_multi_socket_action in this busy loop? If yes, what are the parameters and what does it return?

MaxKellermann commented 2 months ago

I was just able to reproduce it with curl 8.9.0, after all. This is a curl bug! mpdscribble does call curl_multi_socket_action(); then that function calls multi_socket(), which looks up the Curl_easy object for the socket; one is found, but its data->conn is NULL, therefore data->state.select_bits doesn't get updated. Nothing really gets done. There is nothing mpdscribble can do to fix this, sorry. Report to the curl project.

wookietreiber commented 2 months ago

Might already be fixed in https://github.com/curl/curl/issues/14280 (8.9.1), waiting for Archlinux to bump curl to confirm.

wookietreiber commented 2 months ago

Might already be fixed in curl/curl#14280 (8.9.1), waiting for Archlinux to bump curl to confirm.

Already out in testing, seems to have fixed it.