Closed panlinux closed 5 years ago
(Updated description changing i386 to ppc64el, I had looked at the wrong test before)
Although the process does not seem to be doing anything (stack wise) based on ptrace() info, and the stack does not change at all:
(c)inaddy@ocfs2arm64:~$ sudo cat /proc/3947/stack [<0>] __switch_to+0xb8/0x1a8 [<0>] (null) [<0>] 0xffffffffffffffff
It is consuming 100% USER time CPU. Which means that, if it had switched to scheduler, no user time would be consumed UNLESS there would be changes in the user stack frames. I'm getting kernel debug symbols so I can see exactly where the CPU is stuck at, but this seems like a kernel issue in this first look.
Is it possible that it has something to do with endian format? ocfs2 tools don't handle well in big endian architecture.
inaddy@ocfs2defrag:~$ sudo cat /proc/1452/stack
[<0>] 0x0
inaddy@ocfs2defrag:~$ while true; do sudo cat /proc/1452/stack; done > bleh.txt
inaddy@ocfs2defrag:~$ cat bleh.txt | sort -u
[<0>] 0x0
Stack does not help me at all (not being updated in execution path from the user<-> kernel context switch). Still, process is consuming100% of user time:
%Cpu3 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1452 root 20 0 1936 428 360 R 100.0 0.0 6:27.67 defragfs.ocfs2
Initial check on time spend scheduling to discard kernel issue like I suspected before:
inaddy@ocfs2defrag:~$ sudo perf sched timehist -p 1452
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
111.555973 [0003] defragfs.ocfs2[1452] 0.000 0.000 0.000
112.123934 [0003] defragfs.ocfs2[1452] 0.029 0.000 567.930
113.571854 [0003] defragfs.ocfs2[1452] 0.018 0.000 1447.901
115.139790 [0003] defragfs.ocfs2[1452] 0.027 0.000 1567.908
115.555743 [0003] defragfs.ocfs2[1452] 0.030 0.000 415.921
116.123707 [0003] defragfs.ocfs2[1452] 0.027 0.000 567.936
116.867670 [0003] defragfs.ocfs2[1452] 0.019 0.000 743.943
117.571630 [0003] defragfs.ocfs2[1452] 0.025 0.000 703.934
118.243597 [0003] defragfs.ocfs2[1452] 0.026 0.000 671.940
118.443583 [0003] defragfs.ocfs2[1452] 0.028 0.000 199.957
119.587530 [0003] defragfs.ocfs2[1452] 0.017 0.000 1143.929
120.123500 [0003] defragfs.ocfs2[1452] 0.023 0.000 535.945
121.571458 [0003] defragfs.ocfs2[1452] 0.016 0.000 1447.941
123.587358 [0003] defragfs.ocfs2[1452] 0.027 0.000 2015.871
124.123365 [0003] defragfs.ocfs2[1452] 0.036 0.000 535.970
125.123284 [0003] defragfs.ocfs2[1452] 0.020 0.000 999.899
125.603259 [0003] defragfs.ocfs2[1452] 0.032 0.000 479.942
126.883212 [0003] defragfs.ocfs2[1452] 0.029 0.000 1279.924
127.587179 [0003] defragfs.ocfs2[1452] 0.028 0.000 703.938
128.123153 [0003] defragfs.ocfs2[1452] 0.027 0.000 535.946
It spends almost no time in waiting for CPU AND absolutely no time in scheduling (as its the only task currently really running), so its something in userland indeed.... debugging it:
(gdb)
#0 0x0000ffffbf618b10 in _getopt_internal_r
(argc=3, argv=0xfffffffff868, optstring=0xaaaaaaaacfc0 "gvclh",
longopts=0x0, longind=0x0, long_only=0, d=d@entry=0xffffbf6cad88 <getopt_data>,
posixly_correct=<optimized out>)
at getopt.c:508
#1 0x0000ffffbf618f0c in _getopt_internal
(argc=<optimized out>, argv=<optimized out>, optstring=<optimized out>,
longopts=<optimized out>, longind=<optimized out>, long_only=<optimized out>,
posixly_correct=<optimized out>)
at getopt.c:711
#2 0x0000aaaaaaaab43c in parse_opt
(_mode_flag=<synthetic pointer>, argv=0xfffffffff868, argc=3) at main.c:585
#3 main (argc=3, argv=0xfffffffff868) at main.c:653
after some time:
(gdb) bt
#0 0x0000ffffbf618950 in _getopt_internal_r
(argc=3, argv=0xfffffffff868, optstring=0xaaaaaaaacfc0 "gvclh",
longopts=0x0, longind=0x0, long_only=0, d=d@entry=0xffffbf6cad88 <getopt_data>,
posixly_correct=<optimized out>)
at getopt.c:500
#1 0x0000ffffbf618f0c in _getopt_internal
(argc=<optimized out>, argv=<optimized out>, optstring=<optimized out>,
longopts=<optimized out>, longind=<optimized out>, long_only=<optimized out>,
posixly_correct=<optimized out>)
at getopt.c:711
#2 0x0000aaaaaaaab43c in parse_opt
(_mode_flag=<synthetic pointer>, argv=0xfffffffff868, argc=3) at main.c:585
#3 main (argc=3, argv=0xfffffffff868) at main.c:653
after some more time:
(gdb) bt
#0 0x0000ffffbf618cec in _getopt_internal_r
(argc=3, argv=0xfffffffff868, optstring=0xaaaaaaaacfc0 "gvclh",
longopts=0x0, longind=0x0, long_only=0, d=d@entry=0xffffbf6cad88 <getopt_data>,
posixly_correct=<optimized out>)
at getopt.c:512
#1 0x0000ffffbf618f0c in _getopt_internal
(argc=<optimized out>, argv=<optimized out>, optstring=<optimized out>,
longopts=<optimized out>, longind=<optimized out>, long_only=<optimized out>,
posixly_correct=<optimized out>)
at getopt.c:711
#2 0x0000aaaaaaaab43c in parse_opt
(_mode_flag=<synthetic pointer>, argv=0xfffffffff868, argc=3) at main.c:585
#3 main (argc=3, argv=0xfffffffff868) at main.c:653
Checking glibc getopt.c code (around lines 500 <-> 512) I can see we are stuck in this loop:
if (d->__nextchar == NULL || *d->__nextchar == '\0')
{
/* Advance to the next ARGV-element. */
...
}
And I'll try to discover why tomorrow (and report upstream).
Alright, this was easier than an issue in glibc... arm64 does not handle (char) overflow, when getting the ret value from getopt(). An overflow to (char) num will be seen as 255 in arm64, instead of -1, like the getopt() from main expects, making the getopt() logic to loop forever.
I'll suggest a small patch shortly.
Hi,
with version 1.8.6, a new tool was made available: defragfs.ocfs2. The debian package added a test for it, and ubuntu runs that test as well. Ubuntu runs it on more architectures, though, and while it passed in less than 1s on amd64, it hangs on arm64 and ppc64el, consuming 99% CPU. No visible I/O is happening (checked with iostat 2). and strace shows just the version being printed to stdout (
ls -la
anddate
are my own debugging, and I changed debian's test a bit to copy/usr/bin
instead of/bin
, which is a symlink in Ubuntu):strace:
A few seconds of
iostat 2
:Kernel is
5.2.0-13-generic
This is the debian commit which added the test,and you can get the full test file from there as well: https://salsa.debian.org/ha-team/ocfs2-tools/commit/4db396488fe4c058bbc2e3b81529595aee3855ef