Open yoheiueda opened 5 years ago
You can look at the source to find what is going on :-). For the first one,
waker: 10904
-- --
finish_task_switch
It looks like we do get a non-zero waker PID, but no stack at all. We did not print "[Missed User Stack]" or "[Missed Kernel Stack]", which means the stack id is a valid one. Maybe the kernel somehow did get a stack? Could you print out the stack id in this case?
For your second example, with the missing user stack, the stack id again looks valid. Could you confirm?
Could you check whether the user-space application (i.e., your Java app) has frame pointer support? If it does not, the kernel may have a hard time finding a user stack.
Thank you for your comment.
I noticed that offwaketime shows the target and waker PIDs swapped.
https://github.com/iovisor/bcc/blob/v0.10.0/tools/offwaketime.py#L353
print(" %-16s %s %s" % ("waker:", k.waker.decode('utf-8', 'replace'), k.t_pid))
I think k.t_pid should be k.w_pid.
Likewise, this line should be corrected. https://github.com/iovisor/bcc/blob/v0.10.0/tools/offwaketime.py#L386
So the waker PID in the first example was actually zero. The waker was not recorded for some reason.
My next question is what causes this. What could explain a missing waker?
I think involuntary context switches due to preemption may be one of the reasons. Are involuntary context switches recorded as waker PID=0?
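The PID mix-up described above can be illustrated with a minimal sketch. The `Key` namedtuple here is a hypothetical stand-in for the BPF map key (only the field names `waker`, `target`, `w_pid` and `t_pid` are taken from the tool), and the two helper functions are made up for illustration. The point is only that the waker line should print `k.w_pid` and the target line `k.t_pid`.

```python
from collections import namedtuple

# Hypothetical stand-in for the key read back from the BPF "counts" map.
Key = namedtuple("Key", ["waker", "target", "w_pid", "t_pid"])


def format_waker(k):
    # The waker line pairs the waker's comm with the waker's PID (w_pid).
    return "    %-16s %s %s" % ("waker:", k.waker.decode('utf-8', 'replace'), k.w_pid)


def format_target(k):
    # The target line pairs the target's comm with the target's PID (t_pid).
    return "    %-16s %s %s" % ("target:", k.target.decode('utf-8', 'replace'), k.t_pid)


# Example values only: an unrecorded waker (empty comm, PID 0) and a Java target.
k = Key(waker=b"", target=b"java", w_pid=0, t_pid=10904)
print(format_waker(k))
print(format_target(k))
```

With the fields swapped, as in v0.10.0, the waker line would show 10904 even though no waker was recorded.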
Thanks for the fix. For missing waker stack, I do have something like
waker: 0
-- --
finish_task_switch
__sched_text_start
schedule_idle
do_idle
cpu_startup_entry
start_secondary
secondary_startup_64
target: swapper/12 0
36893488147420673
For non-idle threads, it looks like I am able to get the waker information.
The waker info is gathered by a kprobe on try_to_wake_up, which is a static function.
Could you check whether try_to_wake_up is inlined in some places in your environment?
"I think involuntary context switches due to preemption may be one of the reasons. Are involuntary context switches recorded as waker PID=0?"
Why do you think an involuntary context switch could be one of the reasons? Any reference? pid=0 is reserved for the kernel scheduler (idle) thread. It can be the waker, e.g.
waker: swapper/49 0
secondary_startup_64
start_secondary
cpu_startup_entry
do_idle
cpuidle_enter
cpuidle_enter_state
apic_timer_interrupt
smp_apic_timer_interrupt
hrtimer_interrupt
__hrtimer_run_queues
hrtimer_wakeup
-- --
finish_task_switch
__sched_text_start
schedule
schedule_hrtimeout_range_clock
poll_schedule_timeout.constprop.14
do_select
core_sys_select
kern_select
__x64_sys_select
do_syscall_64
entry_SYSCALL_64_after_hwframe
[Missed User Stack]
target: offwaketime.py 937576
@yonghong-song Thank you very much for clarifying the meaning of PID=0.
Could you please tell me how to check whether try_to_wake_up is inlined?
Should I rebuild the kernel to disable inlining of try_to_wake_up by adding the noinline attribute?
You need to look at assembly code. Find where try_to_wake_up is called in the assembly code. Compare its call site to the place in the source code. And then you can figure out whether it is inlined or not in a particular call site.
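Before disassembling, a quick first look at the symbol table can help. The sketch below is a rough heuristic, not a replacement for reading the assembly: it scans lines in `/proc/kallsyms` format for `try_to_wake_up` and any compiler-generated clones (suffixes such as `.constprop.N` or `.isra.N`). The helper name `find_symbol_variants` and the sample addresses are made up for illustration. An out-of-line symbol being present does not rule out inlining at individual call sites.

```python
import re


def find_symbol_variants(kallsyms_lines, name="try_to_wake_up"):
    """Return symbols equal to `name` or of the form `name.<suffix>`."""
    pattern = re.compile(r"^%s(\..+)?$" % re.escape(name))
    found = []
    for line in kallsyms_lines:
        fields = line.split()
        # kallsyms lines look like: "<address> <type> <symbol> [module]"
        if len(fields) >= 3 and pattern.match(fields[2]):
            found.append(fields[2])
    return found


# Made-up sample lines in /proc/kallsyms format.
sample = [
    "ffffffff810b1230 t try_to_wake_up",
    "ffffffff810b1500 t try_to_wake_up.constprop.0",
    "ffffffff810b2000 T wake_up_process",
]
print(find_symbol_variants(sample))
```

On a live system, the same function can be fed an open file object for `/proc/kallsyms`. An empty result would suggest that no out-of-line copy exists for a kprobe to attach to.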
I rebuilt the kernel with inlining of try_to_wake_up disabled as follows.
static noinline int
try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
However, the situation did not change.
Is there any other possible cause of an empty kernel stack for PID=0?
Stack id 0 is valid in the kernel, but is ignored in offwaketime.py. Could you check whether the following patch solves your problem?
diff --git a/tools/offwaketime.py b/tools/offwaketime.py
index b46e9e1b..6a8eb6c8 100755
--- a/tools/offwaketime.py
+++ b/tools/offwaketime.py
@@ -210,6 +210,10 @@ int oncpu(struct pt_regs *ctx, struct task_struct *p) {
key.w_tgid = woke->w_tgid;
__builtin_memcpy(&key.waker, woke->name, TASK_COMM_LEN);
wokeby.delete(&pid);
+ } else {
+ /* waker stack not available, stack id 0 is a valid stack id in the kernel */
+ key.w_k_stack_id = -EFAULT;
+ key.w_u_stack_id = -EFAULT;
}
counts.increment(key, delta);
@@ -307,13 +311,13 @@ for k, v in sorted(counts.items(), key=lambda counts: counts[1].value):
has_enomem = has_enomem or (k.w_u_stack_id == -errno.ENOMEM) or \
(k.t_u_stack_id == -errno.ENOMEM)
- waker_user_stack = [] if k.w_u_stack_id < 1 else \
+ waker_user_stack = [] if k.w_u_stack_id < 0 else \
reversed(list(stack_traces.walk(k.w_u_stack_id))[1:])
- waker_kernel_stack = [] if k.w_k_stack_id < 1 else \
+ waker_kernel_stack = [] if k.w_k_stack_id < 0 else \
reversed(list(stack_traces.walk(k.w_k_stack_id))[1:])
- target_user_stack = [] if k.t_u_stack_id < 1 else \
+ target_user_stack = [] if k.t_u_stack_id < 0 else \
stack_traces.walk(k.t_u_stack_id)
- target_kernel_stack = [] if k.t_k_stack_id < 1 else \
+ target_kernel_stack = [] if k.t_k_stack_id < 0 else \
stack_traces.walk(k.t_k_stack_id)
if folded:
@@ -326,7 +330,7 @@ for k, v in sorted(counts.items(), key=lambda counts: counts[1].value):
line.extend([b.sym(addr, k.t_tgid).decode('utf-8', 'replace')
for addr in reversed(list(target_user_stack)[1:])])
if not args.user_stacks_only:
- line.extend(["-"] if (need_delimiter and k.t_k_stack_id > 0 and k.t_u_stack_id > 0) else [])
+ line.extend(["-"] if (need_delimiter and k.t_k_stack_id >= 0 and k.t_u_stack_id >= 0) else [])
if stack_id_err(k.t_k_stack_id):
line.append("[Missed Kernel Stack]")
else:
@@ -340,7 +344,7 @@ for k, v in sorted(counts.items(), key=lambda counts: counts[1].value):
line.extend([b.ksym(addr).decode('utf-8', 'replace')
for addr in reversed(list(waker_kernel_stack))])
if not args.kernel_stacks_only:
- line.extend(["-"] if (need_delimiter and k.w_u_stack_id > 0 and k.w_k_stack_id > 0) else [])
+ line.extend(["-"] if (need_delimiter and k.w_u_stack_id >= 0 and k.w_k_stack_id >= 0) else [])
if stack_id_err(k.w_u_stack_id):
line.append("[Missed User Stack]")
else:
@@ -358,7 +362,7 @@ for k, v in sorted(counts.items(), key=lambda counts: counts[1].value):
for addr in waker_user_stack:
print(" %s" % b.sym(addr, k.w_tgid))
if not args.user_stacks_only:
- if need_delimiter and k.w_u_stack_id > 0 and k.w_k_stack_id > 0:
+ if need_delimiter and k.w_u_stack_id >= 0 and k.w_k_stack_id >= 0:
print(" -")
if stack_id_err(k.w_k_stack_id):
print(" [Missed Kernel Stack]")
@@ -376,7 +380,7 @@ for k, v in sorted(counts.items(), key=lambda counts: counts[1].value):
for addr in target_kernel_stack:
print(" %s" % b.ksym(addr))
if not args.kernel_stacks_only:
- if need_delimiter and k.t_u_stack_id > 0 and k.t_k_stack_id > 0:
+ if need_delimiter and k.t_u_stack_id >= 0 and k.t_k_stack_id >= 0:
print(" -")
if stack_id_err(k.t_u_stack_id):
print(" [Missed User Stack]")
If you still have problems with the above patch, could you post a new stack trace that is missing the waker stack, along with a way to reproduce it? Maybe we should probe other kernel functions as well to get a complete sleep/wake picture.
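The boundary change in the patch can be seen in a small standalone sketch. The helper names below are hypothetical and not part of offwaketime.py. The sketch assumes, as the patch comment suggests, that `-EFAULT` marks a stack that is simply unavailable, while other negative ids are treated as missed stacks.

```python
import errno


def old_check_skips(stack_id):
    # Pre-patch comparison: treats id 0 as "no stack".
    return stack_id < 1


def new_check_skips(stack_id):
    # Post-patch comparison: only negative (error) ids are skipped.
    return stack_id < 0


def classify_stack_id(stack_id):
    """Hypothetical helper summarizing how an id would be handled after the patch."""
    if stack_id >= 0:
        return "valid"        # walk the stack, including id 0
    if stack_id == -errno.EFAULT:
        return "unavailable"  # printed as an empty stack
    return "missed"           # e.g. -ENOMEM, reported as a missed stack


for sid in (0, 7, -errno.EFAULT, -errno.ENOMEM):
    print(sid, old_check_skips(sid), new_check_skips(sid), classify_stack_id(sid))
```

The only id on which the two comparisons disagree is 0, which is exactly the case the patch is meant to fix.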
When I profile a Java workload with offwaketime, the wakers of Java threads are empty.
OpenJDK Runtime Environment AdoptOpenJDK (build 12.0.1+12)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 12.0.1+12, mixed mode, sharing)
https://github.com/jvm-profiling-tools/perf-map-agent
Below is an example pair of target and waker. The target contains frames of JIT-compiled code, but the waker is empty. Not even kernel frames are included in the waker.
Below is another pair of waker and target from the same profile. In this case, both waker and target frames are recorded, but the waker is not a Java thread, and the target appears to contain only kernel frames.
I am not sure whether this issue is specific to Java. Could anyone please advise how to diagnose it?