HewlettPackard / LinuxKI

LinuxKI Toolset (Trace-based performance analysis tool)
Other
244 stars 51 forks source link

UNKNOWN - function / stack trace #54

Closed PeterPitterling closed 1 year ago

PeterPitterling commented 1 year ago

data collected by ./runki -R -a ... what is missing to get anything meaningful for UNKNOWN? SLES15.3 kernel ..

Command line: /opt/linuxki/kiinfo -kiprof nsym=50,npid=20 -ts 0227_0602

kiinfo (7.6)

Linux host1 5.3.18-59.37-default #1 SMP Mon Nov 22 12:29:04 UTC 2021 (d10168e) x86_64 x86_64 x86_64 GNU/Linux

NOTE: idle hardclock traces are not always logged.

******** GLOBAL HARDCLOCK REPORT ********
  Count   USER%    SYS%   INTR%   IDLE%
1139513   2.36%   0.60%   0.00%  97.04%

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Kernel Functions executed during profile
   Count     Pct  State  Function
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   27505   2.41%  SYS    UNKNOWN
     351   0.03%  SYS    __schedule
     251   0.02%  SYS    cshook_systemcalltable_pre_compat_sys_ioctl
     241   0.02%  SYS    do_task_stat
     233   0.02%  SYS    lookup_fast
     172   0.02%  SYS    m_sys_openat
     161   0.01%  SYS    enc_threads_add_job
     135   0.01%  SYS    vsnprintf
     132   0.01%  SYS    __handle_mm_fault
     130   0.01%  SYS    cshook_security_inode_free_security
     126   0.01%  SYS    copyout
     115   0.01%  SYS    try_to_wake_up
     100   0.01%  SYS    __audit_syscall_exit
MarkCRay commented 1 year ago

All the USER profiling events (hardclocks) in the Global Hardclock report are shown as "UNKNOWN". You would have to to look at individual processes to see the User function names (see kipid output). Even then, translating the IP address collected by the trace into a function name is not an easy task. There are a few requirements to get this information:

  1. the binary or libraries must not be stripped of symbols (you can use the nm command to verify).
  2. the process needs to still be alived when /proc//maps data is collected. This provides the binary/library names and addresses.

It is odd that the UNKNOWN shows "SYS". It should show "USER", which is what I see in the collections I have. I may need your KI data understand why you see SYS beside of UNKNOWN.

You can contact me directly at mark.ray@hpe.com.

MarkCRay commented 1 year ago

A bit more information. The CPU State when the Hardclock event occurs is determined by the "preempt_cnt".

static inline int get_cpu_state(void arg1) { hardclock_t rec_ptr = arg1; int preempt_cnt = rec_ptr->preempt_count;

    if (preempt_cnt == PREEMPT_USER) {
            return HC_USER;
    } else if (HARDIRQ_VAL(preempt_cnt) > 1 || SOFTIRQ_VAL(preempt_cnt)) {
            return HC_INTR;
    } else if ((rec_ptr->pid == 0)) {
            return HC_IDLE;
    } else {
            return HC_SYS;
    }

}

So i would need to understand what this preempt_count is and see if it is not what I expect to see. I would need to put a printf in the code to show this in the trace data.

MarkCRay commented 1 year ago

The issue was caused by having some kernel addresses in a 3rd party DLKM which could not be found in the kallsyms file. So there were both unknown Kernel symbols and User symbols, and kiinfo was just reporting the state of the last event logged. This will be fixed in the next release of LinuxKI as it now keep trace of both Kernel and User "unknown" symbols. For example:

**** GLOBAL HARDCLOCK REPORT **** Count USER% SYS% INTR% IDLE% 1139513 2.36% 0.60% 0.00% 97.04%

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Kernel Functions executed during profile Count Pct State Function ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 26853 2.36% USER UNKNOWN 652 0.06% SYS UNKNOWN 351 0.03% SYS __schedule 251 0.02% SYS cshook_systemcalltable_pre_compat_sys_ioctl 241 0.02% SYS do_task_stat 233 0.02% SYS lookup_fast 172 0.02% SYS m_sys_openat

MarkCRay commented 1 year ago

Fixed in LinuxKI 7.7, which is now available.