tanelpoder / 0xtools

0x.Tools: X-Ray vision for Linux systems
https://0x.tools
GNU General Public License v2.0

KeyError on Rocky 9 when running psn #34

Closed dbsid closed 1 year ago

dbsid commented 1 year ago

Steps to reproduce:

yum install -y git make gcc python procps
git clone https://github.com/tanelpoder/0xtools
cd 0xtools/
make && make install

[root@tc-tikv-0 0xtools]# psn -p 1 -G syscall,wchan,filename
Linux Process Snapper v1.2.3 by Tanel Poder [https://0x.tools]
Sampling /proc/stat, syscall, wchan for 5 seconds...
Traceback (most recent call last):
  File "/usr/bin/psn", line 375, in <module>
    main()
  File "/usr/bin/psn", line 301, in main
    task_samples = [s.sample(event_time, pid, task) for s in sources.keys() if s.task_level == True]
  File "/usr/bin/psn", line 301, in <listcomp>
    task_samples = [s.sample(event_time, pid, task) for s in sources.keys() if s.task_level == True]
  File "/usr/lib/0xtools/psnproc.py", line 109, in sample
    return [create_row_sample(rs) for rs in raw_samples]
  File "/usr/lib/0xtools/psnproc.py", line 109, in <listcomp>
    return [create_row_sample(rs) for rs in raw_samples]
  File "/usr/lib/0xtools/psnproc.py", line 105, in create_row_sample
    r =  [event_time, pid, task] + [convert(full_sample[idx]) for idx, convert in self.schema_extract]
  File "/usr/lib/0xtools/psnproc.py", line 105, in <listcomp>
    r =  [event_time, pid, task] + [convert(full_sample[idx]) for idx, convert in self.schema_extract]
  File "/usr/lib/0xtools/psnproc.py", line 383, in <lambda>
    ('syscall',    str,  0, lambda sn: syscall_id_to_name[sn]),  # convert syscall_id via unistd_64.h into call name
KeyError: '45'

[root@tc-tikv-0 0xtools]# cat /etc/redhat-release 
Rocky Linux release 9.1 (Blue Onyx)

tanelpoder commented 1 year ago

OK, interesting. Which architecture is it (Intel or ARM)? I tested on my RHEL 9.2 environment and it was fine (though it's not exactly the same as your environment).

Syscall 45 should be recvfrom, but for some reason it wasn't picked up...

Can you send the output of:

ls -l /usr/include/asm-generic/unistd.h /usr/include/asm/unistd_64.h /usr/include/x86_64-linux-gnu/asm/unistd_64.h /usr/include/asm-x86_64/unistd.h /usr/include/asm/unistd.h

grep -H " 45" /usr/include/asm-generic/unistd.h /usr/include/asm/unistd_64.h /usr/include/x86_64-linux-gnu/asm/unistd_64.h /usr/include/asm-x86_64/unistd.h /usr/include/asm/unistd.h

rpm -ql kernel-headers | grep unistd
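
For context, the failing lookup in the traceback maps a syscall number (as a string) to a name taken from a kernel header. The following is a minimal sketch of that idea, not psn's actual code: the names `NR_DEFINE`, `parse_syscall_ids`, and `syscall_name` are hypothetical, and the header text is an inline sample rather than a real file. It also shows one way to fall back to the raw number instead of raising a KeyError.

```python
import re

# Matches the traditional format, e.g. "#define __NR_recvfrom 45".
# Assumption: only plain "__NR_<name> <number>" lines are of interest.
NR_DEFINE = re.compile(r"^#define\s+__NR_(\w+)\s+(\d+)\s*$")


def parse_syscall_ids(header_text):
    """Return {syscall_id_as_str: name} parsed from unistd header text."""
    ids = {}
    for line in header_text.splitlines():
        m = NR_DEFINE.match(line)
        if m:
            name, number = m.groups()
            ids[number] = name
    return ids


def syscall_name(ids, sn):
    """Look up a syscall id; return the raw id if it is unknown."""
    return ids.get(sn, sn)


# Inline sample in the x86_64 unistd_64.h style (not read from disk).
SAMPLE = """\
#define __NR_read 0
#define __NR_recvfrom 45
"""

ids = parse_syscall_ids(SAMPLE)
print(syscall_name(ids, "45"))   # known id resolves to its name
print(syscall_name(ids, "999"))  # unknown id is returned unchanged
```

With a fallback like this, an unmapped syscall number would show up in the report as a bare number rather than aborting the sampling loop.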

dbsid commented 1 year ago

It's an Intel CPU. Here is the output. Thanks.

[root@tc-tikv-0 /]# ls -l /usr/include/asm-generic/unistd.h /usr/include/asm/unistd_64.h /usr/include/x86_64-linux-gnu/asm/unistd_64.h /usr/ include/asm-x86_64/unistd.h /usr/include/asm/unistd.h

ls: cannot access '/usr/include/x86_64-linux-gnu/asm/unistd_64.h': No such file or directory
ls: cannot access 'include/asm-x86_64/unistd.h': No such file or directory
-rw-r--r-- 1 root root 31321 May 10 01:24 /usr/include/asm-generic/unistd.h
-rw-r--r-- 1 root root   623 May 10 01:24 /usr/include/asm/unistd.h
-rw-r--r-- 1 root root  9716 May 10 01:24 /usr/include/asm/unistd_64.h

/usr/:
total 60
drwxr-xr-x  1 root root 4096 May 24 21:49 bin
drwxr-xr-x  2 root root 4096 May 16  2022 games
drwxr-xr-x  1 root root 4096 May 24 21:49 include
dr-xr-xr-x  1 root root 4096 May 24 21:49 lib
dr-xr-xr-x  1 root root 4096 May 24 21:49 lib64
drwxr-xr-x  1 root root 4096 May 24 21:49 libexec
drwxr-xr-x 12 root root 4096 Feb 16 03:32 local
dr-xr-xr-x  1 root root 4096 May 24 21:49 sbin
drwxr-xr-x  1 root root 4096 May 24 21:49 share
drwxr-xr-x  4 root root 4096 Feb 16 03:32 src
lrwxrwxrwx  1 root root   10 May 16  2022 tmp -> ../var/tmp
[root@tc-tikv-0 /]# grep -H " 45" /usr/include/asm-generic/unistd.h /usr/include/asm/unistd_64.h /usr/include/x86_64-linux-gnu/asm/unistd_64 .h /usr/include/asm-x86_64/unistd.h /usr/include/asm/unistd.h

/usr/include/asm-generic/unistd.h:#define __NR3264_truncate 45
/usr/include/asm-generic/unistd.h:#define __NR_syscalls 450
/usr/include/asm/unistd_64.h:#define __NR_recvfrom 45
grep: /usr/include/x86_64-linux-gnu/asm/unistd_64: No such file or directory
grep: .h: No such file or directory
grep: /usr/include/asm-x86_64/unistd.h: No such file or directory

[root@tc-tikv-0 /]# rpm -ql kernel-headers | grep unistd
/usr/include/asm-generic/unistd.h
/usr/include/asm/unistd.h
/usr/include/asm/unistd_32.h
/usr/include/asm/unistd_64.h
/usr/include/asm/unistd_x32.h
/usr/include/linux/unistd.h

dbsid commented 1 year ago

Is this because recvfrom is missing from /usr/include/asm-generic/unistd.h? Should we merge the key-value pairs from all the header files, rather than just returning the contents of the first available header file?
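
The merge idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `merge_syscall_ids` is a hypothetical helper, the inputs are the lines from the grep output above rather than the full headers, and "earlier header wins" is an arbitrary precedence choice. Note that the same number can mean different things in different headers (45 is `__NR3264_truncate` in one and `__NR_recvfrom` in the other), so a blind merge needs a precedence rule at minimum.

```python
import re

# Only the traditional "#define __NR_<name> <number>" format is matched;
# lines such as "#define __NR3264_truncate 45" are skipped by this pattern.
NR_DEFINE = re.compile(r"^#define\s+__NR_(\w+)\s+(\d+)\s*$")


def merge_syscall_ids(header_texts):
    """Merge {id: name} maps from several headers; earlier headers win."""
    merged = {}
    for text in header_texts:
        for line in text.splitlines():
            m = NR_DEFINE.match(line)
            if m:
                name, number = m.groups()
                # setdefault keeps the first name seen for a given number.
                merged.setdefault(number, name)
    return merged


# Lines taken from the grep output in this thread.
ASM_GENERIC = "#define __NR3264_truncate 45\n#define __NR_syscalls 450\n"
ASM_X86_64 = "#define __NR_recvfrom 45\n"

merged = merge_syscall_ids([ASM_GENERIC, ASM_X86_64])
print(merged)
```

In this sample, `__NR_syscalls` is also picked up even though it looks like a table size rather than a syscall, so a real implementation would probably want to filter such entries.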

tanelpoder commented 1 year ago

The solution ended up being more complicated, as asm-generic/unistd.h uses a different format for some #defines for old 32/64-bit compatibility reasons. But x86 systems use the traditional unistd file and format, so I made the unistd.h file lookup platform-specific (currently just aarch64 vs. everything else).
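
A platform-specific lookup of this kind might look like the sketch below. It is not the code shipped in psn 1.2.4: `unistd_candidates` and `first_existing` are hypothetical names, the mapping of aarch64 to asm-generic/unistd.h is an assumption, and the other paths are simply the ones listed earlier in this thread.

```python
import os
import platform


def unistd_candidates(machine):
    """Return header paths to try, in order, for the given machine type."""
    if machine == "aarch64":
        # Assumption: aarch64 takes its syscall numbers from asm-generic.
        return ["/usr/include/asm-generic/unistd.h"]
    # Everything else: the traditional per-architecture files.
    return [
        "/usr/include/asm/unistd_64.h",
        "/usr/include/x86_64-linux-gnu/asm/unistd_64.h",
        "/usr/include/asm-x86_64/unistd.h",
        "/usr/include/asm/unistd.h",
    ]


def first_existing(paths):
    """Return the first path that exists as a file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


machine = platform.machine()  # e.g. "x86_64" or "aarch64" on Linux
print(machine, first_existing(unistd_candidates(machine)))
```

Keeping the candidate list separate from the existence check makes the selection logic testable without the headers being installed.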

Works on my x86_64 RHEL8 clone and aarch64 RHEL9 so far:

[tanel@linux01 0xtools]$ sudo psn -p nslookup -a -G syscall,wchan,kstack

Linux Process Snapper v1.2.4 by Tanel Poder [https://0x.tools]
Sampling /proc/stat, syscall, wchan, stack for 5 seconds...
finished.

=== Active Threads =======================================================================================================================================================

 samples | avg_threads | comm          | state                 | syscall       | wchan | kstack                                                                           
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     100 |        1.00 | (isc-socket)  | Sleep (Interruptible) | epoll_wait    | 0     | __x64_sys_epoll_wait()->do_epoll_wait()->ep_poll()                               
     100 |        1.00 | (isc-timer)   | Sleep (Interruptible) | futex         | 0     | __x64_sys_futex()->do_futex()->futex_wait()->futex_wait_queue_me()               
     100 |        1.00 | (isc-worker*) | Sleep (Interruptible) | read          | 0     | ksys_read()->vfs_read()->new_sync_read()->tty_read()->n_tty_read()->wait_woken() 
     100 |        1.00 | (nslookup)    | Sleep (Interruptible) | rt_sigsuspend | 0     | __x64_sys_rt_sigsuspend()->sigsuspend()                                          

samples: 100
(expected: 100)
total processes: 1, threads: 4
runtime: 5.00, measure time: 0.47

aarch64:

[tanel@rhel9 ~]$ sudo psn -p nslookup -G syscall,wchan,kstack -a

Linux Process Snapper v1.2.4 by Tanel Poder [https://0x.tools]
Sampling /proc/stat, syscall, wchan, stack for 5 seconds...
finished.

=== Active Threads =========================================================================================================================================================================================

 samples | avg_threads | comm           | state                 | syscall         | wchan            | kstack                                                                                               
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
     100 |        1.00 | (isc-net-*)    | Sleep (Interruptible) | read            | wait_woken       | __arm64_sys_read()->ksys_read()->vfs_read()->new_sync_read()->tty_read()->n_tty_read()->wait_woken() 
     100 |        1.00 | (isc-socket-*) | Sleep (Interruptible) | epoll_pwait     | ep_poll          | __arm64_sys_epoll_pwait()->do_epoll_wait()->ep_poll()                                                
     100 |        1.00 | (isc-timer)    | Sleep (Interruptible) | futex           | futex_wait_queue | __arm64_sys_futex()->do_futex()->futex_wait()->futex_wait_queue()                                    
     100 |        1.00 | (nslookup)     | Sleep (Interruptible) | rt_sigtimedwait | do_sigtimedwait  | __arm64_sys_rt_sigtimedwait()->do_sigtimedwait()                                                     

samples: 100
(expected: 100)
total processes: 1, threads: 4
runtime: 5.01, measure time: 0.40

Made some other minor improvements too (but will do the release tomorrow).

dbsid commented 1 year ago

Verified that it's fixed in 1.2.4.