htop-dev / htop

htop - an interactive process viewer
https://htop.dev/
GNU General Public License v2.0

Arithmetic Exception running a statically linked binary #1525

Open programminghoch10 opened 3 months ago

programminghoch10 commented 3 months ago

I built the statically linked binary inside a podman container and copied it to the host. Running it on the host crashes immediately with "A signal 8 (Floating point exception) was received."

Compiled on commit 2503239d9fb453b5d67d3b33690c5a8e914bc58c (current main).

Crash Output



FATAL PROGRAM ERROR DETECTED
============================
Please check at https://htop.dev/issues whether this issue has already been reported.
If no similar issue has been reported before, please create a new issue with the following information:
  - Your htop version: '3.4.0-dev-3.3.0-184-g2503239'
  - Your OS and kernel version (uname -a)
  - Your distribution and release (lsb_release -a)
  - Likely steps to reproduce (How did it happen?)
  - Backtrace of the issue (see below)

Error information:
------------------
A signal 8 (Floating point exception) was received.

Setting information:
--------------------
htop_version=3.4.0-dev-3.3.0-184-g2503239;config_reader_min_version=3;fields=0 48 18 54 2 46 47 39 119 113 111 20 53 49 1;hide_kernel_threads=1;hide_userland_threads=1;hide_running_in_container=0;shadow_other_users=1;show_thread_names=0;show_program_path=0;highlight_base_name=1;highlight_deleted_exe=1;shadow_distribution_path_prefix=0;highlight_megabytes=1;highlight_threads=1;highlight_changes=1;highlight_changes_delay_secs=60;find_comm_in_cmdline=1;strip_exe_from_cmdline=1;show_merged_command=0;header_margin=1;screen_tabs=0;detailed_cpu_time=1;cpu_count_from_one=0;show_cpu_usage=1;show_cpu_frequency=1;update_process_names=0;account_guest_in_cpu_meter=0;color_scheme=0;enable_mouse=1;delay=5;hide_function_bar=1;header_layout=two_50_50;column_meters_0=LeftCPUs2 Blank CPU CPU CPU Blank Memory Memory Memory Swap Blank PressureStallCPUSome PressureStallMemorySome PressureStallMemoryFull PressureStallIOSome PressureStallIOFull PressureStallIRQFull;column_meter_modes_0=1 2 3 1 2 2 3 1 2 2 2 2 2 2 2 2 2;column_meters_1=RightCPUs2 Blank DiskIO DiskIO Blank Blank NetworkIO NetworkIO Blank Tasks LoadAverage Uptime System Systemd SystemdUser Hostname DateTime;column_meter_modes_1=1 2 3 2 2 2 3 2 2 2 2 2 2 2 2 2 2;tree_view=0;sort_key=47;tree_sort_key=0;sort_direction=-1;tree_sort_direction=1;tree_view_always_by_pid=1;all_branches_collapsed=0;screen:Main=PID USER NICE SCHEDULERPOLICY STATE PERCENT_CPU PERCENT_MEM M_RESIDENT M_SWAP OOM IO_RATE STARTTIME ELAPSED TIME Command;.sort_key=PERCENT_MEM;.tree_sort_key=PID;.tree_view_always_by_pid=1;.tree_view=0;.sort_direction=-1;.tree_sort_direction=1;.all_branches_collapsed=0;screen:I/O=PID USER IO_PRIORITY IO_RATE IO_READ_RATE IO_WRITE_RATE SYSCR SYSCW RCHAR WCHAR Command;.sort_key=IO_RATE;.tree_sort_key=PID;.tree_view_always_by_pid=0;.tree_view=1;.sort_direction=-1;.tree_sort_direction=1;.all_branches_collapsed=0;

Backtrace information:
----------------------
[0x4080f4]
[0x468ae0]
/lib/x86_64-linux-gnu/libc.so.6(__libc_early_init+0x8b)[0x7f4e00a1416b]
[0x500fc0]
[0x4cb3ca]
[0x5001d5]
[0x4cb3ca]
[0x500557]
[0x4cb4f2]
[0x4cb3ca]
[0x4cb47f]
[0x4cb700]
[0x4c27cf]
[0x4c2ba5]
[0x4c2508]
[0x4bae38]
[0x4ba94f]
[0x41f7a7]
[0x428cb7]
[0x40d84b]
[0x40677e]
[0x45fd04]
[0x461400]
[0x401921]

To make the above information more practical to work with, please also provide a disassembly of your htop binary. This can usually be done by running the following command:

   objdump -d -S -w `which htop` > ~/htop.objdump

Please include the generated file in your report.
Running this program with debug symbols or inside a debugger may provide further insights.

Thank you for helping to improve htop!

Floating point exception


GDB full stacktrace


Starting program: /home/jonas/htop 
warning: File "/usr/lib/x86_64-linux-gnu/libthread_db.so.1" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load".
To enable execution of this file add
    add-auto-load-safe-path /usr/lib/x86_64-linux-gnu/libthread_db.so.1
line to your configuration file "/home/jonas/.config/gdb/gdbinit".
To completely disable this security protection add
    set auto-load safe-path /
line to your configuration file "/home/jonas/.config/gdb/gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual.  E.g., run from the shell:
    info "(gdb)Auto-loading safe path"
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.

Program received signal SIGFPE, Arithmetic exception.
0x00007ffff7a8b16b in __nptl_tls_static_size_for_stack () at ../nptl/nptl-stack.h:61
warning: 61 ../nptl/nptl-stack.h: No such file or directory
#0  0x00007ffff7a8b16b in __nptl_tls_static_size_for_stack () at ../nptl/nptl-stack.h:61
No locals.
#1  __pthread_early_init () at ../sysdeps/nptl/pthread_early_init.h:46
        limit = {rlim_cur = 8388608, rlim_max = 18446744073709551615}
        pagesz = 4096
        minstack = <optimized out>
        limit = <optimized out>
        pagesz = <optimized out>
        minstack = <optimized out>
#2  __libc_early_init (initial=false) at ./elf/libc_early_init.c:44
No locals.
#3  0x0000000000500fc0 in dl_open_worker_begin ()
No symbol table info available.
#4  0x00000000004cb3ca in _dl_catch_exception ()
No symbol table info available.
#5  0x00000000005001d5 in dl_open_worker ()
No symbol table info available.
#6  0x00000000004cb3ca in _dl_catch_exception ()
No symbol table info available.
#7  0x0000000000500557 in _dl_open ()
No symbol table info available.
#8  0x00000000004cb4f2 in do_dlopen ()
No symbol table info available.
#9  0x00000000004cb3ca in _dl_catch_exception ()
No symbol table info available.
#10 0x00000000004cb47f in _dl_catch_error ()
No symbol table info available.
#11 0x00000000004cb700 in __libc_dlopen_mode ()
No symbol table info available.
#12 0x00000000004c27cf in module_load ()
No symbol table info available.
#13 0x00000000004c2ba5 in __nss_module_get_function ()
No symbol table info available.
#14 0x00000000004c2508 in __nss_next2 ()
No symbol table info available.
#15 0x00000000004bae38 in getpwuid_r ()
No symbol table info available.
#16 0x00000000004ba94f in getpwuid ()
No symbol table info available.
#17 0x000000000041f7a7 in UsersTable_getRef (this=0x58ada0, uid=100032) at UsersTable.c:35
        userData = <optimized out>
        name = <optimized out>
#18 0x0000000000428cb7 in LinuxProcessTable_updateUser (mainTask=0x0, procFd=6, process=0x9f5d10, host=0x58b0a0) at linux/LinuxProcessTable.c:532
        sb = {st_dev = 22, st_ino = 377692661, st_nlink = 9, st_mode = 16749, st_uid = 100032, st_gid = 100032, __pad0 = 0, st_rdev = 0, st_size = 0, st_blksize = 1024, st_blocks = 0, st_atim = {tv_sec = 1723817858, tv_nsec = 243709567}, st_mtim = {tv_sec = 1723817858, tv_nsec = 243709567}, st_ctim = {tv_sec = 1723817858, tv_nsec = 243709567}, __glibc_reserved = {0, 0, 0}}
        statok = <optimized out>
        sb = <optimized out>
        statok = <optimized out>
#19 LinuxProcessTable_recurseProcTree (this=0x58b4b0, parentFd=<optimized out>, lhost=0x58b0a0, dirname=<optimized out>, mainTask=0x0) at linux/LinuxProcessTable.c:1599
        pid = <optimized out>
        procFd = 6
        proc = <optimized out>
        statCommand = "nginx\000\000\000\000ort\000\000n\000ockd\000recover\000\000\000\000\360\227X\000\000\000\000\000\000\331\377\377\377\177\000\000\230xR", '\000' , "\377\377\377\377\377\177\000\000\002", '\000' , "0 0 0 0 \002\000\000\000\001\000\000\000\260\247X\000\000\000\000\000\000\000\000\000\001\000\000\000\200\326W\000\000\000\000\000"
        lasttimes = 0
        lp = <optimized out>
        name = <optimized out>
        preExisting = false
        scanMainThread = <optimized out>
        last_tty_nr = <optimized out>
        pt = 0x58b4b0
        host = 0x58b0a0
        settings = 0x58d000
        ss = 0x58f490
        entry = <optimized out>
        dirFd = 5
        dir = <optimized out>
        hideKernelThreads = true
        hideUserlandThreads = true
        hideRunningInContainer = false
        __PRETTY_FUNCTION__ = "LinuxProcessTable_recurseProcTree"
#20 0x000000000040d84b in Machine_scanTables (this=this@entry=0x58b0a0) at Machine.c:122
        table = 0x58b4b0
        i = 0
        firstScanDone = true
        __PRETTY_FUNCTION__ = "Machine_scanTables"
#21 0x000000000040677e in CommandLine_run (argc=<optimized out>, argv=<optimized out>) at CommandLine.c:399
        lc_ctype = <optimized out>
        status = STATUS_OK
        flags = {pidMatchList = 0x0, commFilter = 0x0, userId = 4294967295, sortKey = 0, delay = -1, iterationsRemaining = -1, useColors = true, enableMouse = true, treeView = false, allowUnicode = true, highlightChanges = false, highlightDelaySecs = -1, readonly = false}
        ut = 0x58ada0
        dm = 0x0
        dc = 0x58af30
        ds = 0x0
        host = 0x58b0a0
        pt = 0x58b4b0
        settings = 0x58d000
        header = 0x58e170
        panel = 0x7a6a30
        state = {host = 0x58b0a0, mainPanel = 0x7a6a30, header = 0x58e170, pauseUpdate = false, hideSelection = false, hideMeters = false}
        scr = 0x58f2a0
#22 0x000000000045fd04 in __libc_start_call_main ()
No symbol table info available.
#23 0x0000000000461400 in __libc_start_main_impl ()
No symbol table info available.
#24 0x0000000000401921 in _start ()
No symbol table info available.
quit


htop.objdump

gdb stacktrace as text file

BenBE commented 3 months ago

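Going by frame 17 (UsersTable.c:35), the crash happens inside getpwuid, which calls getpwuid_r, which in turn calls into NSS. Loading an NSS module via dlopen (frames 11 to 14) ends up in __libc_early_init of the dynamically linked libc.so.6 (frame 2), and that is where the SIGFPE is raised.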
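
The bare addresses in the crash output can also be mapped back to source locations without a debugger. A minimal sketch, assuming binutils' addr2line is installed and the binary carries debug info; the function names backtrace_addrs and resolve_backtrace are made up for illustration. The libc.so.6 line is skipped because its address belongs to the shared library, not to the htop binary:

```shell
# Print the bare "[0x...]" addresses from a backtrace read on stdin.
# Lines that name a module, e.g. "libc.so.6(...)[0x...]", are skipped.
backtrace_addrs() {
    sed -n 's/^[[:space:]]*\[\(0x[0-9a-fA-F]\{1,\}\)\][[:space:]]*$/\1/p'
}

# Resolve those addresses against a binary ($1) with addr2line.
# -f prints function names, -p puts each result on one line.
resolve_backtrace() {
    addrs=$(backtrace_addrs)
    [ -n "$addrs" ] || return 1
    # Unquoted on purpose: each address becomes its own argument.
    addr2line -f -p -e "$1" $addrs
}
```

With the crash output saved to a file (crash.txt is a placeholder name), this would be run as:

   resolve_backtrace ./htop < crash.txt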

A quick search found a similar issue in libc 2.34+, which can be seen here. Another instance of a similar issue can be seen here.

Can you check if the issue persists when compiling htop without statically linking?

programminghoch10 commented 3 months ago

Dynamically linked works fine.
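
To double-check which variant is actually being run, the output of file(1) can be classified. A small sketch; linkage_of is a hypothetical helper name, and the matched phrases are the ones file commonly prints for ELF executables, so treat them as an assumption about the local file version:

```shell
# Classify one line of file(1) output as static, dynamic or unknown.
linkage_of() {
    case "$1" in
        *"statically linked"*|*"static-pie linked"*) echo static ;;
        *"dynamically linked"*) echo dynamic ;;
        *) echo unknown ;;
    esac
}
```

Typical use would be:

   linkage_of "$(file -L ./htop)"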

fasterit commented 3 months ago

Systemd present? Cf. https://github.com/htop-dev/htop/issues/503#issuecomment-826007195

programminghoch10 commented 3 months ago

> Systemd present?

Yes, systemd is present on the host where the fault happened, but not in the container I built it in.

fasterit commented 2 months ago

Only the host where you run it is relevant: htop calls getpwuid, which is resolved via NSS, and that fails when nss-systemd is present.
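
Whether the systemd NSS module is consulted for user lookups can be read from the passwd line of /etc/nsswitch.conf on the host. A rough sketch; passwd_sources and uses_nss_systemd are illustrative names, not part of htop or glibc:

```shell
# Print the sources configured for the "passwd" database in an
# nsswitch.conf-style file (default: /etc/nsswitch.conf).
passwd_sources() {
    conf="${1:-/etc/nsswitch.conf}"
    # Keep the first "passwd:" line, drop the key and any trailing comment,
    # then normalise tabs and runs of spaces to single spaces.
    sed -n 's/^[[:space:]]*passwd:[[:space:]]*//p' "$conf" \
        | sed 's/[[:space:]]*#.*$//' \
        | head -n 1 \
        | tr '\t' ' ' | tr -s ' '
}

# Succeed if "systemd" is one of the passwd sources.
uses_nss_systemd() {
    case " $(passwd_sources "${1:-}") " in
        *" systemd "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On the affected host this could be checked with:

   uses_nss_systemd && echo "passwd lookups go through nss-systemd"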

fasterit commented 2 months ago

Duplicate of #503.

cgzones commented 2 months ago

You could either try the patch from #1527 or build against musl, e.g. via this script.