aristocratos / btop

A monitor of resources
Apache License 2.0

[BUG] Floating point exception issue since updating to linux kernel 6.6 #662

Open putridpete opened 11 months ago

putridpete commented 11 months ago

Description

btop fails to start, and prints out the error message "floating point exception (core dumped)" after updating the zen kernel to 6.6 on Arch. Booting with the LTS kernel (6.1) makes btop work again.

To Reproduce

I'm not sure of the exact steps. All I know is that after updating to the zen kernel 6.6 provided by Arch Linux, btop stops launching altogether with the above message.

Expected behavior

To work as usual.

Info

Additional context

contents of ~/.config/btop/btop.log

2023/11/10 (00:58:27) | ===> btop++ v.1.2.13
2023/11/10 (00:58:27) | DEBUG: Starting in DEBUG mode!
2023/11/10 (00:58:27) | INFO: Logger set to DEBUG
2023/11/10 (00:58:27) | DEBUG: Using locale en_US.UTF-8
2023/11/10 (00:58:27) | INFO: Running on /dev/pts/0
2023/11/10 (00:58:27) | DEBUG: Loading theme file: /home/peter/.config/btop/themes/dracula.theme

GDB Backtrace

The program did not crash when I ran it with gdb, but it stopped without making any progress and displayed the message:

Thread 2 "btop" received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0x7ffff75ff6c0 (LWP 14759)]
0x00005555555ba64f in ?? ()

Note: the issue seems similar to this one, but unlike that report, I've never compiled btop before.

imwints commented 11 months ago

I can't reproduce the crash on Arch.

If the program crashes again, you can access the current core file by running coredumpctl debug btop; then you can get a full backtrace with thread apply all bt.

I suspect this won't help much, since the binary from Arch has all debug symbols removed. Please try to compile the latest git version with debug symbols and see if it crashes, then follow the above steps to get a backtrace.

putridpete commented 11 months ago

Hey there, thanks for the response.

I've compiled and installed from source with debug symbols enabled, and after following your instructions, gdb outputs the following:

[Thread debugging using libthread_db enabled]                                                                                                                                                
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `btop'.
Program terminated with signal SIGFPE, Arithmetic exception.

warning: Section `.reg-xstate/22849' in core file too small.
#0  0x000055cc030694bf in Cpu::draw[abi:cxx11](Cpu::cpu_info const&, bool, bool) (cpu=..., force_redraw=<optimized out>, data_same=false) at src/btop_draw.cpp:669
669             const auto& temp_color = Theme::g("temp").at(clamp(cpu.temp.at(0).back() * 100 / cpu.temp_max, 0ll, 100ll));
[Current thread is 1 (Thread 0x7f0c58dfa6c0 (LWP 22849))]
(gdb) set logging enabled on
Copying output to gdb.txt.
Copying debug output to gdb.txt.
(gdb) thread apply all bt

Thread 2 (Thread 0x7f0c5bf00740 (LWP 22843)):
warning: Section `.reg-xstate/22843' in core file too small.
#0  0x00007f0c5ba067f5 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=0x7ffdb88049a0, rem=0x7ffdb88049a0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
#1  0x00007f0c5ba188c7 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2  0x000055cc030fd166 in std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > (__rtime=...) at /usr/include/c++/13.2.1/bits/this_thread_sleep.h:80
#3  Tools::sleep_ms (ms=<optimized out>, ms=<optimized out>) at src/btop_tools.hpp:269
#4  Tools::atomic_wait_for(std::atomic<bool> const&, bool, unsigned long) [clone .constprop.0] (old=false, wait_ms=10, atom=...) at src/btop_tools.cpp:488
#5  0x000055cc0301e459 in main (argc=<optimized out>, argv=<optimized out>) at src/btop.cpp:1006

Thread 1 (Thread 0x7f0c58dfa6c0 (LWP 22849)):
#0  0x000055cc030694bf in Cpu::draw[abi:cxx11](Cpu::cpu_info const&, bool, bool) (cpu=..., force_redraw=<optimized out>, data_same=false) at src/btop_draw.cpp:669
#1  0x000055cc0303efc4 in Runner::_runner () at src/btop.cpp:515
#2  0x00007f0c5b9bd9eb in start_thread (arg=<optimized out>) at pthread_create.c:444
#3  0x00007f0c5ba417cc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

Let me know if you require any further information or if I've done something incorrectly.

Thanks once again.

imwints commented 11 months ago

There is probably a division by zero. The back trace hints it's on line 669 in src/btop_draw.cpp. To verify that you could step through the program with gdb or apply this patch and recompile it.

--- a/src/btop_draw.cpp
+++ b/src/btop_draw.cpp
@@ -19,6 +19,7 @@ tab-size = 4
 #include <array>
 #include <algorithm>
 #include <cmath>
+#include <iostream>
 #include <ranges>
 #include <string>

@@ -29,6 +30,7 @@ tab-size = 4
 #include "btop_tools.hpp"
 #include "btop_input.hpp"
 #include "btop_menu.hpp"
+#include "fmt/core.h"

 using std::array;
 using std::clamp;
@@ -666,6 +668,7 @@ namespace Cpu {
            + Theme::g("cpu").at(clamp(cpu.cpu_percent.at("total").back(), 0ll, 100ll)) + rjust(to_string(cpu.cpu_percent.at("total").back()), 4) + Theme::c("main_fg") + '%';
        if (show_temps) {
            const auto [temp, unit] = celsius_to(cpu.temp.at(0).back(), temp_scale);
+           fmt::print(std::cerr, "cpu.temp_max: {}\n", cpu.temp_max);
            const auto& temp_color = Theme::g("temp").at(clamp(cpu.temp.at(0).back() * 100 / cpu.temp_max, 0ll, 100ll));
            if (b_column_size > 1 or b_columns > 1)
                out += ' ' + Theme::c("inactive_fg") + graph_bg * 5 + Mv::l(5) + temp_color

This should print a value in the terminal.
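
For context on why this shows up as a "floating point exception": on x86-64 Linux, an integer division by zero traps in hardware and the kernel delivers SIGFPE, even though no floating point math is involved. Below is a minimal sketch of a guarded version of the expression on that line. The helper name temp_percent is hypothetical (it is not a function in btop), and treating a non-positive maximum as "unknown" is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper, not part of btop: maps a temperature onto 0..100
// relative to temp_max without ever dividing by zero.
long long temp_percent(long long temp, long long temp_max) {
    // A sensor without a usable critical temperature can yield temp_max == 0.
    // Integer division by zero is undefined behavior in C++ and traps as
    // SIGFPE on x86-64, so bail out before dividing.
    if (temp_max <= 0) return 0;
    const long long pct = std::clamp(temp * 100 / temp_max, 0ll, 100ll);
    assert(pct >= 0 && pct <= 100);
    return pct;
}
```

With a helper like this, temp_percent(37, 0) returns 0 instead of crashing, and the result can still be used as an index into a 101-entry color gradient.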

putridpete commented 11 months ago

Not sure if I did it properly but after patching the file and recompiling, it crashed with the same error. Running it through gdb using the same steps again gave the following information:

[Thread debugging using libthread_db enabled]                                                  
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `btop'.
Program terminated with signal SIGFPE, Arithmetic exception.

warning: Section `.reg-xstate/15897' in core file too small.
#0  0x000056317bdaf3ea in Cpu::draw[abi:cxx11](Cpu::cpu_info const&, bool, bool) (cpu=..., 
    force_redraw=<optimized out>, data_same=false) at src/btop_draw.cpp:672
672             const auto& temp_color = Theme::g("temp").at(clamp(cpu.temp.at(0).back() * 100 / cpu.temp_max, 0ll, 100ll));
[Current thread is 1 (Thread 0x7fb5de7fc6c0 (LWP 15897))]
(gdb) thread apply all bt

Thread 2 (Thread 0x7fb5e5b18740 (LWP 15891)):
warning: Section `.reg-xstate/15891' in core file too small.
#0  0x00007fb5e56067f5 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=0x7ffe4438dd10, rem=0x7ffe4438dd10) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
#1  0x00007fb5e56188c7 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2  0x000056317be42f96 in std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > (__rtime=...) at /usr/include/c++/13.2.1/bits/this_thread_sleep.h:80
#3  Tools::sleep_ms (ms=<optimized out>, ms=<optimized out>) at src/btop_tools.hpp:269
#4  Tools::atomic_wait_for(std::atomic<bool> const&, bool, unsigned long) [clone .constprop.0] (old=false, wait_ms=10, atom=...) at src/btop_tools.cpp:488
#5  0x000056317bd64439 in main (argc=<optimized out>, argv=<optimized out>) at src/btop.cpp:1006

Thread 1 (Thread 0x7fb5de7fc6c0 (LWP 15897)):
#0  0x000056317bdaf3ea in Cpu::draw[abi:cxx11](Cpu::cpu_info const&, bool, bool) (cpu=..., force_redraw=<optimized out>, data_same=false) at src/btop_draw.cpp:672
#1  0x000056317bd84ee4 in Runner::_runner () at src/btop.cpp:515
#2  0x00007fb5e55bd9eb in start_thread (arg=<optimized out>) at pthread_create.c:444
#3  0x00007fb5e56417cc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

imwints commented 11 months ago

Did you catch any output on your console? There should be a line like cpu.temp_max: 0 AFTER btop closes. If that's not the case, you can redirect stderr to a file with btop 2> file and then post the contents of file.

putridpete commented 11 months ago

Yes, btop 2> file actually results in cpu.temp_max: 0 when I cat file. I missed it the first time around because when btop crashes it leaves behind some graphical bits that won't go away until I clear, and I'm guessing that obscures some of the text output after it crashes.

imwints commented 11 months ago

Yes, you can also call reset to restore the terminal's state in some uglier cases. btop sends some escape sequences to your terminal that, for example, hide the cursor and flip to a back buffer, so it is 'scrollable'. If btop crashes unexpectedly, it doesn't restore the terminal on its own...

Ok, I'll prepare a patch to work around this. But your processor does not seem to properly report a critical temperature to the system.
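
To illustrate the terminal state point, here is a rough sketch of how a TUI typically enters and leaves the alternate screen. The escape sequences are the common xterm-style ones and are assumed here; btop's actual set may differ. TerminalGuard and guard_bytes are hypothetical names, not btop code.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Common xterm-style control sequences (assumed; not taken from btop).
const std::string alt_screen_on  = "\x1b[?1049h"; // switch to the alternate screen buffer
const std::string alt_screen_off = "\x1b[?1049l"; // return to the normal screen buffer
const std::string cursor_hide    = "\x1b[?25l";
const std::string cursor_show    = "\x1b[?25h";

// Hypothetical RAII guard: alters the terminal on construction and restores
// it on destruction (normal exit or exception unwinding). Destructors do NOT
// run when the process is killed by a signal such as SIGFPE, which is why a
// crash leaves the terminal in the altered state.
class TerminalGuard {
public:
    explicit TerminalGuard(std::ostream& out) : out_(out) {
        out_ << alt_screen_on << cursor_hide << std::flush;
    }
    ~TerminalGuard() {
        out_ << cursor_show << alt_screen_off << std::flush;
    }
    TerminalGuard(const TerminalGuard&) = delete;
    TerminalGuard& operator=(const TerminalGuard&) = delete;

private:
    std::ostream& out_;
};

// Demonstration helper: returns the bytes a guard emits over its lifetime.
std::string guard_bytes() {
    std::ostringstream os;
    { TerminalGuard guard(os); }
    return os.str();
}
```

After a crash, only the first half of that byte sequence has been sent, so running reset (or clear, in milder cases) is what puts the terminal back.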

putridpete commented 11 months ago

Thank you. Any idea why it works on the LTS kernel as opposed to the zen one? I experience no crashes there with btop.

imwints commented 11 months ago

The temperature is reported by a kernel module (for me it's called k10temp, for an AMD CPU). It might be the case that this kernel module was updated either from vanilla to zen or from 6.1 to 6.6 (both seem unlikely). Do other temperature values in btop make sense?
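
For reference, hwmon drivers such as k10temp expose readings as sysfs text files (temp1_input, temp1_crit, and so on) containing an integer in millidegrees Celsius. A small sketch of parsing such a value follows; parse_millidegrees is a hypothetical helper and not how btop itself reads these files.

```cpp
#include <cassert>
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper: converts the text of a hwmon temp*_input or
// temp*_crit file (millidegrees Celsius, e.g. "95000\n") to whole degrees.
// Returns std::nullopt for empty or non-numeric content.
std::optional<long long> parse_millidegrees(std::string_view text) {
    // Drop the trailing newline or spaces that sysfs files usually end with.
    while (!text.empty() && (text.back() == '\n' || text.back() == ' '))
        text.remove_suffix(1);

    long long milli = 0;
    const char* const end = text.data() + text.size();
    const auto [ptr, ec] = std::from_chars(text.data(), end, milli);
    if (ec != std::errc{} || ptr != end) return std::nullopt;
    return milli / 1000;
}
```

A missing or unreadable crit file maps to std::nullopt here rather than 0, which keeps "no value" distinct from a real zero reading.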

imwints commented 11 months ago

Please try this branch. It shouldn't crash. It will also print some sensor information to stderr (like before)

putridpete commented 11 months ago

Unfortunately, that one also crashed for me:

[Thread debugging using libthread_db enabled]                                                                                                                                                
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `btop'.
Program terminated with signal SIGFPE, Arithmetic exception.

warning: Section `.reg-xstate/5721' in core file too small.
#0  0x0000561e48ba34bf in Cpu::draw[abi:cxx11](Cpu::cpu_info const&, bool, bool) (cpu=..., force_redraw=<optimized out>, data_same=false) at src/btop_draw.cpp:669
669             const auto& temp_color = Theme::g("temp").at(clamp(cpu.temp.at(0).back() * 100 / cpu.temp_max, 0ll, 100ll));
[Current thread is 1 (Thread 0x7f10399fa6c0 (LWP 5721))]
(gdb) thread apply all bt

Thread 2 (Thread 0x7f103ca88740 (LWP 5715)):
warning: Section `.reg-xstate/5715' in core file too small.
#0  0x00007f103c5e17f5 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=0x7ffc75a42eb0, rem=0x7ffc75a42eb0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
#1  0x00007f103c5f38c7 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2  0x0000561e48c37166 in std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > (__rtime=...) at /usr/include/c++/13.2.1/bits/this_thread_sleep.h:80
#3  Tools::sleep_ms (ms=<optimized out>, ms=<optimized out>) at src/btop_tools.hpp:269
#4  Tools::atomic_wait_for(std::atomic<bool> const&, bool, unsigned long) [clone .constprop.0] (old=false, wait_ms=10, atom=...) at src/btop_tools.cpp:488
#5  0x0000561e48b58459 in main (argc=<optimized out>, argv=<optimized out>) at src/btop.cpp:1006

Thread 1 (Thread 0x7f10399fa6c0 (LWP 5721)):
#0  0x0000561e48ba34bf in Cpu::draw[abi:cxx11](Cpu::cpu_info const&, bool, bool) (cpu=..., force_redraw=<optimized out>, data_same=false) at src/btop_draw.cpp:669
#1  0x0000561e48b78fc4 in Runner::_runner () at src/btop.cpp:515
#2  0x00007f103c5989eb in start_thread (arg=<optimized out>) at pthread_create.c:444
#3  0x00007f103c61c7cc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

As for how temps show up using the LTS kernel and the stock arch package, here's a picture:

[Screenshot: JC6LHV2.md.png]

I haven't used btop with any other CPU, so I'm not sure what a normal display of temperatures should be, but this looks pretty normal and easy to understand for me.

imwints commented 11 months ago

This does not look like the new branch, the gdb output doesn't match. Have you checked out with git checkout sensors before compiling?

putridpete commented 11 months ago

My bad, I ran make install on the previously compiled binary by mistake. The new binary also crashed, this time with:

[Thread debugging using libthread_db enabled]                                                                                                                                                
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `btop'.
Program terminated with signal SIGFPE, Arithmetic exception.

warning: Section `.reg-xstate/18715' in core file too small.
#0  0x000055e6cfcd9e9d in Cpu::draw[abi:cxx11](Cpu::cpu_info const&, bool, bool) ()
[Current thread is 1 (Thread 0x7f79af7fe6c0 (LWP 18715))]
(gdb) thread apply all bt

Thread 2 (Thread 0x7f79b6eaf740 (LWP 18709)):
warning: Section `.reg-xstate/18709' in core file too small.
#0  0x00007f79b6a067f5 in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=0x7ffe3518e8e0, rem=0x7ffe3518e8e0) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
#1  0x00007f79b6a188c7 in __GI___nanosleep (req=<optimized out>, rem=<optimized out>) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2  0x000055e6cfd6d2b6 in Tools::atomic_wait_for(std::atomic<bool> const&, bool, unsigned long) [clone .constprop.0] ()
#3  0x000055e6cfc8e459 in main ()

Thread 1 (Thread 0x7f79af7fe6c0 (LWP 18715)):
#0  0x000055e6cfcd9e9d in Cpu::draw[abi:cxx11](Cpu::cpu_info const&, bool, bool) ()
#1  0x000055e6cfcaef34 in Runner::_runner(void*) ()
#2  0x00007f79b69bd9eb in start_thread (arg=<optimized out>) at pthread_create.c:444
#3  0x00007f79b6a417cc in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

and now btop 2> file shows:

basepath: /sys/devices/pci0000:00/0000:00:18.3/hwmon/hwmon2/temp3_ for Tccd1
basepath: /sys/devices/pci0000:00/0000:00:18.3/hwmon/hwmon2/temp1_ for Tctl
basepath: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp3_ for Sensor 2
basepath: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp1_ for Composite
basepath: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp2_ for Sensor 1
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp6_ for AUXTIN3
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in3_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp13_ for TSI0_TEMP
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan3_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in7_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp3_ for AUXTIN0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in12_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in0_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan7_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp10_ for PCH_CHIP_TEMP
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp7_ for AUXTIN4
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in4_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan4_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in8_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp4_ for AUXTIN1
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in13_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in1_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp11_ for PCH_CPU_TEMP
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan1_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp8_ for SMBUSMASTER 0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in5_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp1_ for SYSTIN
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in10_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan5_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in9_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp5_ for AUXTIN2
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in14_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in2_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp12_ for PCH_MCH_TEMP
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan2_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp9_ for PCH_CHIP_CPU_MAX_TEMP
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in6_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp2_ for CPUTIN
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in11_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan6_ for temp0
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp3_ for mem
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/in0_ for vddgfx
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/freq1_ for sclk
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/fan1_ for temp0
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp1_ for edge
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/freq2_ for mclk
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp2_ for junction
cpu_sensor: nct6798/CPUTIN
name: nct6798/temp0
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan6_input
  temp:   0
  high:  80
  crit:  95

name: amdgpu/mem
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp3_input
  temp:  38
  high:  80
  crit: 105

name: nct6798/SYSTIN
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp1_input
  temp:  36
  high:  80
  crit:   0

name: nct6798/CPUTIN
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp2_input
  temp:  37
  high:  80
  crit:   0

name: amdgpu/vddgfx
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/in0_input
  temp:   0
  high:  80
  crit:  95

name: nct6798/AUXTIN1
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp4_input
  temp: -62
  high:  80
  crit:   0

name: amdgpu/junction
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp2_input
  temp:  41
  high:  80
  crit: 110

name: amdgpu/edge
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp1_input
  temp:  39
  high:  80
  crit: 110

name: nvme/Composite
  path: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp1_input
  temp:  39
  high:  84
  crit:  84

name: nct6798/AUXTIN4
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp7_input
  temp:  26
  high:  80
  crit:   0

name: nct6798/AUXTIN0
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp3_input
  temp:   8
  high:  80
  crit:   0

name: nct6798/PCH_CHIP_TEMP
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp10_input
  temp:   0
  high:  80
  crit:  95

name: nct6798/AUXTIN3
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp6_input
  temp:  31
  high:  80
  crit:   0

name: nct6798/AUXTIN2
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp5_input
  temp:  13
  high:  80
  crit:   0

name: amdgpu/temp0
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/fan1_input
  temp:   1
  high:   3
  crit:  95

name: amdgpu/sclk
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/freq1_input
  temp: 20000
  high:  80
  crit:  95

name: nct6798/PCH_MCH_TEMP
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp12_input
  temp:   0
  high:  80
  crit:  95

name: nvme/Sensor 2
  path: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp3_input
  temp:  36
  high: 65261
  crit:  95

name: nvme/Sensor 1
  path: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp2_input
  temp:  39
  high: 65261
  crit:  95

name: amdgpu/mclk
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/freq2_input
  temp: 96000
  high:  80
  crit:  95

name: nct6798/PCH_CHIP_CPU_MAX_TEMP
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp9_input
  temp:   0
  high:  80
  crit:  95

name: nct6798/PCH_CPU_TEMP
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp11_input
  temp:   0
  high:  80
  crit:  95

name: k10temp/Tctl
  path: /sys/devices/pci0000:00/0000:00:18.3/hwmon/hwmon2/temp1_input
  temp:  39
  high:  80
  crit:  95

name: k10temp/Tccd1
  path: /sys/devices/pci0000:00/0000:00:18.3/hwmon/hwmon2/temp3_input
  temp:  44
  high:  80
  crit:  95

name: nct6798/TSI0_TEMP
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp13_input
  temp:  39
  high:  80
  crit:  95

name: nct6798/SMBUSMASTER 0
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp8_input
  temp:  39
  high:  80
  crit:  95

imwints commented 11 months ago

Do you have a Ryzen (or other AMD) processor? If so, you should be able to edit the cpu_sensor field to

cpu_sensor = "k10temp/Tctl"

in ~/.config/btop/btop.conf. This should fix the problem.

I just quickly looked at the code which determines the temperature sensor, and it seems to prefer the wrong sensor. It might be that you gained some modules when switching to a newer kernel. If you happen to run the LTS kernel again sometime, it would be interesting to see the sensor output for that kernel.

putridpete commented 11 months ago

That did it! Can confirm that changing said value in ~/.config/btop/btop.conf fixes the problem and btop launches normally.

Edit: also, I forgot to answer that yes, I do have a Ryzen CPU, the Ryzen 5 5600X.

And sure, I ran it with the LTS kernel and btop 2> file outputs:

basepath: /sys/devices/pci0000:00/0000:00:18.3/hwmon/hwmon2/temp3_ for Tccd1
basepath: /sys/devices/pci0000:00/0000:00:18.3/hwmon/hwmon2/temp1_ for Tctl
basepath: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp3_ for Sensor 2
basepath: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp1_ for Composite
basepath: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp2_ for Sensor 1
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp6_ for AUXTIN3
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in3_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan3_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in7_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp3_ for AUXTIN0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in12_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in0_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan7_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp10_ for PCH_CPU_TEMP
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp7_ for SMBUSMASTER 0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in4_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan4_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in8_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp4_ for AUXTIN1
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in13_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in1_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp11_ for TSI0_TEMP
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan1_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp8_ for PCH_CHIP_CPU_MAX_TEMP
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in5_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp1_ for SYSTIN
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in10_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan5_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in9_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp5_ for AUXTIN2
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in14_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in2_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan2_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp9_ for PCH_CHIP_TEMP
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in6_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp2_ for CPUTIN
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/in11_ for temp0
basepath: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan6_ for temp0
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp3_ for mem
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/in0_ for vddgfx
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/freq1_ for sclk
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/fan1_ for temp0
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp1_ for edge
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/freq2_ for mclk
basepath: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp2_ for junction
cpu_sensor: nct6798/CPUTIN
name: nct6798/temp0
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/fan6_input
  temp:   0
  high:  80
  crit:  95

name: amdgpu/mem
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp3_input
  temp:  46
  high:  80
  crit: 105

name: nct6798/SYSTIN
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp1_input
  temp:  38
  high:  80
  crit:  95

name: amdgpu/sclk
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/freq1_input
  temp: 41000
  high:  80
  crit:  95

name: nvme/Sensor 2
  path: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp3_input
  temp:  38
  high: 65261
  crit:  95

name: nct6798/CPUTIN
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp2_input
  temp:  38
  high:  80
  crit:  95

name: nvme/Sensor 1
  path: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp2_input
  temp:  41
  high: 65261
  crit:  95

name: amdgpu/vddgfx
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/in0_input
  temp:   0
  high:  80
  crit:  95

name: amdgpu/mclk
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/freq2_input
  temp: 1000000
  high:  80
  crit:  95

name: nct6798/PCH_CHIP_CPU_MAX_TEMP
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp8_input
  temp:   0
  high:  80
  crit:  95

name: nct6798/PCH_CPU_TEMP
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp10_input
  temp:   0
  high:  80
  crit:  95

name: nct6798/AUXTIN1
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp4_input
  temp: -62
  high:  80
  crit:  95

name: amdgpu/junction
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp2_input
  temp:  46
  high:  80
  crit: 110

name: amdgpu/edge
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/temp1_input
  temp:  44
  high:  80
  crit: 110

name: nvme/Composite
  path: /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0/nvme/nvme0/hwmon0/temp1_input
  temp:  41
  high:  84
  crit:  84

name: k10temp/Tctl
  path: /sys/devices/pci0000:00/0000:00:18.3/hwmon/hwmon2/temp1_input
  temp:  39
  high:  80
  crit:  95

name: k10temp/Tccd1
  path: /sys/devices/pci0000:00/0000:00:18.3/hwmon/hwmon2/temp3_input
  temp:  40
  high:  80
  crit:  95

name: nct6798/TSI0_TEMP
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp11_input
  temp:  39
  high:  80
  crit:  95

name: nct6798/AUXTIN0
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp3_input
  temp:   8
  high:  80
  crit:  95

name: nct6798/PCH_CHIP_TEMP
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp9_input
  temp:   0
  high:  80
  crit:  95

name: nct6798/AUXTIN3
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp6_input
  temp:  31
  high:  80
  crit:  95

name: nct6798/AUXTIN2
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp5_input
  temp:  13
  high:  80
  crit:  95

name: nct6798/SMBUSMASTER 0
  path: /sys/devices/platform/nct6775.656/hwmon/hwmon3/temp7_input
  temp:  39
  high:  80
  crit:  95

name: amdgpu/temp0
  path: /sys/devices/pci0000:00/0000:00:03.1/0000:05:00.0/0000:06:00.0/0000:07:00.0/hwmon/hwmon1/fan1_input
  temp:   1
  high:   3
  crit:  95

This was done with cpu_sensor in ~/.config/btop/btop.conf reverted to Auto.

aristocratos commented 11 months ago

Nice troubleshooting!

We need to add a check of the crit and high variables at: https://github.com/aristocratos/btop/blob/9edbf27f1b6d5844a97325fcda732762ba086a99/src/linux/btop_collect.cpp#L319-L322

to properly sanitize any garbage values from the sensors.
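
A rough sketch of what such a check could look like. Everything here is an assumption for illustration: the SensorLimits struct and sanitize_limits function are hypothetical, the 80/95 fallbacks are picked to match the values most sensors show in the dumps above, and the 150 degree plausibility ceiling is arbitrary. It is not the actual fix.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the two thresholds of one sensor, in degrees Celsius.
struct SensorLimits {
    int64_t high;
    int64_t crit;
};

// Hypothetical sanitizer: replaces missing or implausible thresholds with
// fallbacks so that crit is never 0 when it is later used as a divisor.
SensorLimits sanitize_limits(int64_t high, int64_t crit) {
    constexpr int64_t default_high = 80;   // assumed fallback
    constexpr int64_t default_crit = 95;   // assumed fallback
    constexpr int64_t max_plausible = 150; // assumed ceiling for a sane threshold

    if (crit <= 0 || crit > max_plausible) crit = default_crit;
    if (high <= 0 || high > max_plausible) high = default_high;
    if (high > crit) high = crit; // keep the pair ordered

    assert(crit > 0 && high <= crit);
    return {high, crit};
}
```

Applied to the zen kernel dump, nct6798/CPUTIN (high 80, crit 0) would come out as 80/95, and the nvme sensors reporting high 65261 would fall back to 80.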

imwints commented 11 months ago

And the k10temp sensor should probably take precedence over CPUTIN (which, if I read that correctly, is a motherboard sensor) on Ryzen CPUs?
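
One way that precedence could be expressed, as a sketch only: pick_cpu_sensor is a hypothetical function, and the two-step order (k10temp/Tctl first, then any sensor labeled CPUTIN) is just the ordering suggested in this thread, not btop's actual selection logic.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical selection: prefer the CPU's own k10temp/Tctl reading and only
// fall back to a motherboard sensor labeled CPUTIN (e.g. "nct6798/CPUTIN").
std::optional<std::string> pick_cpu_sensor(const std::vector<std::string>& names) {
    const std::string cpu_internal = "k10temp/Tctl";
    const std::string board_suffix = "/CPUTIN";

    // First pass: the sensor exposed by the k10temp driver.
    for (const auto& name : names)
        if (name == cpu_internal) return name;

    // Second pass: any "<chip>/CPUTIN" motherboard sensor.
    for (const auto& name : names) {
        if (name.size() >= board_suffix.size() &&
            name.compare(name.size() - board_suffix.size(),
                         board_suffix.size(), board_suffix) == 0)
            return name;
    }
    return std::nullopt;
}
```

With the sensor list from either kernel above, this picks k10temp/Tctl even though nct6798/CPUTIN is also present.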