cyring / CoreFreq

CoreFreq : CPU monitoring and tuning software designed for 64-bit processors.
https://www.cyring.fr
GNU General Public License v2.0

corefreq-cli -d receives SIGSEGV and SIGBUS #125

Closed JohnAZoidberg closed 5 years ago

JohnAZoidberg commented 5 years ago

When I run corefreq-cli -d it sometimes crashes before showing anything and sometimes it gets to show the output for a second and then crashes. All other options seem to be running fine - here's my system output.

$ ./corefreq-cli -d
fish: “./corefreq-cli -d” terminated by signal SIGSEGV (Address boundary error)

Fish is the name of my shell.

$  gdb --args corefreq-cli -d
(gdb) run
Starting program: /home/zoid/media/clone/reference/CoreFreq/corefreq-cli -d
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libthread_db.so.1".

Program received signal SIGBUS, Bus error.
0x0000000000407628 in StateToSymbol ()
(gdb) where
#0  0x0000000000407628 in StateToSymbol ()
#1  0x0000000000407d5e in Draw_Card_Task ()
#2  0x0000000000402c39 in Draw_Dashboard ()
#3  0x000000000042686d in Top ()
#4  0x0000000000402a3f in main ()

Yes, I booted with nmi_watchdog=0 on the kernel commandline. I built from the latest master 1b2d21ffbd52703cca6f0e814a01bc69edae6de2.

cyring commented 5 years ago

Thanks for your issues and PR. I will take time this weekend to process them and will come back to you ASAP.

JohnAZoidberg commented 5 years ago

Is there any other information I can provide you with?

cyring commented 5 years ago

Yes:

  1. In the screenshots the client background color is white. Is that a terminal setting?
  2. The single-core Turbo Boost is supposed to reach 3.4 GHz. Are you stressing with my internal tool?
  3. According to the mwait sub-states, your processor is capable of going down to C10. I'm improving the idle sub-driver. Can you test this module option and check that the resulting Core and Package C-States are properly entered?

I appreciate your help a lot. Thank you.

JohnAZoidberg commented 5 years ago

  1. Yes, this is the Solarized Light theme of my terminal. It's not white, it's #fdf6e3.
  2. I'm stressing it with the stress tool: stress --cpu 1. How do I use the internal stress tester?
  3. I should use Register_CPU_Idle=1? But for that I also need to blacklist Intel's idle driver? I did that:
    $ modprobe -c | grep corefreq
    options corefreqk Register_CPU_Idle=1
    $ journalctl -k | grep intel_cstate
    Jun 07 23:57:18 think-nix kernel: Kernel command line: initrd=\efi\nixos\initrd.efi loglevel=4 nmi_watchdog=0 modprobe.blacklist=intel_cstate idle=halt intel_idle.max_cstate=0 mitigations=off transparent_hugepage=never

    But corefreq-cli still only shows states up to C7. Is that what you meant? Ohhh, I just noticed: it doesn't show another C-state, but the dashboard doesn't crash! Edit: Actually it does, but it ran much, much longer the first time. Now it behaves the same as before and crashes after just a few seconds of running.

I'm running Linux 5.1.5.

JohnAZoidberg commented 5 years ago

I have to use -DCONFIG_CPU_IDLE to build the idle driver into the kernel module, right? Now it shows:

$ cat /sys/devices/system/cpu/cpuidle/current_driver
corefreqk-idle

but corefreq-cli is still the same.
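
For anyone who wants to script the check above rather than run cat by hand, here is a minimal sketch in C. It assumes only the sysfs path shown in this comment; the helper names read_sysfs_line and cpuidle_driver_is are hypothetical and are not part of CoreFreq.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the first line of a sysfs attribute into buf,
 * without the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or is empty. */
int read_sysfs_line(const char *path, char *buf, size_t size)
{
    FILE *fp;

    /* Passing no buffer is a programming error, not a runtime condition. */
    assert(path != NULL && buf != NULL && size > 0);

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(buf, (int)size, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    buf[strcspn(buf, "\n")] = '\0';   /* strip the newline, if any */
    return 0;
}

/* Hypothetical helper: compare the active cpuidle driver with `expected`.
 * Returns 1 on a match, 0 on a mismatch, -1 if the attribute is unreadable.
 * On a live system, path would be
 * /sys/devices/system/cpu/cpuidle/current_driver. */
int cpuidle_driver_is(const char *path, const char *expected)
{
    char name[64];

    if (read_sysfs_line(path, name, sizeof name) != 0)
        return -1;
    return strcmp(name, expected) == 0 ? 1 : 0;
}
```

Calling cpuidle_driver_is("/sys/devices/system/cpu/cpuidle/current_driver", "corefreqk-idle") would then report whether the CoreFreq idle sub-driver is the one currently registered. The path is passed in as a parameter so the helper can be exercised against an ordinary file.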

cyring commented 5 years ago

Thanks for your feedback.

  1. corefreqk-idle: the sub-driver has been registered correctly.

Press [k] (kernel data) or [s] (settings) to see confirmation.

C7 is so far the deepest Core C-State implemented in the monitoring view. However, press [g] to monitor the Package C-States down to C10.

You can also try to register the CPU frequency sub-driver. You will probably need to blacklist the other ACPI, pcc, and p-state drivers to make room for CoreFreq. Once it is registered, you have full control to set the min, max, and target HWP frequencies (press [p]), and also the energy policy in the Power window.

Remark: when changing HWP policies and frequencies, watch the resulting Vcore and your highest Turbo performance frequency ratio. You may find, for example, the optimum processor settings for around-the-clock (H24) use.

  2. About the integrated stressing function: just press [F3] and choose among the algorithms. A red dot appears beside the LCD while a stressing function is running. Press [F10] to stop stressing at any time.

  3. The UI has been designed for a black background, but I have centralized the theme resources in header files to let users customize them.

If you need a full "color" theme, I can refactor the code to instantiate a new set and provide a chooser option.

JohnAZoidberg commented 5 years ago

Now with the idle driver, sometimes it crashes on the regular command and won't start working again until a reboot:

$ gdb ./corefreq-cli
Reading symbols from ./corefreq-cli...done.
(gdb) run
Starting program: /home/zoid/media/clone/reference/CoreFreq/corefreq-cli
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libthread_db.so.1".
*** buffer overflow detected ***: /home/zoid/media/clone/reference/CoreFreq/corefreq-cli terminated

Program received signal SIGABRT, Aborted.
0x00007ffff7cafbe0 in raise () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
(gdb) where
#0  0x00007ffff7cafbe0 in raise () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
#1  0x00007ffff7cb0dc1 in abort () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
#2  0x00007ffff7cf12ac in __libc_message () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
#3  0x00007ffff7d7df68 in __fortify_fail_abort () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
#4  0x00007ffff7d7df81 in __fortify_fail () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
#5  0x00007ffff7d7c070 in __chk_fail () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
#6  0x00007ffff7d7b5d9 in _IO_str_chk_overflow () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
#7  0x00007ffff7cf51cb in _IO_default_xsputn () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
#8  0x00007ffff7cc7d01 in vfprintf () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
#9  0x00007ffff7d7b66c in __vsprintf_chk () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
#10 0x00007ffff7d7b5bd in __sprintf_chk () from /nix/store/681354n3k44r8z90m35hm8945vsp95h1-glibc-2.27/lib/libc.so.6
#11 0x000000000042476b in Layout_Ruller_Load ()
#12 0x0000000000425589 in Layout_Header_DualView_Footer ()
#13 0x00000000004269a5 in Top ()
#14 0x0000000000402a4f in main ()
(gdb)

It also crashes when I show the kernel config and then the settings, i.e. press [k] and then [s].
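
A note on reading the backtrace above: `__sprintf_chk` is the checked variant that glibc substitutes for `sprintf` when code is built with `_FORTIFY_SOURCE`, and it aborts with "buffer overflow detected" when the formatted text would not fit in a destination whose size the compiler knows. The usual remedy is bounded formatting with `snprintf` plus a truncation check. The sketch below is only an illustration of that technique; `format_cell` is a hypothetical helper, not CoreFreq code, and this is not necessarily how the project addressed the crash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from CoreFreq): format an unsigned value into a
 * fixed-size cell of a text layout.
 * Returns 0 when the text fits, -1 when it had to be truncated or when
 * formatting failed. The destination is always NUL-terminated on truncation. */
int format_cell(char *dst, size_t size, unsigned int value)
{
    int n;

    /* A missing or zero-sized buffer is a programming error. */
    assert(dst != NULL && size > 0);

    /* snprintf never writes more than `size` bytes, terminator included,
     * and returns the length the full text would have needed. */
    n = snprintf(dst, size, "%u", value);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

Unlike an unchecked `sprintf`, an oversized value here yields a truncated string and an error code that the caller can handle, instead of a write past the end of the buffer.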

cyring commented 5 years ago

$ grep FLAG /etc/makepkg.conf
# ARCHITECTURE, COMPILE FLAGS
CPPFLAGS="-D_FORTIFY_SOURCE=2"
CFLAGS="-march=westmere -mtune=westmere -O2 -pipe -fno-plt"
CXXFLAGS="-march=westmere -mtune=westmere -O2 -pipe -fno-plt"
LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now"
MAKEFLAGS="-j16"
DEBUG_CFLAGS="-g -fvar-tracking-assignments"
DEBUG_CXXFLAGS="-g -fvar-tracking-assignments"

cyring commented 5 years ago

The latest commit may fix the idle-state enumeration on your processor architecture:

insmod corefreqk.ko Register_CPU_Idle=1

cyring commented 5 years ago

Can I close the issue if the latest version is stable?