Closed desbma closed 10 months ago
I can provide the core dump, or my Conky config if needed.
Please do that in the first place, as it would have helped. If you're having this issue, open a terminal and run gdb conky, then (gdb) run -c ~/your_conky.conf. Wait for a crash to occur, then do (gdb) bt full to get a backtrace. We want that backtrace. Also, don't forget your conky config. Thanks.
Will do, but you already have a stack trace above which shows abort is called from malloc. It seems to occur mostly when I am under high CPU load.
My config: https://pastebin.com/raw/d7C5KQXr
We need the bt full output to debug even further.
Last 2 core dumps:
Without a debug build, bt full is useless:
$ gdb $(which conky)
GNU gdb (GDB) 8.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/conky...(no debugging symbols found)...done.
(gdb) core conky_1.core
[New LWP 1364]
[New LWP 1397]
[New LWP 1400]
[New LWP 1401]
[New LWP 1402]
[New LWP 1403]
[New LWP 1404]
[New LWP 1406]
[New LWP 1407]
[New LWP 1471]
[New LWP 1472]
[New LWP 1398]
[New LWP 1405]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `/usr/bin/conky'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007fac1ba19b5f in raise () from /usr/lib/libc.so.6
[Current thread is 1 (Thread 0x7fac153a8f40 (LWP 1364))]
(gdb) bt full
#0 0x00007fac1ba19b5f in raise () from /usr/lib/libc.so.6
No symbol table info available.
#1 0x00007fac1ba04452 in abort () from /usr/lib/libc.so.6
No symbol table info available.
#2 0x00007fac1ba5c658 in __libc_message () from /usr/lib/libc.so.6
No symbol table info available.
#3 0x00007fac1ba62f6a in malloc_printerr () from /usr/lib/libc.so.6
No symbol table info available.
#4 0x00007fac1ba664c0 in _int_malloc () from /usr/lib/libc.so.6
No symbol table info available.
#5 0x00007fac1ba68516 in calloc () from /usr/lib/libc.so.6
No symbol table info available.
#6 0x0000564ee54ee4d0 in construct_text_object(char*, char const*, long, void**, void*) ()
No symbol table info available.
#7 0x0000564ee54f7950 in extract_variable_text_internal(text_object*, char const*) ()
No symbol table info available.
#8 0x0000564ee54ea138 in evaluate(char const*, char*, int) ()
No symbol table info available.
#9 0x0000564ee54f904c in print_exec(text_object*, char*, int) ()
No symbol table info available.
#10 0x0000564ee54e8743 in generate_text_internal(char*, int, text_object) ()
No symbol table info available.
#11 0x0000564ee54e89e9 in ?? ()
No symbol table info available.
#12 0x0000564ee54e9d40 in ?? ()
No symbol table info available.
#13 0x0000564ee54d724b in main ()
No symbol table info available.
Can you reproduce the crash on the current master branch?
git clone https://github.com/brndnmtthws/conky
cd conky
mkdir -p build
cd build
cmake ..
make -j4 # 4 cores to run in parallel
To add the debugging flag:
diff --git a/cmake/ConkyBuildOptions.cmake b/cmake/ConkyBuildOptions.cmake
index e9584c9d..c4602c25 100644
--- a/cmake/ConkyBuildOptions.cmake
+++ b/cmake/ConkyBuildOptions.cmake
@@ -32,8 +32,8 @@ if(NOT CMAKE_BUILD_TYPE)
endif(NOT CMAKE_BUILD_TYPE)
# -std options for all build types
-set(CMAKE_C_FLAGS "-std=c99 ${CMAKE_C_FLAGS}" CACHE STRING "Flags used by the C compiler during all build types." FORCE)
-set(CMAKE_CXX_FLAGS "-std=c++17 ${CMAKE_CXX_FLAGS}" CACHE STRING "Flags used by the C++ compiler during all build types." FORCE)
+set(CMAKE_C_FLAGS "-std=c99 -g2 ${CMAKE_C_FLAGS}" CACHE STRING "Flags used by the C compiler during all build types." FORCE)
+set(CMAKE_CXX_FLAGS "-std=c++17 -g2 ${CMAKE_CXX_FLAGS}" CACHE STRING "Flags used by the C++ compiler during all build types." FORCE)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
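A possible alternative to patching ConkyBuildOptions.cmake is to pass the same flag at configure time. This is only a sketch: it assumes the flags given on the command line are kept, which the set() calls in the diff above suggest, since they re-append the existing ${CMAKE_C_FLAGS} and ${CMAKE_CXX_FLAGS}. The function name and the DRY_RUN switch are invented for this example and are not part of conky's build system.

```shell
# Hypothetical helper: configure conky with -g2 without editing any CMake file.
# Assumption: it is called from the build/ directory created in the steps above.
configure_with_debug_info() {
    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is set and non-empty,
    # so the command is printed instead of executed.
    ${DRY_RUN:+echo} cmake \
        -DCMAKE_C_FLAGS=-g2 \
        -DCMAKE_CXX_FLAGS=-g2 \
        ..
}
```

Calling configure_with_debug_info and then make -j4 as before should give a binary with symbols; DRY_RUN=1 configure_with_debug_info only prints the cmake command line.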
I have recompiled my current version with -ggdb. Sorry, I don't have time to test with master now, as I would have to update the Lua patch the package carries.
I will report back the next time it crashes.
3 more core dumps with conky compiled with -ggdb3:
Output of bt full:
$ gdb $(which conky)
GNU gdb (GDB) 8.1.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/conky...done.
(gdb) core core_3
[New LWP 1330]
[New LWP 1354]
[New LWP 1468]
[New LWP 1356]
[New LWP 1367]
[New LWP 1363]
[New LWP 1370]
[New LWP 1353]
[New LWP 1366]
[New LWP 1371]
[New LWP 1466]
[New LWP 1355]
[New LWP 1368]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `/usr/bin/conky'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fee00bfed53 in free () from /usr/lib/libc.so.6
[Current thread is 1 (Thread 0x7fedfab3bf40 (LWP 1330))]
(gdb) bt full
#0 0x00007fee00bfed53 in free () from /usr/lib/libc.so.6
No symbol table info available.
#1 0x00007fee00badb19 in _nl_make_l10nflist.localalias.0 () from /usr/lib/libc.so.6
No symbol table info available.
#2 0x00007fee00bab549 in _nl_find_domain () from /usr/lib/libc.so.6
No symbol table info available.
#3 0x00007fee00baae73 in __dcigettext () from /usr/lib/libc.so.6
No symbol table info available.
#4 0x0000555beae157eb in human_readable (num=<optimized out>, buf=0x555bebb8d5bc "", size=15380) at /tmp/makepkg/conky-nvidia/src/conky-1.10.8/src/conky.cc:799
suffix = <optimized out>
fnum = <optimized out>
precision = 2
width = 5
format = 0x555beae859b3 "%.*f%.1s"
#5 0x0000555beae13b83 in generate_text_internal (p=0x555bebb8d5bc "", p@entry=0x555bebb8d1d0 "\001\001CPU \001\001\n\001Global: \001 58°C\001\001\061\061%\n\001\n\001Load avg 1min: \001\001\061,45\n\n\001Core 0: \001\063,47 GHz \001\001 53°C\001\n\001\001 4%\n\001Core 1: \001\063,47 GHz \001\001 55°C\001\n\001\001 8%\n\001Core 2: \001\063,47 GHz \001\001 50°C\001\n\001\001 2%\n\001Core 3: \001\063,47 GHz \001\001 62°C\001\n\001\001"..., p_max_size=15380, root=...) at /tmp/makepkg/conky-nvidia/src/conky-1.10.8/src/conky.cc:872
obj = 0x555bebb88d40
a = <optimized out>
#6 0x0000555beae13e29 in generate_text () at /usr/include/c++/8.2.0/bits/unique_ptr.h:342
i = <optimized out>
k = <optimized out>
mw = <optimized out>
tbs = <optimized out>
ui = <optimized out>
p = 0x555bebb8d1d0 "\001\001CPU \001\001\n\001Global: \001 58°C\001\001\061\061%\n\001\n\001Load avg 1min: \001\001\061,45\n\n\001Core 0: \001\063,47 GHz \001\001 53°C\001\n\001\001 4%\n\001Core 1: \001\063,47 GHz \001\001 55°C\001\n\001\001 8%\n\001Core 2: \001\063,47 GHz \001\001 50°C\001\n\001\001 2%\n\001Core 3: \001\063,47 GHz \001\001 62°C\001\n\001\001"...
j = <optimized out>
time = <optimized out>
p = <optimized out>
i = <optimized out>
j = <optimized out>
k = <optimized out>
mw = <optimized out>
tbs = <optimized out>
ui = <optimized out>
time = <optimized out>
tmp_p = <optimized out>
#7 update_text() () at /tmp/makepkg/conky-nvidia/src/conky-1.10.8/src/conky.cc:2044
No locals.
#8 0x0000555beae15180 in main_loop() () at /tmp/makepkg/conky-nvidia/src/conky-1.10.8/src/conky.cc:2145
fdsr = {fds_bits = {0 <repeats 16 times>}}
tv = {tv_sec = 0, tv_usec = 0}
s = <optimized out>
terminate = 0
t = <optimized out>
inotify_config_wd = 1
inotify_buff = '\000' <repeats 120 times>, "\326\r\031\001\356\177", '\000' <repeats 34 times>, "\210\376$\001\356\177\000\000\371\204\260\002\356\177\000\000@\205\260\002\356\177\000\000\210\376$\001\356\177\000\000\220\207\260\353[U\000\000"...
#9 0x0000555beae026ab in main () at /tmp/makepkg/conky-nvidia/src/conky-1.10.8/src/conky.cc:3211
curl_global = <optimized out>
#10 0x00007fee00b9d003 in __libc_start_main () from /usr/lib/libc.so.6
No symbol table info available.
#11 0x0000555beae0696e in _start () at /tmp/makepkg/conky-nvidia/src/conky-1.10.8/src/conky.cc:3263
No symbol table info available.
(gdb)
Any progress or ideas on this?
I can still reproduce the crash frequently. If that helps, I can recompile with different flags (ASAN?), as long as it does not hurt performance too much.
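For reference, an ASAN build could be configured roughly as in the sketch below. -fsanitize=address and -fno-omit-frame-pointer are the standard GCC/Clang flags for AddressSanitizer; the function name and the DRY_RUN switch are made up for the example, and it assumes the build/ directory from the build instructions earlier in the thread. AddressSanitizer adds noticeable CPU and memory overhead, so whether that is acceptable for an always-running Conky would have to be measured.

```shell
# Hypothetical helper: configure an AddressSanitizer build of conky.
# Assumption: it is called from the build/ directory of a conky checkout.
configure_with_asan() {
    # -g keeps file and line information in the sanitizer reports.
    asan_flags="-fsanitize=address -fno-omit-frame-pointer -g"
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} cmake \
        "-DCMAKE_C_FLAGS=$asan_flags" \
        "-DCMAKE_CXX_FLAGS=$asan_flags" \
        ..
}
```

With such a build, an invalid free() or a heap overflow should be reported at the point where it happens, rather than later inside malloc as in the first backtrace.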
Here's what you should do. Instead of starting conky the normal way, start it from gdb:
gdb /path/to/conky
# Once inside run it with the config you have
run -c /path/to/conky.conf
# wait for it to crash
bt full
# save the coredump
generate-core-file
# see which hex address contains the call to "free"
# in your bt full output it was something like this
#0 0x00007fee00bfed53 in free () from /usr/lib/libc.so.6
# this is just an example hex address
list *0x0000555beae157eb
edit:
By the way, the core dumps are empty on my machine: I loaded them with core this_core and nothing shows up. Next time, attach the compiled conky binary too, so we can debug it inside gdb.
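The interactive session above can also be run unattended with gdb's batch mode. The following is a minimal sketch, not something taken from the thread: the function name, the DRY_RUN switch, and both path arguments are placeholders.

```shell
# Hypothetical helper: run conky under gdb non-interactively and print a
# full backtrace once the program stops (for example on SIGSEGV or SIGABRT).
# $1 = path to the conky binary, $2 = path to the conky config.
run_conky_under_gdb() {
    # -batch makes gdb execute the -ex commands in order and then exit.
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} gdb -batch \
        -ex run \
        -ex "bt full" \
        --args "$1" -c "$2"
}
```

Redirecting the output to a file (for example run_conky_under_gdb /path/to/conky /path/to/conky.conf > conky-gdb.log 2>&1) would keep the backtrace around even if nobody is watching when the crash happens.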
@su8
Here is my /usr/bin/conky binary: https://www.dropbox.com/s/ozir20nkmvz71ld/conky.xz?dl=1
Unfortunately, it is not easy for me to run conky in gdb and "wait for it to crash". I start conky at boot with a systemd service, and sometimes it crashes and I can then get the core dump, but sometimes I can run it for hours or days without issues.
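If the crashes of the systemd service are captured by systemd-coredump (an assumption; the thread does not say how the core files were obtained), the backtrace can be extracted after the fact, without keeping gdb attached for days. A rough sketch, with a made-up function name and DRY_RUN switch:

```shell
# Hypothetical helper: export the latest conky core dump and print backtraces.
# $1 = path to the conky binary that crashed, $2 = where to write the core file.
backtrace_from_last_core() {
    # Extract the most recent core dump matching "conky" from systemd-coredump.
    # With DRY_RUN set, both commands are printed instead of executed.
    ${DRY_RUN:+echo} coredumpctl dump conky --output="$2" || return
    # Print a full backtrace for every thread, then exit.
    ${DRY_RUN:+echo} gdb -batch -ex "thread apply all bt full" "$1" "$2"
}
```

The binary passed as $1 has to be the exact build that produced the core dump, otherwise the addresses will not resolve to the right symbols.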
Okay, thanks for the binary. Can you open the core dump that has free in it and run list *0xaddress, where 0xaddress is the address attached to free? I think it's the 3rd core dump.
(gdb) list *0x0000555beae157eb
0x555beae157eb is in human_readable(long long, char*, int) (/tmp/makepkg/conky-nvidia/src/conky-1.10.8/src/conky.cc:799).
794 if (fnum < 99.95)
795 precision = 1; /* print 10-99 with one decimal place */
796 if (fnum < 9.995)
797 precision = 2; /* print 0-9 with two decimal places */
798
799 spaced_print(buf, size, format, width, precision, fnum, _(*suffix));
800 }
801
802 /* global object list root element */
803 static struct text_object global_root_object
Thanks, can you list *0x00007fee00bfed53?
No output (probably because glibc does not have debug symbols):
(gdb) list *0x00007fee00bfed53
(gdb)
In case you are fluent with x86 assembly:
(gdb) x/16i 0x00007fee00bfed53
=> 0x7fee00bfed53 <_mid_memalign+307>: nopl 0x0(%rax,%rax,1)
0x7fee00bfed58 <_mid_memalign+312>: mov 0x1380e1(%rip),%rax # 0x7fee00d36e40
0x7fee00bfed5f <_mid_memalign+319>: xor %edx,%edx
0x7fee00bfed61 <_mid_memalign+321>: movl $0x16,%fs:(%rax)
0x7fee00bfed68 <_mid_memalign+328>: jmp 0x7fee00bfed28 <_mid_memalign+264>
0x7fee00bfed6a <_mid_memalign+330>: nopw 0x0(%rax,%rax,1)
0x7fee00bfed70 <_mid_memalign+336>: mov 0x137f91(%rip),%rax # 0x7fee00d36d08
0x7fee00bfed77 <_mid_memalign+343>: mov %fs:(%rax),%r12
0x7fee00bfed7b <_mid_memalign+347>: test %r12,%r12
0x7fee00bfed7e <_mid_memalign+350>: je 0x7fee00bfee5a <_mid_memalign+570>
0x7fee00bfed84 <_mid_memalign+356>: mov $0x1,%esi
0x7fee00bfed89 <_mid_memalign+361>: xor %eax,%eax
0x7fee00bfed8b <_mid_memalign+363>: cmpl $0x0,0x13d79e(%rip) # 0x7fee00d3c530 <__libc_multiple_threads>
0x7fee00bfed92 <_mid_memalign+370>: je 0x7fee00bfed9e <_mid_memalign+382>
0x7fee00bfed94 <_mid_memalign+372>: lock cmpxchg %esi,(%r12)
0x7fee00bfed9a <_mid_memalign+378>: jne 0x7fee00bfeda5 <_mid_memalign+389
I am on Arch Linux, and packages get updated very often to the latest upstream version, so library addresses no longer match for some libraries compared to older core dumps:
warning: .dynamic section for "/usr/lib/libX11.so.6" is not at the expected address (wrong library or version mismatch?)
warning: .dynamic section for "/usr/lib/libX11-xcb.so.1" is not at the expected address (wrong library or version mismatch?)
warning: .dynamic section for "/usr/lib/libgraphite2.so.3" is not at the expected address (wrong library or version mismatch
Fortunately, glibc is still the same version as the one the core dump was generated with.
Thanks for the information. I was hoping that it would lead us in the right direction and point us to the incorrect free() usage in the code.
human_readable() works fine when most variables are set to 0; it didn't cause any segfaults.
FYI the crash still occurs with Conky 1.11.2.
This issue is stale because it has been open 365 days with no activity. Remove stale label or comment, or this issue will be closed in 30 days.
This issue was closed because it has been stalled for 30 days with no activity.
I am experiencing random Conky crashes on Arch Linux, using Conky 1.10.8, from this package.
I am starting Conky with a systemd service, here is the full log:
I can provide the core dump, or my Conky config if needed.