yanqi27 / core_analyzer

A power tool to debug memory-related issues

error encountered using heap command #96

Closed jqguo367 closed 9 months ago

jqguo367 commented 1 year ago

I successfully built gdb 7.11.1, but when I use it to analyze the heap of a core file, an error is reported. The core file was generated with the gcore command from a process that was running well, so I believe there is no memory corruption. The glibc version is 2.17.

    (gdb) heap
    Failed to read heap_info at 0xec000000
    Tuning params & stats:
        mmap_threshold=131072
        pagesize=4096
        n_mmaps=0
        n_mmaps_max=65536
        total mmap regions created=0
        mmapped_mem=0
        sbrk_base=0xa268000
    Main arena (0xf73a5420) owns regions:
        [0x55202b2b354499c0 - 0x55202b2b354aa9b0] Total 387KB
    Failed to get the first chunk at 0x55202b2b354499b0
    Dynamic arena (0xede00010) owns regions:
    1 Errors encountered while walking the heap!
    [Error] Failed to walk heap

    (gdb) heap /v
    Tuning params & stats:
        mmap_threshold=131072
        pagesize=4096
        n_mmaps=0
        n_mmaps_max=65536
        total mmap regions created=0
        mmapped_mem=0
        sbrk_base=0xa268000
    Main arena (0xf73a5420) owns regions:
        [0x55202b2b354499c0 - 0x55202b2b354aa9b0] Total 387KB
    Failed to get the first chunk at 0x55202b2b354499b0
    Dynamic arena (0xede00010) owns regions:
    1 Errors encountered while walking the heap!
    [Error] Failed to walk heap

yanqi27 commented 1 year ago

It looks like core analyzer isn't parsing the heap data correctly. What is the platform/OS version?

jqguo367 commented 1 year ago

    cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 7.8 (Maipo)

yanqi27 commented 1 year ago

RHEL 7.8 is fairly old, so something may have broken there. I will try it later.

jqguo367 commented 1 year ago

Why does the OS release also affect heap walking?

yanqi27 commented 1 year ago

Different OS releases ship different runtimes, including different ptmalloc versions.
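Since the C library version matters here, it can help to record which libc the analysis host itself runs before comparing it with the one that produced the core. A minimal sketch in Python, using only the standard library; the helper name `host_libc` is made up for illustration and is not part of core analyzer:

```python
import platform


def host_libc():
    """Return (library name, version) of the C library the running
    Python interpreter is linked against, as reported by platform.libc_ver().

    Both strings may be empty if the library cannot be determined.
    """
    name, version = platform.libc_ver()
    return name, version


if __name__ == "__main__":
    name, version = host_libc()
    if name:
        print(f"host C library: {name} {version}")
    else:
        print("host C library could not be determined")
```

This only describes the machine running the script. The libc that matters for heap walking is the one the dumped process was using.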

jqguo367 commented 1 year ago

I also hit an error on RHEL 8.7. The gdb I built successfully there is gdb 8.1.

The glibc version is 2.28.

    cat /etc/redhat-release
    Red Hat Enterprise Linux release 8.7 (Ootpa)

    (gdb) heap
    Failed to extract heap metadata from gv mp_

    == The memory manager is assumed to be glibc 2.28 ==
    == If this is not true, please debug with another machine with matching glibc ==

    [Error] Failed to walk heap
    (gdb) heap /v
    [Error] Failed to walk heap
    (gdb) p mp_
    $1 = {trim_threshold = 131072, top_pad = 131072, mmap_threshold = 131072, arena_test = 8, arena_max = 0, n_mmaps = 0, n_mmaps_max = 65536, max_n_mmaps = 0, no_dyn_threshold = 0, mmapped_mem = 0, max_mmapped_mem = 0, sbrk_base = 0x0, tcache_bins = 64, tcache_max_bytes = 1032, tcache_count = 7, tcache_unsorted_limit = 0}
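The `p mp_` printout above can be turned into a dictionary to inspect which fields this glibc build exposes. The sketch below is an illustration in Python, not how core analyzer reads `mp_`; the function name `parse_gdb_struct` is hypothetical, and it only handles flat structs with integer or hex values like the one shown. The `tcache_*` members are present because tcache was introduced in glibc 2.26. Note also that `sbrk_base` is `0x0` here, whereas the earlier report showed `0xa268000`; whether that relates to the failure is not established in this thread.

```python
import re

# Text copied from the `p mp_` output in this report (glibc 2.28).
MP_DUMP = (
    "$1 = {trim_threshold = 131072, top_pad = 131072, mmap_threshold = 131072, "
    "arena_test = 8, arena_max = 0, n_mmaps = 0, n_mmaps_max = 65536, "
    "max_n_mmaps = 0, no_dyn_threshold = 0, mmapped_mem = 0, "
    "max_mmapped_mem = 0, sbrk_base = 0x0, tcache_bins = 64, "
    "tcache_max_bytes = 1032, tcache_count = 7, tcache_unsorted_limit = 0}"
)

# Matches "name = 123" or "name = 0xabc"; the leading "$1 = {" does not match
# because "{" is not a number.
FIELD = re.compile(r"(\w+)\s*=\s*(0x[0-9a-fA-F]+|\d+)")


def parse_gdb_struct(text):
    """Parse a flat GDB struct printout into a dict of field name -> int."""
    return {name: int(value, 0) for name, value in FIELD.findall(text)}


fields = parse_gdb_struct(MP_DUMP)
tcache_fields = sorted(name for name in fields if name.startswith("tcache_"))

if __name__ == "__main__":
    print("sbrk_base =", hex(fields["sbrk_base"]))
    print("tcache fields:", ", ".join(tcache_fields))
```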

yanqi27 commented 1 year ago

@jqguo367 Could you try gdb 12.1? It is the most tested version.

jqguo367 commented 1 year ago

I don't have an environment where I can build gdb 12.1. My build failed, and I have no permission to install makeinfo.

    /home/jqguo/gdbplus/core_analyzer-2.23.0/build/gdb-12.1/missing: line 81: makeinfo: command not found
    WARNING: 'makeinfo' is missing on your system.
             You should only need it if you modified a '.texi' file, or
             any other file indirectly affecting the aspect of the manual.
             You might want to install the Texinfo package:
             http://www.gnu.org/software/texinfo/
             The spurious makeinfo call might also be the consequence of
             using a buggy 'make' (AIX, DU, IRIX), in which case you might
             want to install GNU make:
             http://www.gnu.org/software/make/
    make[3]: *** [doc/bfd.info] Error 127

Celthi commented 1 year ago

When the heap command fails, the first thing to check is usually whether the libc debug symbols are correctly installed. I wonder whether we could check this in the code, or print some useful information when the heap walk fails, to make troubleshooting easier. @yanqi27 do you think we can enhance the output message?

Some troubleshooting tips:

  1. Run info shared to see whether the debug symbols for the libraries are loaded correctly.
  2. Confirm the library version of the core dump.
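
The first tip can be partly automated by scanning the text that info shared prints. GDB marks libraries that lack debugging information with `(*)` in the "Syms Read" column and shows `No` when symbols were not read. The following Python sketch assumes that column layout; the sample text, its paths and addresses, and the function name are all made up for illustration:

```python
import re

# Hypothetical sample of `info sharedlibrary` output; paths and addresses are invented.
SAMPLE = """\
From                To                  Syms Read   Shared Object Library
0x00007ffff7dd5f10  0x00007ffff7df4b20  Yes         /lib64/ld-linux-x86-64.so.2
0x00007ffff7a2d8b0  0x00007ffff7b80b04  Yes (*)     /lib64/libc.so.6
                                        No          /opt/app/libmissing.so
(*): Shared library is missing debugging information.
"""

ROW = re.compile(
    r"^\s*(?:0x[0-9a-fA-F]+\s+0x[0-9a-fA-F]+\s+)?"  # optional From/To addresses
    r"(Yes|No)\s+"                                   # Syms Read column
    r"(\(\*\)\s+)?"                                  # optional missing-debug-info marker
    r"(\S.*)$"                                       # library path
)


def libraries_needing_attention(text):
    """Return (path, reason) pairs for libraries without usable debug symbols."""
    problems = []
    for line in text.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header, footnote, or blank line
        loaded, starred, path = match.group(1), match.group(2), match.group(3).strip()
        if loaded == "No":
            problems.append((path, "symbols not read"))
        elif starred:
            problems.append((path, "missing debugging information"))
    return problems


if __name__ == "__main__":
    for path, reason in libraries_needing_attention(SAMPLE):
        print(f"{path}: {reason}")
```

If libc appears in the result, installing its debug symbols would be the first thing to try before running heap again.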

jqguo367 commented 1 year ago

Does the heap command support 32-bit binaries? The core file I am analysing is from a 32-bit process.

yanqi27 commented 1 year ago

@Celthi Yeah, we should enhance the error message. @jqguo367 Older versions of core analyzer support 32-bit targets.
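
One observation that fits the 32-bit question: in the first heap output of this report, the arena addresses fit in 32 bits, while the region bounds printed for the main arena do not. The Python sketch below just makes that check explicit. It is an illustration, not part of core analyzer, and reading the result as a pointer-width mismatch is an inference that this thread does not confirm.

```python
def fits_in_pointer(address, pointer_bits=32):
    """Return True if `address` is representable in a pointer of the given width."""
    return 0 <= address < (1 << pointer_bits)


# Values copied from the first `heap` output in this report.
reported = {
    "heap_info": 0xEC000000,
    "main arena": 0xF73A5420,
    "dynamic arena": 0xEDE00010,
    "region start": 0x55202B2B354499C0,
    "region end": 0x55202B2B354AA9B0,
}

# Names of reported values that cannot be valid addresses in a 32-bit process.
suspicious = [name for name, addr in reported.items() if not fits_in_pointer(addr)]

if __name__ == "__main__":
    for name in suspicious:
        print(f"{name} = {hex(reported[name])} does not fit in 32 bits")
```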

Celthi commented 9 months ago

Closing stale issue.