Open GoogleCodeExporter opened 9 years ago
Thanks for reporting it. Couple questions:
* do you see that hang with earlier versions of gperftools. I'm asking to know
if it's regression introduced recently (which would be easier to track down) or
it's something that happens with older versions.
* Can you post simple program that reproduces the hang?
* if you cannot, please attach gdb to hang process and give us backtraces of
all threads. So that we can see exactly where it hangs.
Original comment by alkondratenko
on 10 Nov 2014 at 7:50
For some reason, I have two pids, and gdb doesn't list the other pid as a
thread... which may be indicative of us doing something strange. For the first
PID:
(gdb) info thread
Id Target Id Frame
* 1 Thread 0xb4e16730 (LWP 4413) 0xb7754430 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7754430 in __kernel_vsyscall ()
#1 0xb7097f5b in read () from /lib/tls/i686/cmov/libpthread.so.0
#2 0xb7716d33 in read (__nbytes=<optimized out>, __buf=<optimized out>,
__fd=<optimized out>) at /usr/include/bits/unistd.h:45
#3 SymbolTable::Symbolize (this=0xbfe8a444) at src/symbolize.cc:249
#4 0xb771dcca in HeapProfileTable::Snapshot::ReportLeaks (this=0xae528360,
checker_name=0x121265b8 "_main_",
filename=0xae52c7f0 "/tmp/ampm_pc_debug.4413._main_-end.heap", should_symbolize=true) at src/heap-profile-table.cc:587
#5 0xb770cd8b in HeapLeakChecker::DoNoLeaks (this=0x121235c0,
should_symbolize=HeapLeakChecker::SYMBOLIZE) at src/heap-checker.cc:1857
#6 0xb770d27c in HeapLeakChecker::NoGlobalLeaksMaybeSymbolize
(should_symbolize=HeapLeakChecker::SYMBOLIZE) at src/heap-checker.cc:2147
#7 0xb770d367 in HeapLeakChecker::DoMainHeapCheck () at
src/heap-checker.cc:2169
#8 0xb770d535 in HeapLeakChecker_AfterDestructors () at
src/heap-checker.cc:2317
#9 0xb771017c in HeapLeakCheckerGlobalPrePost::~HeapLeakCheckerGlobalPrePost
(this=0xb7736ab0, __in_chrg=<optimized out>) at src/heap-checker-bcad.cc:82
#10 0xb6b87318 in __cxa_finalize () from /lib/tls/i686/cmov/libc.so.6
#11 0xb7702024 in __do_global_dtors_aux () from /usr/local/lib/libtcmalloc.so
#12 0xb772b760 in _fini () from /usr/local/lib/libtcmalloc.so
#13 0xb7763266 in ?? () from /lib/ld-linux.so.2
#14 0xb6b86f4f in ?? () from /lib/tls/i686/cmov/libc.so.6
#15 0xb6b86fbf in exit () from /lib/tls/i686/cmov/libc.so.6
#16 0x0805b3ef in jump_ship(<anonymous enum>, bool)
(reset_or_powerdown=PACKET_OUT_SYSTEM_POWERDOWN, is_host_kb_exit=true) at
main.cpp:96
#17 0x0805b65d in main (argc=1, argv=0xbfe8b0c4, envp=0x12176000) at
main.cpp:222
For the second PID:
(gdb) info thread
Id Target Id Frame
* 1 Thread 0xb4e16730 (LWP 4424) 0xb77254df in sys_futex (t=<optimized out>,
v=<optimized out>, o=<optimized out>, a=<optimized out>)
at ./src/base/linux_syscall_support.h:2097
(gdb) bt
#0 0xb77254df in sys_futex (t=<optimized out>, v=<optimized out>, o=<optimized
out>, a=<optimized out>) at ./src/base/linux_syscall_support.h:2097
#1 base::internal::SpinLockDelay (w=0xb773e2a0, value=2, loop=2088) at
./src/base/spinlock_linux-inl.h:88
#2 0xb77251e0 in SpinLock::SlowLock (this=0xb773e2a0) at
src/base/spinlock.cc:133
#3 0xb7713819 in Lock (this=<optimized out>) at src/base/spinlock.h:71
#4 SpinLockHolder (this=<optimized out>, l=<optimized out>) at
src/base/spinlock.h:136
#5 tcmalloc::CentralFreeList::InsertRange (this=0xb773e2a0, start=0x1216b950,
end=0x1216ad20, N=106) at src/central_freelist.cc:232
#6 0xb7717730 in tcmalloc::ThreadCache::ReleaseToCentralCache
(this=0x113a38e4, src=0x113a3930, cl=4, N=2) at src/thread_cache.cc:236
#7 0xb77177a0 in tcmalloc::ThreadCache::ListTooLong (this=0x113a38e4,
list=0x113a3930, cl=4) at src/thread_cache.cc:197
#8 0xb7729c2f in Deallocate (cl=<optimized out>, ptr=<optimized out>,
this=<optimized out>) at src/thread_cache.h:390
#9 do_free_helper (heap_must_be_valid=<optimized out>, heap=<optimized out>,
invalid_free_fn=<optimized out>, ptr=<optimized out>) at src/tcmalloc.cc:1195
#10 do_free_with_callback (invalid_free_fn=<optimized out>, ptr=<optimized
out>) at src/tcmalloc.cc:1225
#11 do_free (ptr=<optimized out>) at src/tcmalloc.cc:1234
#12 tc_free (ptr=0x1216b950) at src/tcmalloc.cc:1585
#13 0xb7463e39 in ?? () from /usr/lib/nvidia-current/libGL.so.1
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
Original comment by bryce.schober
on 11 Nov 2014 at 1:18
I tried gpertools 2.2.1 and 2.1 and had the same behavior. I tried strict and
draconian modes, and that made no difference.
Then out of curiousity, I tried some apps other than mine. Since I notice the
opengl library in the backtrace, I tried glxinfo, and it had the same behavior:
$ env LD_PRELOAD=/usr/local/lib/libtcmalloc.so PPROF_PATH=/usr/local/bin/pprof
HEAPCHECK=normal glxinfo
WARNING: Perftools heap leak checker is active -- Performance may suffer
name of display: :0.0
display: :0 screen: 0
direct rendering: Yes
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
server glx extensions:
GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig,
GLX_SGIX_pbuffer, GLX_SGI_video_sync, GLX_SGI_swap_control,
GLX_EXT_swap_control, GLX_EXT_swap_control_tear,
GLX_EXT_texture_from_pixmap, GLX_ARB_create_context,
GLX_ARB_create_context_profile, GLX_EXT_create_context_es_profile,
GLX_EXT_create_context_es2_profile, GLX_ARB_create_context_robustness,
GLX_ARB_multisample, GLX_NV_float_buffer, GLX_ARB_fbconfig_float,
GLX_EXT_framebuffer_sRGB, GLX_NV_multisample_coverage
client glx vendor string: NVIDIA Corporation
client glx version string: 1.4
client glx extensions:
GLX_ARB_get_proc_address, GLX_ARB_multisample, GLX_EXT_visual_info,
GLX_EXT_visual_rating, GLX_EXT_import_context, GLX_SGI_video_sync,
GLX_NV_swap_group, GLX_NV_video_out, GLX_SGIX_fbconfig, GLX_SGIX_pbuffer,
GLX_SGI_swap_control, GLX_EXT_swap_control, GLX_EXT_swap_control_tear,
GLX_ARB_create_context, GLX_ARB_create_context_profile,
GLX_NV_float_buffer, GLX_ARB_fbconfig_float,
GLX_EXT_fbconfig_packed_float, GLX_EXT_texture_from_pixmap,
GLX_EXT_framebuffer_sRGB, GLX_NV_present_video, GLX_NV_copy_image,
GLX_NV_multisample_coverage, GLX_NV_video_capture,
GLX_EXT_create_context_es2_profile, GLX_ARB_create_context_robustness
GLX version: 1.4
GLX extensions:
GLX_EXT_visual_info, GLX_EXT_visual_rating, GLX_SGIX_fbconfig,
GLX_SGIX_pbuffer, GLX_SGI_video_sync, GLX_SGI_swap_control,
GLX_EXT_swap_control, GLX_EXT_swap_control_tear,
GLX_EXT_texture_from_pixmap, GLX_ARB_create_context,
GLX_ARB_create_context_profile, GLX_EXT_create_context_es2_profile,
GLX_ARB_create_context_robustness, GLX_ARB_multisample,
GLX_NV_float_buffer, GLX_ARB_fbconfig_float, GLX_EXT_framebuffer_sRGB,
GLX_NV_multisample_coverage, GLX_ARB_get_proc_address
OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: GeForce GT 430/PCIe/SSE2
OpenGL version string: 4.2.0 NVIDIA 304.84
OpenGL shading language version string: 4.20 NVIDIA via Cg compiler
<snipped all the rest of the glxinfo output>
Have memory regions w/o callers: might report false leaks
Leak check _main_ detected leaks of 120 bytes in 1 objects
The 1 largest leaks:
<spinning forever>
I see a similar situation with multiple PIDs for glxinfo:
$ ps --forest -g 3240
PID TTY TIME CMD
3240 pts/0 00:00:12 bash
4493 pts/0 00:00:00 \_ glxinfo
4495 pts/0 00:00:00 \_ glxinfo
The backtrace for PID 4493:
(gdb) info thread
Id Target Id Frame
* 1 Thread 0xb54cf700 (LWP 4493) 0xb7740430 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7740430 in __kernel_vsyscall ()
#1 0xb748ef33 in __read_nocancel () from /lib/tls/i686/cmov/libpthread.so.0
#2 0xb7702d33 in read (__nbytes=<optimized out>, __buf=<optimized out>,
__fd=<optimized out>) at /usr/include/bits/unistd.h:45
#3 SymbolTable::Symbolize (this=0xbfc39e44) at src/symbolize.cc:249
#4 0xb7709cca in HeapProfileTable::Snapshot::ReportLeaks (this=0xb5352180,
checker_name=0x94d0030 "_main_", filename=0xb5356610
"/tmp/glxinfo.4493._main_-end.heap",
should_symbolize=true) at src/heap-profile-table.cc:587
#5 0xb76f8d8b in HeapLeakChecker::DoNoLeaks (this=0x94cc2c0,
should_symbolize=HeapLeakChecker::SYMBOLIZE) at src/heap-checker.cc:1857
#6 0xb76f927c in HeapLeakChecker::NoGlobalLeaksMaybeSymbolize
(should_symbolize=HeapLeakChecker::SYMBOLIZE) at src/heap-checker.cc:2147
#7 0xb76f9367 in HeapLeakChecker::DoMainHeapCheck () at
src/heap-checker.cc:2169
#8 0xb76f9535 in HeapLeakChecker_AfterDestructors () at
src/heap-checker.cc:2317
#9 0xb76fc17c in HeapLeakCheckerGlobalPrePost::~HeapLeakCheckerGlobalPrePost
(this=0xb7722ab0, __in_chrg=<optimized out>) at src/heap-checker-bcad.cc:82
#10 0xb734f318 in __cxa_finalize () from /lib/tls/i686/cmov/libc.so.6
#11 0xb76ee024 in __do_global_dtors_aux () from /usr/local/lib/libtcmalloc.so
#12 0xb7717760 in _fini () from /usr/local/lib/libtcmalloc.so
#13 0xb774f266 in ?? () from /lib/ld-linux.so.2
#14 0xb734ef4f in ?? () from /lib/tls/i686/cmov/libc.so.6
#15 0xb734efbf in exit () from /lib/tls/i686/cmov/libc.so.6
#16 0xb7336bfe in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#17 0x08048e11 in ?? ()
Backtrace stopped: Not enough registers or memory available to unwind further
The backtrace for PID 4495:
(gdb) info thread
Id Target Id Frame
* 1 Thread 0xb54cf700 (LWP 4495) 0xb77114df in sys_futex (t=<optimized out>,
v=<optimized out>, o=<optimized out>, a=<optimized out>)
at ./src/base/linux_syscall_support.h:2097
(gdb) bt
#0 0xb77114df in sys_futex (t=<optimized out>, v=<optimized out>, o=<optimized
out>, a=<optimized out>) at ./src/base/linux_syscall_support.h:2097
#1 base::internal::SpinLockDelay (w=0xb772a520, value=2, loop=4624) at
./src/base/spinlock_linux-inl.h:88
#2 0xb77111e0 in SpinLock::SlowLock (this=0xb772a520) at
src/base/spinlock.cc:133
#3 0xb76ff819 in Lock (this=<optimized out>) at src/base/spinlock.h:71
#4 SpinLockHolder (this=<optimized out>, l=<optimized out>) at
src/base/spinlock.h:136
#5 tcmalloc::CentralFreeList::InsertRange (this=0xb772a520, start=0x94d8800,
end=0x94d8f80, N=13) at src/central_freelist.cc:232
#6 0xb7703730 in tcmalloc::ThreadCache::ReleaseToCentralCache (this=0x8d4d8e4,
src=0x8d4d93c, cl=5, N=2) at src/thread_cache.cc:236
#7 0xb77037a0 in tcmalloc::ThreadCache::ListTooLong (this=0x8d4d8e4,
list=0x8d4d93c, cl=5) at src/thread_cache.cc:197
#8 0xb7715c2f in Deallocate (cl=<optimized out>, ptr=<optimized out>,
this=<optimized out>) at src/thread_cache.h:390
#9 do_free_helper (heap_must_be_valid=<optimized out>, heap=<optimized out>,
invalid_free_fn=<optimized out>, ptr=<optimized out>) at src/tcmalloc.cc:1195
#10 do_free_with_callback (invalid_free_fn=<optimized out>, ptr=<optimized
out>) at src/tcmalloc.cc:1225
#11 do_free (ptr=<optimized out>) at src/tcmalloc.cc:1234
#12 tc_free (ptr=0x94d8800) at src/tcmalloc.cc:1585
#13 0xb7664e39 in ?? () from /usr/lib/nvidia-current/libGL.so.1
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
I also tried a non-opengl app and instead of locking up on termination, it
ended with this:
WARNING: Perftools heap leak checker is active -- Performance may suffer
<app output snipped>
Have memory regions w/o callers: might report false leaks
Leak check _main_ detected leaks of 104 bytes in 11 objects
The 3 largest leaks:
Using remote profile at netstat.
Use of uninitialized value $line in substitution (s///) at /usr/local/bin/pprof
line 3214.
http://netstat/pprof/symbol doesn't exist
Leak of 60 bytes in 5 objects allocated from:
@ 80505d6
@ 8049fd1
@ 804a72d
@ 804decc
@ b75c1bf6
@ 8049811
<etc...>
What caught my eye was the remote profile message and wierd http URL, what's
going on there?
Original comment by bryce.schober
on 11 Nov 2014 at 1:38
I was unable to reproduce your problem.
But I there's probably something with pprof. Symbolize function which is seen
in one of stack traces is spawning pprof to convert addresses to function
names. And given that it's stuck in read it appears that pprof might be still
alive.
In order to diagnose I propose you to report output of ps aux --forest here.
Also maybe it might make sense try linking your app to libtcmalloc.so instead
of LD_PRELOAD-ing it.
Original comment by alkondratenko
on 29 Nov 2014 at 7:16
Original issue reported on code.google.com by
bryce.schober
on 10 Nov 2014 at 7:33