Open GoogleCodeExporter opened 9 years ago
It looks like I spoke too soon. It appears that the library doesn't have
symbols, which would explain why the cpu profiler was unable to find them.
$ ldd a.out | grep gomp
libgomp.so.1 => /usr/lib64/libgomp.so.1 (0x00007f674c22f000)
$ nm /usr/lib64/libgomp.so.1
nm: /usr/lib64/libgomp.so.1: no symbols
$ objdump -d /usr/lib64/libgomp.so.1
... <snip> ...
8270: e9 fb 16 00 00 jmpq 9970 <omp_get_num_procs+0x210>
8275: 80 7f 5c 00 cmpb $0x0,0x5c(%rdi)
8279: 48 c7 47 08 00 00 00 movq $0x0,0x8(%rdi)
8280: 00
8281: 0f 84 f0 fe ff ff je 8177 <GOMP_taskwait+0x3d7>
8287: 48 83 c7 60 add $0x60,%rdi
828b: 44 89 e8 mov %r13d,%eax
828e: f0 44 0f b1 3f lock cmpxchg %r15d,(%rdi)
8293: 0f 84 de fe ff ff je 8177 <GOMP_taskwait+0x3d7>
8299: e8 b2 15 00 00 callq 9850 <omp_get_num_procs+0xf0>
829e: e9 d4 fe ff ff jmpq 8177 <GOMP_taskwait+0x3d7>
... <snip> ...
I'd probably hope that if this were encountered, some sort of 'failure to
identify function name' or the function address would be emitted, but the lack
of symbols does not appear to the fault of the profiler.
Original comment by awre...@gmail.com
on 11 May 2013 at 3:11
Your test program appears too short to identify any bottlenecks.
Can you please retry something longer ?
Original comment by alkondratenko
on 6 Jul 2013 at 8:20
The test program is intentionally short to demonstrate the issue and has no
bottlenecks.
Please see my previous comment: the issue is that the library doesn't have
symbols.
Original comment by awre...@gmail.com
on 6 Jul 2013 at 9:34
Not at all. nm is not supposed to work on shared libraries. And your test is
too short to provide any relevant profile. Please consider reading how sampling
profiling works.
Running somebody's program at link you provided gives me this:
2944 97.8% 97.8% 2944 97.8% CALC_VALS._omp_fn.0
64 2.1% 100.0% 64 2.1% omp_get_num_procs
1 0.0% 100.0% 1 0.0% INITIALISE
Which clearly shows that majority of time is spent in user program.
omp_get_num_procs indeed seems a bit off. But it appears that there's not much
we can do about that in gperftools.
Original comment by alkondratenko
on 6 Jul 2013 at 11:10
omp_get_num_procs isn't a bit off, it *isn't* used by the program. Other
functions are being misattributed to it:
$ pprof --disas=GOMP_taskwait solver out.prof
ROUTINE ====================== GOMP_taskwait
1 1330 samples (flat, cumulative) 96.2% of total
... <snip> ...
. 16 84ef: callq 9ca0 <omp_get_num_procs+0x540>
. . 84f4: nopl 0x0(%rax)
. . 84f8: mov %fs:0x10(%rbx),%r13
. . 84fd: mov %r12,%rdi
. 695 8500: callq *%rbp
. . 8502: lea 0x80(%r13),%rdi
. 391 8509: callq 9b40 <omp_get_num_procs+0x3e0>
. . 850e: mov %r14,%rdi
. 156 8511: callq 9ca0 <omp_get_num_procs+0x540>
... <snip> ...
Also, gdb can't get the function name for omp either:
https://gist.github.com/awreece/5941661 (it has ?? for the name). I could try
to come up with a better example, but I think the snippet above clearly
demonstrates the problem. Sorry about not using the right flag to nm, nm -D
clearly shows some symbols. Its no longer clear to me the problem is "it
doesn't export symbols" (I'll have to update the blog post later).
For a longer explanation:
http://codearcana.com/posts/2013/05/10/why-is-omp_get_num_procs-so-slow.html
Original comment by awre...@gmail.com
on 7 Jul 2013 at 12:05
Original issue reported on code.google.com by
awre...@gmail.com
on 11 May 2013 at 2:38