Closed GoogleCodeExporter closed 9 years ago
In general, oprofile is better for that kind of analysis, if you're able to run it on
the binary. That said, ITIMER_REAL may work as well. We'll keep this under
consideration.
In the meantime, of course, you're welcome to modify the source code yourself for
your own projects -- maybe you can report back here how well it works!
Original comment by csilv...@gmail.com
on 6 Feb 2009 at 3:14
I talked to an expert on profiling here, and he had the following to say:
---
Assuming multi-threaded and a modern thread system on linux (NPTL), no that
won't work. It *would* work for single-threaded apps. It would also work
for LinuxThreads (but "performance tip #1 for people using LinuxThreads" is,
IMO, "STOP!!!" 8-).
The problem is that under POSIX-compliant threading systems (NPTL is,
LinuxThreads isn't), the interval timers are shared by all the threads in
the process.
This works OK for ITIMER_PROF (and ITIMER_VIRTUAL): you get one tick per
"<interval> CPU seconds consumed" which is exactly what you want. The
thread which causes the timer to run down to 0 gets hit with the signal, and
(over a period of time, unless the threads are doing something Interesting,
e.g., periodic behaviour that syncs up with the profiler's period) the
profiler ticks will be distributed across multiple threads in proportion to
their CPU usage. (Using a per-thread timer on Linux might produce more
accurate results in some cases, but it has some disadvantages too, e.g.,
only supported in recent kernels and won't necessarily account for
short-lived threads).
For ITIMER_REAL, you're getting one tick per "<interval> real-time seconds
consumed". If you have multiple threads, you then have to "distribute" that
tick to the rest of the threads, and collect data from them. This isn't
particularly easy, as there's no standard mechanism to enumerate all threads
in a process. (One might try to use "<interval>/N" where N is the number of
threads instead of <interval>,
hoping for a uniform distribution... but that's just bogus. First, you need
to keep track of N, and second the signal distribution won't be uniform. If
any thread in the process is active, signals like SIGALRM delivered by
ITIMER_REAL will almost certainly be delivered to the running thread.)
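To make the difference between the two timers concrete, here is a minimal single-threaded sketch in Python, whose signal module wraps the same setitimer call. The helper names (count_ticks, idle, busy) and the 10 ms interval are assumptions for illustration, not anything from perftools. Python always runs signal handlers in the main thread, so this only shows what each timer counts, not how ticks are distributed across threads.

```python
import signal
import time


def count_ticks(which, signum, work, interval=0.01):
    """Arm interval timer `which`, run work(), and return how many ticks fired."""
    ticks = 0

    def on_tick(signo, frame):
        nonlocal ticks
        ticks += 1

    old_handler = signal.signal(signum, on_tick)
    signal.setitimer(which, interval, interval)
    try:
        work()
    finally:
        signal.setitimer(which, 0)           # disarm before restoring the handler
        signal.signal(signum, old_handler)
    return ticks


def idle():
    time.sleep(0.3)   # wall time passes, but almost no CPU time is consumed


def busy():
    deadline = time.process_time() + 0.3
    while time.process_time() < deadline:    # spin until ~0.3 CPU seconds are used
        pass


if __name__ == "__main__":
    print("ITIMER_PROF while idle:", count_ticks(signal.ITIMER_PROF, signal.SIGPROF, idle))
    print("ITIMER_REAL while idle:", count_ticks(signal.ITIMER_REAL, signal.SIGALRM, idle))
    print("ITIMER_PROF while busy:", count_ticks(signal.ITIMER_PROF, signal.SIGPROF, busy))
```

The idle run should record next to nothing under ITIMER_PROF and a steady stream of ticks under ITIMER_REAL, which is why only the latter can see time spent blocked.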
Note also that use of ITIMER_REAL/SIGALRM will screw up any other uses of
SIGALRM in the process. (Some library functions use it, some people write
code to use it directly.) (SIGPROF has the same issue... but no library
functions use SIGPROF, and very few people use it for "other stuff" in my
experience.)
---
This last point is a showstopper, I think. Too many apps use SIGALRM for me to be
comfortable using it in a basic library like this (even if not by default).
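The SIGALRM conflict is easy to reproduce. The sketch below is an illustration in Python rather than perftools code, and every function name in it is hypothetical. It shows a naive wall-clock "profiler" displacing both an application's SIGALRM handler and its pending ITIMER_REAL timeout, because a process has only one of each.

```python
import signal


def app_alarm_handler(signo, frame):
    """Stand-in for an application's (or library's) own SIGALRM handler."""


def profiler_tick(signo, frame):
    """Stand-in for a hypothetical wall-clock profiler's tick handler."""


def start_wall_profiler(interval=0.01):
    """Naively claim SIGALRM and ITIMER_REAL; return what was displaced."""
    displaced_handler = signal.signal(signal.SIGALRM, profiler_tick)
    # setitimer returns the previous (delay, interval) of the shared timer.
    displaced_timer = signal.setitimer(signal.ITIMER_REAL, interval, interval)
    return displaced_handler, displaced_timer


def stop_wall_profiler():
    signal.setitimer(signal.ITIMER_REAL, 0)
    signal.signal(signal.SIGALRM, signal.SIG_DFL)


def demo():
    # The application arms a 5-second timeout on the process-wide real-time timer.
    signal.signal(signal.SIGALRM, app_alarm_handler)
    signal.setitimer(signal.ITIMER_REAL, 5.0)
    # The profiler then takes over the same signal and the same timer.
    displaced_handler, (remaining, _interval) = start_wall_profiler()
    stop_wall_profiler()
    return displaced_handler, remaining


if __name__ == "__main__":
    handler, remaining = demo()
    print("displaced the app's handler:", handler is app_alarm_handler)
    print("seconds left on the app's timeout when it was overwritten:", remaining)
```

A library could save and restore these values, but it cannot deliver the application's timeout on schedule while its own timer occupies the slot.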
There are ways around all these problems, and a wall-time profiler is possible, but
it works by spawning a new thread to do the timing, and is basically a totally
separate design from the profiler we have now. In other words: it would be a lot of
work. :-/ I'll keep this in mind, but am lowering the priority in light of the fact
there's not much synergy in doing this as part of the existing perftools
codebase.
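For a sense of what the thread-based design looks like, here is a rough sketch in Python. It is not how perftools would do it: the class WallClockSampler is hypothetical, and it leans on CPython's sys._current_frames() to enumerate threads, a convenience that native code on Linux does not have. A helper thread does the timing with an ordinary timed wait, so no signal or interval timer is involved.

```python
import collections
import sys
import threading
import time


class WallClockSampler:
    """Hypothetical sampler: a helper thread wakes on a wall-clock period and
    records which function every other thread is currently executing."""

    def __init__(self, interval=0.01):
        self.interval = interval
        self.samples = collections.Counter()
        self._done = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        own_id = threading.get_ident()
        # Event.wait doubles as the timer: it returns False on timeout.
        while not self._done.wait(self.interval):
            for thread_id, frame in sys._current_frames().items():
                if thread_id != own_id:
                    self.samples[frame.f_code.co_name] += 1

    def start(self):
        self._thread.start()

    def stop(self):
        self._done.set()
        self._thread.join()


def blocked_on_io():
    time.sleep(0.3)   # uses no CPU, so a CPU-time profiler would not see it


if __name__ == "__main__":
    sampler = WallClockSampler()
    sampler.start()
    blocked_on_io()
    sampler.stop()
    print(sampler.samples.most_common(3))
```

Because the sampler counts threads whether or not they are running, time spent sleeping or blocked shows up in the samples, which is the point of a wall-time profile.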
Original comment by csilv...@gmail.com
on 17 Feb 2009 at 11:05
there's no oprofile for doze ... sniff, sniff
Original comment by rogerpack2005
on 21 Jan 2010 at 5:14
Alas, we're a long time away from supporting a cpu profiler under windows in any
case (I don't think it supports unix-style timer interrupts at all). But we did add
ITIMER_REAL support in perftools 1.5 (or was it 1.4?), so I guess I can close this
bug actually!
Original comment by csilv...@gmail.com
on 3 Feb 2010 at 10:33
Original issue reported on code.google.com by
mohit.a...@gmail.com
on 5 Feb 2009 at 7:40