cuitao2046 / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

Profiling timer always armed on initialization #406

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
The code currently arms the profiling timer unconditionally in 
ProfileHandler::RegisterThread for timer sharing detection. Unfortunately, this 
means the profiling timer will stay active, even if profiling is inactive and 
we have shared timers in case if no other threads get registered later.

This poses a problem for Chromium, which links the profiling code into the 
executable. See http://code.google.com/p/chromium/issues/detail?id=115149

Original issue reported on code.google.com by mniss...@google.com on 21 Feb 2012 at 8:50

GoogleCodeExporter commented 9 years ago
Proposed fix is attached. See http://codereview.chromium.org/9416072 for more 
information.

Original comment by mniss...@google.com on 21 Feb 2012 at 9:07

Attachments:

GoogleCodeExporter commented 9 years ago
From Mattias:

The patch is a bit tricky, here is some background (taken from the chromium 
code review):

The issue with the timer setup code is that older kernels and pthread
implementations (linuxthreads) incorrectly implemented timers per-thread instead
of per-process as mandated by POSIX. Arming timers is much harder in that model,
since the setitimer call must actually be issued by each thread that's supposed
to be profiled.

The current gperftools code would just arm the timers on thread startup, no
matter whether profiling was enabled or not, which obviously isn't an option for
us.

The idea behind my patch is to keep timers disabled initially. In order to get
them armed, the code sends SIGPROF signals to the individual threads via
pthread_kill(), at which point the thread check whether they need to arm their
timers.

In theory, this works just fine, but there is a caveat: In order to send
SIGPROFs, I need the thread identifiers of the registered threads. It's no big
deal to record them when they register. However, a problem arises at thread
termination. In theory, the pthread identifier for the thread can become invalid
when the thread terminates (in case of detached threads) or when the
corresponding pthread_join() call completes (for non-detached threads). However,
the profiling code doesn't have a simple way to learn about thread terminating
and/or thread identifiers becoming invalid. So I have added an UnregisterThread
function that threads should call at termination.

In practice, UnregisterThread is mostly not required, since pthread_kill() will
return ESRCH if it doesn't know the thread. That's assuming that the thread
identifier is still usable at that point though, which pthread doesn't
guarantee. For linuxthreads, that assumption holds. For NPTL, it doesn't hold in
all cases, and I have indeed been able to crash pthread_kill for an exotic
scenario using user-supplied stack memory that I unmap after terminating the
thread. Similarly, for recycled thread stacks the thread identifiers might get
recycled, which can lead to a thread being profiled that did never register for
it (I don't think that's much of an issue though since we assume all threads
should register).

I have tested on a modern NPTL-based system (kernel 2.6.38.8 / eglibc 2.11.1) 
and an older
LinuxThreads-based system (kernel 2.4.27 / glibc-2.3.2). Unit tests are passing,
and the signal handlers seem to fire at the right times.

Note that a much simpler option for solving the problem would be to assume that 
we have a
proper per-process setitimer implementation (that assumption holds for kernels 
2.6.10
and later), in which case we could just delete the timer mode detection code and
not do the signal trick. Not sure what the backwards compatibility policy of 
perftools is.

Let me know if you have questions!

Original comment by chapp...@gmail.com on 2 Mar 2012 at 4:18

GoogleCodeExporter commented 9 years ago
So far the patch sounds reasonable although I can't help but feel that there is 
some level of risk in this one. Also, I would like a bit of clarification since 
I am not overly familiar with the cpu profiler. In my experience with the cpu 
profiler, you just link against the shared library and set CPUPROFILE in your 
environment. This enables global profiling of the entire process. Is this patch 
intended to work in the same way or would this patch only apply to those who 
are manually registering which threads they want to have profiled?

Original comment by chapp...@gmail.com on 3 Mar 2012 at 7:57

GoogleCodeExporter commented 9 years ago
Regarding you wariness about the patch, I do agree that it changes the way of 
doing things quite a bit, so I very much appreciate a proper review :)

Regarding your questions:

Just linking the shared library and setting CPUPROFILE is sufficient on 
POSIX-compliant pthread implementations (i.e. those that have per-process 
timers), as a random thread will receive the timer signal. On 
linuxthreads-based systems, it has always been the case that a 
ProfilerRegisterThread() call was necessary to get profiling signals on that 
thread, both with the old and the new code (see the unit test).

The patch is intended to keep the semantics of the existing code, but improving 
it to only arm the timer(s) in case they're actually in use (i.e. a profiling 
callback is installed). The only difference is that in the new model, you 
should call ProfilerUnregisterThread() on thread shutdown if you've previously 
called ProfilerRegisterThread() on thread startup. If you don't, bad things may 
happen, but as explained previously that only happens under very rare 
circumstances.

I also just noticed that I hadn't properly exposed ProfilerUnregisterThread, 
which is fixed in the updated patch I've attached.

Original comment by mniss...@google.com on 12 Mar 2012 at 3:21

Attachments:

GoogleCodeExporter commented 9 years ago
Great thanks! I am doing a bunch of cleanup tonight and this should be making 
its way onto the list :)

Original comment by chapp...@gmail.com on 4 May 2012 at 5:30

GoogleCodeExporter commented 9 years ago

Original comment by chapp...@gmail.com on 4 May 2012 at 5:30

GoogleCodeExporter commented 9 years ago

Original comment by chapp...@gmail.com on 4 May 2012 at 5:42

GoogleCodeExporter commented 9 years ago
After applying this patch there is a unit test failure running 'make check'. 
Here is the manual run:

david@hatch:~/gperftools$ ./profile_handler_unittest 
threads have shared timers
Running RegisterUnregisterCallback
Check failed: !(IsSignalEnabled())
Aborted

Can you please take a look at this and update the patch or supply a 
supplemental patch.

Regards,

Dave

Original comment by chapp...@gmail.com on 18 Sep 2012 at 3:48

GoogleCodeExporter commented 9 years ago
I'll eventually get back to this, but probably not this week and I'll be on 
vacation till October 6th.

Random question: Is gperftools meant to support pthread with 
non-POSIX-compliant pthread implementations (i.e. per-thread timers)? If not, 
we can as well just remove timer mode detection and assume timers are global. 
FWIW, this is the conclusion we reached in the google-internal discussion about 
this.

Original comment by mniss...@google.com on 19 Sep 2012 at 9:41

GoogleCodeExporter commented 9 years ago
Any update now that you are back. As for the pthread question I have 
successfully run it on FreeBSD with non-POSIX-compliant pthreads without issue. 

Original comment by chapp...@gmail.com on 28 Oct 2012 at 2:28

GoogleCodeExporter commented 9 years ago
I'm quite busy at the moment, but I'll get back to you once I find a few cycles 
to spend on this.

Original comment by mniss...@google.com on 29 Oct 2012 at 8:03

GoogleCodeExporter commented 9 years ago
Just checking in again. I am looking to get a release out very soon now and 
would like to make sure that this is included.

Original comment by chapp...@gmail.com on 22 Dec 2012 at 8:15

GoogleCodeExporter commented 9 years ago
So, I realize I'm being rather unresponsive and I apologize for that. Given 
that is the case, feel free to time out aggressively on me. I'm still 
interested in getting this sorted eventually though, so I will reply 
eventually, like I do today :)

I'll attach an updated patch that adjusts profile_handler_unittest.cc (not sure 
how I could miss that earlier). I basically ripped out all checks for signal 
handler installation (it's installed unconditionally now) and 
shared-vs-non-shared timers (all the timers should be enabled/disabled 
correctly no matter whether they're shared or not). What I haven't tested is 
whether this works on a system with separate timers (I lost the VM I installed 
back then when moving to a new machine). I think we should collect some test 
results from a machine that has separate timers before shipping this though :)

Following up on the question from comment 9: What are your thoughts on just 
dropping support for separate timers? That would simplify the code 
significantly and obviate the need to send signals to other threads just to 
enable timers. Do you have any indication on whether people are still running 
gperftools on systems with separate timers?

I'll be out for a week starting tonight, so I will be slow to respond again 
(sigh!). But when I'm back I'll hopefully have time to either produce a reduced 
patch that drops support for separate timers entirely or test the current code 
in a VM that runs a really old LinuxThreads glibc (depending on your thoughts).

Original comment by mniss...@google.com on 4 Jan 2013 at 3:52

Attachments:

GoogleCodeExporter commented 9 years ago
Sounds great. Thanks for the updated patch. It will be interesting to see how 
the separate timer support pans out.

Original comment by chapp...@gmail.com on 7 Jan 2013 at 1:29

GoogleCodeExporter commented 9 years ago
Back from vacation...

Regarding your comment: Are you interested in (a) seeing the patch in comment 
13 tested on a separate-timer OS or (b) seeing the alternative patch that 
assumes shared timers and drops support for separate timers entirely?

Original comment by mniss...@google.com on 14 Jan 2013 at 4:40