gperftools / gperftools

Main gperftools repository
BSD 3-Clause "New" or "Revised" License
8.44k stars 1.5k forks source link

profile filename doesn't include pid if child doesn't call exec() #601

Open alk opened 9 years ago

alk commented 9 years ago

Originally reported on Google Code with ID 598

What steps will reproduce the problem?
Our application has a parent process that forks several children, but some of these
children don't call exec().

What is the expected output? What do you see instead?
When the application exits, the children that didn't call exec all get dumped to the
same file overwriting each other:

Dumping heap profile to normal.0001.heap (Exiting)
Dumping heap profile to normal.0001.heap (Exiting)
Dumping heap profile to normal.0001.heap (Exiting)

What version of the product are you using? On what operating system?
2.1 on linux.

Please provide any additional information below.

The problem is that GetUniquePathFromEnv() doesn't seem to be called if the child process
doesn't call exec().  So why not just always include the pid in the filename?   This
patch works for me:

--- gperftools-2.1.orig/src/base/sysinfo.cc     2013-07-30 10:12:11.000000000 +0100
+++ gperftools-2.1/src/base/sysinfo.cc  2014-01-13 17:42:53.634101186 +0000
@@ -192,13 +192,7 @@
   char* envval = getenv(env_name);
   if (envval == NULL || *envval == '\0')
     return false;
-  if (envval[0] & 128) {                  // high bit is set
-    snprintf(path, PATH_MAX, "%c%s_%u",   // add pid and clear high bit
-             envval[0] & 127, envval+1, (unsigned int)(getpid()));
-  } else {
     snprintf(path, PATH_MAX, "%s", envval);
-    envval[0] |= 128;                     // set high bit for kids to see
-  }
   return true;
 }

diff -wur gperftools-2.1.orig/src/heap-profiler.cc gperftools-2.1/src/heap-profiler.cc
--- gperftools-2.1.orig/src/heap-profiler.cc    2013-07-30 10:12:11.000000000 +0100
+++ gperftools-2.1/src/heap-profiler.cc 2014-01-14 12:40:14.886714613 +0000
@@ -228,8 +228,8 @@
   // Make file name
   char file_name[1000];
   dump_count++;
-  snprintf(file_name, sizeof(file_name), "%s.%04d%s",
-           filename_prefix, dump_count, HeapProfileTable::kFileExt);
+  snprintf(file_name, sizeof(file_name), "%s_%u.%04d%s",
+           filename_prefix, (unsigned int)getpid(), dump_count, HeapProfileTable::kFileExt);

   // Dump the profile
   RAW_VLOG(0, "Dumping heap profile to %s (%s)", file_name, reason);

That gives the following output:

Dumping heap profile to normal_7880.0001.heap (Exiting)
Dumping heap profile to normal_7879.0001.heap (Exiting)
Dumping heap profile to normal_7874.0001.heap (Exiting)

Alternatively pthread_atfork could be used? 

Reported by ldarby@brocade.com on 2014-01-14 12:47:57

alk commented 9 years ago
(This was profiling an existing executable that wasn't built with tcmalloc by using
LD_PRELOAD, so changing the behaviour of the child after the fork wasn't an option.
(I work with the reporter.))

Reported by alan.braggins on 2014-01-14 15:17:29

alk commented 9 years ago
Good idea. But I believe patch requires a bit more work. Particularly GetUniquePathFromEnv
is also used by libprofiler and your change breaks old behavior but does not add new
behavior.

How about moving gitpid logic into that function and making it optional based on other
environment variable (like GPERFTOOLS_INCLUDE_PID or maybe GPERFTOOLS_DONT_INCLUDE_PID
if default is to include pids).

Additionally, I'm not sure if we need to keep old behavior by default or not. I.e.
some folks might have scripts that rely on old names without pids at the end.

Reported by alkondratenko on 2014-01-18 21:45:27

alk commented 9 years ago
This may be the proper fix:

--- gperftools-2.1.orig/src/heap-profiler.cc    2013-07-30 10:12:11.000000000 +0100
+++ gperftools-2.1/src/heap-profiler.cc 2014-01-24 19:01:34.337479000 +0000
@@ -228,6 +228,18 @@
   // Make file name
   char file_name[1000];
   dump_count++;
+  
+  char* envval_ = getenv("HEAPPROFILE");
+  if (envval_[0] & 128) {
+         //High bit is set, so we're a child, so re-inititialise
+         //filename_prefix with the pid.  See
+         //https://code.google.com/p/gperftools/issues/detail?id=598
+
+         //TODO: This only needs to be called if the child hasn't called
+         //exec().
+         GetUniquePathFromEnv("HEAPPROFILE", filename_prefix);
+  } 
+ 
   snprintf(file_name, sizeof(file_name), "%s.%04d%s",
            filename_prefix, dump_count, HeapProfileTable::kFileExt);

That should result in no behaviour change, and only adding the pid to child processes
that didn't have them before.  However, I'm having trouble testing it because I'm running
into an issue where my shell's HEAPPROFILE keeps getting its high bit set (i.e. HEAPPROFILE=/logs/normal
becomes HEAPPROFILE=¯logs/normal, i.e. the first / turned into a non-ascii char), i.e.
my shell now looks like the parent, so the parent process of the application now gets
the pid in the filename. I can't see why that's happening because my shell wasn't started
with LD_PRELOAD=libtcmalloc.

Well, that's as far as I've got, sorry but I don't think I'll be able to find any more
time to look at this.

Reported by ldarby@brocade.com on 2014-01-24 19:07:51

alk commented 9 years ago
Unfortunately because of this bug, heap profiling with tcmalloc unusable for me. I would
propose appending pid by default as is valgrind doing.

Reported by marko@kevac.org on 2015-08-07 19:26:01

gklingler commented 9 years ago

Want to note that the patch from ldarby crashes if HEAPPROFILE is not set in the env. Why not just:

diff --git a/src/heap-profiler.cc b/src/heap-profiler.cc
index 17d8697..86e64d7 100755
--- a/src/heap-profiler.cc
+++ b/src/heap-profiler.cc
@@ -231,8 +231,8 @@ static void DumpProfileLocked(const char* reason) {
   // Make file name
   char file_name[1000];
   dump_count++;
-  snprintf(file_name, sizeof(file_name), "%s.%04d%s",
-           filename_prefix, dump_count, HeapProfileTable::kFileExt);
+  snprintf(file_name, sizeof(file_name), "%s-%05d.%04d%s", 
+          filename_prefix, getpid(), dump_count, HeapProfileTable::kFileExt);

   // Dump the profile
   RAW_VLOG(0, "Dumping heap profile to %s (%s)", file_name, reason);
alk commented 8 years ago

The concern with always appending pid is it might break some folk's scripts.