niuys / gperftools

Automatically exported from code.google.com/p/gperftools
BSD 3-Clause "New" or "Revised" License
0 stars 1 forks source link

A child process may hang when a parent process calls fork(). #175

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1.A parent process is multi-threaded.
2. One of parent thread calls java_lang_UNIXProcess_forkAndExec, while
other thread is calling malloc().
3. The child process hangs since the lock is copied over.

What is the expected output? What do you see instead?
The child process hangs.

What version of the product are you using? On what operating system?
1.2 on Linux RHEL4

Please provide any additional information below.
This happens because java_lang_UNIXProcess_forkAndExec calls opendir()
between fork() and exec(). The lock should be acquired by the calling 
thread before for(), and be released after fork() by both parent and child.

Original issue reported on code.google.com by hankyup...@gmail.com on 5 Oct 2009 at 6:04

GoogleCodeExporter commented 9 years ago
} 3. The child process hangs since the lock is copied over.

Can you clarify what lock you're talking about?  I take it it's the lock called 
by
malloc()?

I don't know much about the semantics of lock inheritence around forking.  My
understanding of how to make fork safe to use is that between fork and exec you 
can't
a) call any non-reentrant library functions, b) call a function that may 
acquire a
lock.  That's pretty restrictive.  I'm not surprised doing something like 
opendir()
there may cause problems.  In general, I do know that forking is super-hard to 
get
right, so it's not surprising to me that there should be problems like this.

I'm not sure I understand your proposed solution though. It's the client app 
that's
calling fork().  How is it supposed to acquire the proper locks, and then 
release
them?  Are you suggesting we provide an API to do that?  It seems pretty 
complicated,
if so.  Or maybe you had another proposal?

Original comment by csilv...@gmail.com on 13 Oct 2009 at 9:20

GoogleCodeExporter commented 9 years ago
When tcmalloc tries to get a memory chunk from system memory, it acquires a 
(global)
lock so that it can exclusively access system memory. 
Let's assume there are thread A and B in the parent process, and thread A calls
java_lang_UNIXProcess_forkAndExec while thread B was holding this lock. Inside 
of
java_lang_UNIXProcess_forkAndExec, the child process will get a clone of the 
parent
thread (A). Since thread B was holding the lock, this state is also inherited 
to the
child process. When opendir() is called inside of
java_lang_UNIXProcess_forkAndExec(), the child process gets stuck since there is
nobody that will release the lock in the child process. This is not a problem 
for the
parent process since thread B will release it eventually for the parent process.

In the pthread, there is an api called pthread_atfork which provides handles
before/after fork(). My solution was to call this function with proper handlers 
when
tcmalloc is initialized. It seems to be working for linux system but I am not 
sure if
it can be generalized for other systems.

Original comment by hankyup...@gmail.com on 14 Oct 2009 at 6:16

GoogleCodeExporter commented 9 years ago
Ah, I see.  I'm worried, though, that this lock isn't the only one we need to 
keep
track of.  tcmalloc has a lot of locks, including a pageheap lock, a new-handler
lock, heap-checker locks, and at least one profiler lock.  The debug allocator 
adds a
few more, and more may be added in the future.  It seems like each one of these 
could
cause a problem in the situation you describe: thread A holds the lock while 
thread B
forks.  I guess we'll need to keep track of all of them?

I haven't thought a lot about the fork-safety of tcmalloc.  I wonder if there is
anything other than spinlocks that we need to worry about.

It sounds like you have a patch that solves the problem for you?  Do you mind
attaching it to this bug report?

Original comment by csilv...@gmail.com on 14 Oct 2009 at 2:41

GoogleCodeExporter commented 9 years ago
I talked to one of the experts here on forking, and he pointed out that it's 
actually
correct for both parent and child to have the fork: the system memory is a 
shared
resource, and they can't both be accessing it at the same time.  The solution 
you use
with pthread_atfork doesn't seem safe.

He had this to say in general:
---
Traditionally, POSIX only ever allowed exec() to be called after fork().
Calling any other function after fork() but before exec() was undefined.

Over time, users discovered that on most systems there were a small number
of functions that they could still call in between fork() and exec() despite
POSIX warning them not to do so. I don't think this list of functions has
ever been formally specified, but it is generally believed that any
async-signal-safe function can be called at this time. And I expect most
runtime environments (including glibc) to try hard to actually guarantee
this behavior.

Please note that opendir() and malloc() are most definitely not
async-signal-safe and still cannot be called safely at this time.

For single-threaded applications, you can sometimes get away with calling
more complex functions after fork(), but this has always been error prone
and would often lead to subtle bugs.

A long time later, POSIX grew a threading API. And it is very clearly an
afterthought. All sorts of things don't quite work correctly once more than
one thread exists in the program. Most notably, calling fork() in a
multi-threaded application is poorly defined and surprisingly difficult to
get right.

Glibc tries hard to make this work. And POSIX made an attempt to fix it by
defining pthread_at_fork() handlers. But all of these are bandaid solutions
that don't work well practice. Apart from the problem with locks, there are
a whole slew of other subtle issues with shared resources that a very
difficult to fix and cause all sorts of very subtle bugs. A particularly
nasty problem are all the resources that are shared between threads and
inherited by child processes (e.g. guaranteeing that file descriptors get
passed correctly in all cases is quite tricky). The upshot is:

 - don't fork() after creating threads. It doesn't work well. If you know
that your application needs to fork(), then create a helper process as the
very first thing in main(). This helper process should stay single-threaded
and can thus fork() more easily.

 - if you cannot use a helper process, then stick to letter of the POSIX
specification. Do not make any function calls other than exec() after
calling fork(). In a multi-threaded application, even the async-signal-safe
functions cannot reliably be called after fork()'ing.

 - alternatively, have a look at posix_spawn(). It's API is somewhat
restricted, but it is often a thread-safe alternative to fork() and exec().
I have no idea how good the implementation in glibc is, though.
---

I don't know if you can do any of these for this java_lang_* call, but I think 
your
code will be safer in general if you can rewrite it so it doesn't do any work 
between
the fork and exec.

I'm going to close this WillNotFix.  I think it would be dangerous to do any
unlocking across forks.

Original comment by csilv...@gmail.com on 14 Oct 2009 at 6:44

GoogleCodeExporter commented 9 years ago
What I was dealing with was a (global) lock from tcmalloc, not a 
system-provided lock
to get a memory from system. I think that the latter is already handled by OS.

The problem is that I don't have a control over java_lang_* methods which is a 
part
of JDK. And this method is called indirectly from my application via an 
underneath
library. So I don't have any control on timing, either.

I am attaching my patch, anyway.

diff -crBN ../google-perftools-1.2/Makefile.am google-perftools-1.2/Makefile.am
*** ../google-perftools-1.2/Makefile.am Fri Apr 17 19:37:09 2009
--- google-perftools-1.2/Makefile.am    Fri Oct  2 04:45:18 2009
***************
*** 309,314 ****
--- 309,315 ----
                                src/page_heap_allocator.h \
                                src/span.h \
                                src/static_vars.h \
+                               src/fork_handler.h \
                                src/thread_cache.h \
                                src/stack_trace_table.h \
                                src/base/thread_annotations.h \
***************
*** 338,343 ****
--- 339,345 ----
                                            src/span.cc \
                                            src/stack_trace_table.cc \
                                            src/static_vars.cc \
+                                           src/fork_handler.cc \
                                            src/thread_cache.cc \
                                            src/malloc_hook.cc \
                                            src/malloc_extension.cc \
diff -crBN ../google-perftools-1.2/src/fork_handler.cc
google-perftools-1.2/src/fork_handler.cc
*** ../google-perftools-1.2/src/fork_handler.cc Thu Jan  1 00:00:00 1970
--- google-perftools-1.2/src/fork_handler.cc    Fri Oct  2 18:51:04 2009
***************
*** 0 ****
--- 1,24 ----
+ #include "fork_handler.h"
+ #include <pthread.h>
+ 
+ namespace tcmalloc {
+ 
+ void prepare()
+ {
+   Static::pageheap_lock()->Lock();
+ }
+ void fork_parent()
+ {
+   Static::pageheap_lock()->Unlock();
+ }
+ void fork_child()
+ {
+   Static::pageheap_lock()->Unlock();
+ }
+ 
+ void Init_forkHandler()
+ {
+   pthread_atfork(prepare, fork_parent, fork_child);
+ }
+ 
+ }
diff -crBN ../google-perftools-1.2/src/fork_handler.h
google-perftools-1.2/src/fork_handler.h
*** ../google-perftools-1.2/src/fork_handler.h  Thu Jan  1 00:00:00 1970
--- google-perftools-1.2/src/fork_handler.h Fri Oct  2 04:45:12 2009
***************
*** 0 ****
--- 1,12 ----
+ #ifndef TCMALLOC_FORK_HANDLERS_H_
+ #define TCMALLOC_FORK_HANDLERS_H_
+ 
+ #include "static_vars.h"
+ 
+ namespace tcmalloc {
+ 
+ void Init_forkHandler();
+ 
+ }
+ 
+ #endif
diff -crBN ../google-perftools-1.2/src/static_vars.cc
google-perftools-1.2/src/static_vars.cc
*** ../google-perftools-1.2/src/static_vars.cc  Thu Feb 26 19:16:18 2009
--- google-perftools-1.2/src/static_vars.cc Fri Oct  2 04:46:11 2009
***************
*** 32,37 ****
--- 32,38 ----

  #include "static_vars.h"
  #include "sampler.h"  // for the init function
+ #include "fork_handler.h"

  namespace tcmalloc {

***************
*** 60,65 ****
--- 61,67 ----
    new ((void*)pageheap_memory_) PageHeap;
    DLL_Init(&sampled_objects_);
    Sampler::InitStatics();
+   Init_forkHandler();
  }

  }  // namespace tcmalloc

Original comment by hankyup...@gmail.com on 14 Oct 2009 at 7:29