Open ronawho opened 5 years ago
Looking at the implementation of ibv_fork_init(), what it does is use madvise(MADV_DONTFORK) on the registered memory regions. This is one cause of the additional overhead, along with the overhead of keeping track of the memory regions (they appear to implement a red/black tree for this). MADV_DONTFORK prevents the child from accessing the region after the fork. As noted by @gbtitus in Cray/chapel-private#2019, this won't work for Chapel because the thread stacks are in the heap, which is in registered memory regions, so the threads of the child process won't be able to access their own stacks.
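To make the MADV_DONTFORK behavior concrete, here is a minimal sketch in Python (Linux only, Python 3.8+). It does not use libibverbs at all: a single anonymous page stands in for a registered memory region, and the helper name child_can_read is made up for this illustration. It only shows the general madvise effect described above, namely that a child touching such a region is expected to die with SIGSEGV.

```python
import mmap
import os


def child_can_read(dontfork: bool) -> bool:
    """Fork and report whether the child could read the parent's mapping."""
    # One private anonymous page, standing in for a registered region.
    region = mmap.mmap(-1, mmap.PAGESIZE,
                       flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    region[0] = 42
    if dontfork:
        # The same advice the comment above attributes to ibv_fork_init().
        region.madvise(mmap.MADV_DONTFORK)

    pid = os.fork()
    if pid == 0:
        # Child: with MADV_DONTFORK the page is not mapped here, so this
        # read is expected to kill the child with SIGSEGV.
        value = region[0]
        os._exit(0 if value == 42 else 1)

    # Parent: a clean exit with status 0 means the child read the page.
    _, status = os.waitpid(pid, 0)
    region.close()
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    print("plain mapping readable in child:", child_can_read(False))
    print("MADV_DONTFORK mapping readable in child:", child_can_read(True))
```

If a thread's stack lived in a page like this, the child would fault as soon as it touched its own stack, which is the Chapel problem described above.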
tl;dr -- the spawn module doesn't work correctly under gasnet-ibv. You can work around this by setting IBV_FORK_SAFE=1, but there may be performance implications from doing so.

Likely similar to https://github.com/chapel-lang/chapel/issues/7550, Spawn doesn't work correctly under gasnet-ibv. We originally noticed this with CoMD using Spawn to call uname in https://github.com/Cray/chapel-private/issues/311. Here's a simple reproducer from that.

It looks like by default fork() (and things that call fork) don't work under ibv. There is an ibv_fork_init() that "initializes libibverbs's data structures to handle fork() function calls correctly and avoid data corruption". That can have performance implications as "Calling ibv_fork_init() will reduce performance due to an extra system call for every memory registration, and the additional memory allocated to track memory regions. The precise performance impact depends on the workload and usually will not be significant."

I'll investigate the performance overhead when I have time. If it seems small we may want to enable that by default. For users requiring a workaround "Setting the environment variable RDMAV_FORK_SAFE or IBV_FORK_SAFE has the same effect as calling ibv_fork_init()."
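For anyone who wants to try the environment-variable workaround from a script, a rough sketch in Python follows. The helper run_fork_safe is hypothetical, and the command it runs here is only a stand-in that echoes the variable; substitute your own launch command. Whether the variable actually reaches every process of a multi-node job depends on the launcher, which this sketch does not address.

```python
import os
import subprocess
import sys


def run_fork_safe(cmd):
    """Run cmd with IBV_FORK_SAFE=1 added to a copy of the environment."""
    env = dict(os.environ)
    env["IBV_FORK_SAFE"] = "1"
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in child that only prints the variable it was given.
    probe = [sys.executable, "-c",
             "import os; print(os.environ.get('IBV_FORK_SAFE'))"]
    print(run_fork_safe(probe).stdout.strip())
```

Setting the variable in the shell before launching has the same effect; the wrapper is only useful when the launch is already scripted.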