Open mppf opened 7 years ago
When I investigated this issue in detail (in September 2015), I narrowed the problem down to memory in the .data segment no longer being accessible after the fork. In particular, this was causing problems with a variable called __fork_generation_pointer in glibc. I was able to generate similar core dumps with a C program that applied madvise(MADV_DONTFORK) to the data segment and then called fork().
If this is the only issue, we might be able to solve the problem if we can avoid registering the data segment.
Also, it's not the arguments to fork. We already allocate those with the system allocator (even in a hugepages configuration).
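For anyone who wants to see the failure mode in isolation, here is a minimal sketch of that kind of experiment. It is not the original test program: it marks a freshly mmap'd anonymous page (rather than the data segment) with MADV_DONTFORK, and the function name `child_faults_on_dontfork_page` is made up for illustration. It assumes Linux with glibc.

```c
#define _DEFAULT_SOURCE /* exposes MADV_DONTFORK and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the forked child died with SIGSEGV
 * when touching a MADV_DONTFORK page, 0 if it did not, -1 on setup error. */
int child_faults_on_dontfork_page(void) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;

    /* Stand-in for NIC-registered memory: one private anonymous page. */
    char *buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) return -1;
    strcpy(buf, "parent data");

    /* Ask the kernel not to map this range into children created by fork(). */
    if (madvise(buf, (size_t)page, MADV_DONTFORK) != 0) {
        munmap(buf, (size_t)page);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, (size_t)page);
        return -1;
    }
    if (pid == 0) {
        /* Child: the page is not mapped here, so this read should fault. */
        volatile char *p = buf;
        char c = p[0];
        (void)c;
        _exit(0); /* only reached if the read unexpectedly succeeded */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(buf, (size_t)page);
        return -1;
    }
    munmap(buf, (size_t)page);
    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV) ? 1 : 0;
}
```

Calling `child_faults_on_dontfork_page()` from a small `main` and printing the result is enough to try it. The parent can still read the page after the fork; only the child loses access, which matches the symptom of a segfault in the forked process.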
The AI workflow application I have been working on relies on the Spawn module, and so we need to unload hugepages in order for it to work with the ugni comm layer. The program is not communication-bound today, so this is not a big deal, but that may not always be true. This will likely become an important issue for this application in the future.
cc @gbtitus
Have you tried GNI_CDM_MODE_FORK_PARTCOPY? At least for this app, if switching to 4k pages with NOCOPY lets the app work, it seems likely to me that PARTCOPY will work as well.
Thanks for the suggestion @cassella - I will try that out.
Basically, registering hugepage-based regions for everything is incompatible with spawning that doesn't use vfork(). So currently, if you use CHPL_COMM=ugni and do the style of spawning that doesn't use vfork(), your only option is that comm layer's so-called minimal-registered-memory mode, in which you run without a hugepage module loaded (see here).
Another alternative on XC-based systems would be for the ugni comm layer to register memory as it does now, but without using hugepages. That would oversubscribe the NIC's TLB cache but the performance cost of doing so might be less than that of minimal-registered-memory mode. As luck would have it, we are investigating precisely this, in #10262.
Can't fork when memory is registered with the NIC under ugni. It results in a segfault in the forked process. This is because the Chapel ugni comm layer uses the GNI_CDM_MODE_FORK_NOCOPY flag.
Note that Spawn calls that forward all output (rather than piping/capturing it) use vfork and so don't have this problem.
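To illustrate why the vfork path is not affected, here is a minimal sketch, assuming Linux with glibc. It is not the Chapel runtime's code: the function name `spawn_with_vfork` and the `/bin/sh -c "exit 0"` command are made up for the example. A vfork child runs in the parent's address space until it calls exec, so no mappings are duplicated and a MADV_DONTFORK region does not get in the way.

```c
#define _DEFAULT_SOURCE /* exposes vfork, MADV_DONTFORK, MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: marks a page MADV_DONTFORK, then spawns a trivial
 * command with vfork + exec. Returns the child's exit code, or -1 on error. */
int spawn_with_vfork(void) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;

    /* Stand-in for NIC-registered memory that must not be copied on fork. */
    void *buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) return -1;
    if (madvise(buf, (size_t)page, MADV_DONTFORK) != 0) {
        munmap(buf, (size_t)page);
        return -1;
    }

    pid_t pid = vfork();
    if (pid < 0) {
        munmap(buf, (size_t)page);
        return -1;
    }
    if (pid == 0) {
        /* Child: only exec or _exit is safe here, since the parent's
         * address space is shared until the exec succeeds. */
        execl("/bin/sh", "sh", "-c", "exit 0", (char *)NULL);
        _exit(127); /* exec failed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        munmap(buf, (size_t)page);
        return -1;
    }
    munmap(buf, (size_t)page);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The trade-off is that the child can do nothing between vfork and exec except call exec or _exit, which is presumably why only the Spawn calls that forward all output can take this path.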
For 1.12, we decided to halt if a user is about to call fork with NIC registered memory. See PR #2539.
Long term, we need a better solution. We thought about using GNI_CDM_MODE_FORK_FULLCOPY instead of GNI_CDM_MODE_FORK_NOCOPY. However, this means that the parent will duplicate the registered memory at the time of the fork. That might involve allocating and duplicating quite a lot of memory, so we might get a really slow fork or an OOM.
Perhaps recent comm=ugni dynamic registration improvements are another way to solve this issue.
Steps to Reproduce
First, comment out the UGNI error in Spawn.chpl's spawn function. Then: