rui314 / mold

Mold: A Modern Linker 🦠
MIT License

Compilation on EL-family Linux with GCC leaves `ld.mold` zombies. #1292

Closed v-instrumentix closed 1 week ago

v-instrumentix commented 1 week ago

Please use the Dockerfile pasted below. It installs gcc, cmake, make, and mold, defines a trivial CMake project that uses mold as the linker, and provides a shell script that runs cmake and cmake --build and then checks for leftover mold processes.

Steps to replicate are as follows (assuming Dockerfile in current directory):

Tested with GCC 11, 12, and 13, using either Ninja or Make.

[root@969b9d96458c ~]# bash /x/b
-- Configuring done (0.3s)
-- Generating done (0.0s)
-- Build files have been written to: /x/build
[ 50%] Building CXX object CMakeFiles/a.dir/main.cpp.o
[100%] Linking CXX executable a
[100%] Built target a
root          98  0.0  0.0      0     0 pts/0    Z+   16:06   0:00 [ld.mold] <defunct>
root         113  0.0  0.0  16452  1096 pts/0    S+   16:06   0:00 grep mold
[root@969b9d96458c ~]# bash /x/b
-- Configuring done (0.0s)
-- Generating done (0.0s)
-- Build files have been written to: /x/build
[ 50%] Building CXX object CMakeFiles/a.dir/main.cpp.o
[100%] Linking CXX executable a
[100%] Built target a
root          98  3.0  0.0      0     0 pts/0    Z    16:06   0:00 [ld.mold] <defunct>
root         135  0.0  0.0      0     0 pts/0    Z+   16:06   0:00 [ld.mold] <defunct>
root         150  0.0  0.0  16452  1272 pts/0    S+   16:06   0:00 grep mold
[root@969b9d96458c ~]# bash /x/b
-- Configuring done (0.0s)
-- Generating done (0.0s)
-- Build files have been written to: /x/build
[ 50%] Building CXX object CMakeFiles/a.dir/main.cpp.o
[100%] Linking CXX executable a
[100%] Built target a
root          98  1.0  0.0      0     0 pts/0    Z    16:06   0:00 [ld.mold] <defunct>
root         135  1.5  0.0      0     0 pts/0    Z    16:06   0:00 [ld.mold] <defunct>
root         172  0.0  0.0      0     0 pts/0    Z+   16:06   0:00 [ld.mold] <defunct>
root         187  0.0  0.0  16452  1112 pts/0    S+   16:06   0:00 grep mold
[root@969b9d96458c ~]# logout

Dockerfile for building MRE:

ARG SOURCE_IMAGE="almalinux:8"
FROM ${SOURCE_IMAGE}
#
# Install minimal set of utils
#
RUN dnf install -y gcc-toolset-13-gcc-c++ cmake make
RUN dnf install -y epel-release
RUN dnf install -y mold
#RUN dnf install -y --enablerepo powertools ninja-build
WORKDIR /x
#
# Minimal CMake file using MOLD
#
RUN <<EOF cat > CMakeLists.txt
cmake_minimum_required(VERSION 3.25)
project(molddefunc)
add_link_options(-fuse-ld=mold)
add_executable(a main.cpp)
EOF
#
# Script wrapping build steps
RUN <<EOF cat > b
#!/usr/bin/env bash
set -Eeou pipefail
source /opt/rh/gcc-toolset-13/enable
echo 'int main() {}' > /x/main.cpp
cmake -B/x/build -S/x
cmake --build /x/build
ps axuww | grep mold
EOF
CMD ["su", "-"]

rui314 commented 1 week ago

mold spawns a child process to do the actual linking in order to hide the latency of process exit. Even exit() takes a few hundred milliseconds for a process with a large memory image, such as the linker, so the parent process exits as soon as the child has finished writing the output and leaves the child to terminate in the background.
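To make the pattern concrete, here is a minimal Python sketch of "do the work in a child and return as soon as it reports completion". It is not mold's actual code: the function name `link_in_child`, the pipe-based notification, and the `teardown_seconds` sleep (a stand-in for slow address-space teardown) are all assumptions made for illustration.

```python
import os
import time

def link_in_child(do_link, teardown_seconds=1.0):
    """Run do_link() in a forked child; return (child_pid, ok) once it reports completion."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do the real work, then tell the parent the output is ready.
        os.close(read_fd)
        try:
            do_link()
            os.write(write_fd, b"1")
        finally:
            os.close(write_fd)
            # Hypothetical stand-in for the slow teardown of a large process.
            time.sleep(teardown_seconds)
            os._exit(0)
    # Parent: block only until the child signals completion (or closes the pipe).
    os.close(write_fd)
    ok = os.read(read_fd, 1) == b"1"
    os.close(read_fd)
    return pid, ok
```

A real tool would exit in the parent at the point where this function returns. The child is then still running and becomes an orphan, which is why it matters who ends up calling waitpid on it.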

Zombie processes are processes that have already terminated, but their parents haven't checked their exit statuses using waitpid.
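The zombie state can be observed directly. The following Python sketch assumes Linux with /proc mounted; the helper names `proc_state` and `demo_zombie` are made up for this example. It forks a child that exits immediately, waits for the exit without reaping it, reads the process state, and only then calls waitpid.

```python
import os

def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the pid is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The state field follows the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]

def demo_zombie():
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # Child: terminate immediately.
    # Wait until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    before = proc_state(pid)   # Child has exited but not been reaped.
    os.waitpid(pid, 0)         # Reap it; the process table slot is freed.
    after = proc_state(pid)
    return before, after
```

On such a system the first value should be "Z", matching the `Z` column in the ps output above, and the second should be None once the parent has reaped the child.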

Usually, when a parent process terminates before its children, the children are adopted by process number 1 (usually /sbin/init), and init calls waitpid (or possibly wait) to reclaim process table slots occupied by these orphaned processes.

bash does that too when it is invoked as pid 1. However, su doesn't do that.

That's why zombie processes remain in your docker environment. This issue is not limited to mold; it is generally assumed in Unix that orphan processes are reclaimed by pid 1, so other programs may also leave zombies in your environment.
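The reaping duty described above amounts to a small loop. This Python sketch uses a hypothetical function name, `reap_all_children`, and only shows the waitpid side: an ordinary process sees just its own children, whereas orphans are re-parented to pid 1 (or a designated subreaper) by the kernel, which the sketch does not model.

```python
import os

def reap_all_children():
    """Block until every child of this process has exited; return {pid: exit_code}."""
    reaped = {}
    while True:
        try:
            # -1 means "any child"; this is what an init process does in a loop.
            pid, status = os.waitpid(-1, 0)
        except ChildProcessError:
            # No children left to wait for.
            return reaped
        # Record the exit code, or None if the child was killed by a signal.
        reaped[pid] = os.WEXITSTATUS(status) if os.WIFEXITED(status) else None
```

A process that runs as pid 1 without such a loop, as su does in the report, never frees these slots, so each build leaves one more defunct entry behind.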

There are two "solutions" to the problem:

v-val commented 1 week ago

Thank you for your exemplary, beautiful answer.