Open EricLin89 opened 4 months ago
Out of curiosity, are you using preexec_fn
in your Popen()
call? Also, do you have threads in your application?
Out of curiosity, are you using
preexec_fn
in yourPopen()
call? Also, do you have threads in your application?
preexec_fn
is used, the function passed to preexec_fn
is as follows:def _switch_user(self):
if self.group_id:
os.setgid(self.group_id)
if self.user_id:
os.setuid(self.user_id)
which allows the caller to switch to a specific user when calling the command. In this case, the user_id and group_id are not specified, so preexec_fn
is passed but didn't actually do anything.
Do you know any known issues related to use cases when using with preexec_fn
and threads? @shmuelzon
I ran into a similar issue myself, found this ticket without any resolution and started digging deeper.
Do you know any known issues related to use cases when using with preexec_fn and threads?
Apparently, yes. And it's stated in many places, had I bothered to RTFM. See the Popen documentation:
Warning The preexec_fn parameter is NOT SAFE to use in the presence of threads in your application. The child process could deadlock before exec is called.
There are also subtle hints in the Python code which basically say "using preexec_fn
will cause a deadlock but you asked for it". This Stack Overflow answer explains pretty well why this is a problem.
I ran into a similar issue myself, found this ticket without any resolution and started digging deeper.
Do you know any known issues related to use cases when using with preexec_fn and threads?
Apparently, yes. And it's stated in many places, had I bothered to RTFM. See the Popen documentation:
Warning The preexec_fn parameter is NOT SAFE to use in the presence of threads in your application. The child process could deadlock before exec is called.
There are also subtle hints in the Python code which basically say "using
preexec_fn
will cause a deadlock but you asked for it". This Stack Overflow answer explains pretty well why this is a problem.
Ah... Thanks. I tried to look for known deadlock issues all over the internet but failed to notice that obvious warning on the official documentation... I think I would fix the problem by removing the use of preexec_fn... Again, thank you so much. @shmuelzon
I have code calling into pip
and it sometimes gets stuck like this . I don't use preexec_fn
.
I am using asyncio.subprocess_shell
, but this code path is shared between sync and async code.
self = <Popen: returncode: None args: '/tmp/pytest-of-jenkins/pytest-30443/test_exe...>
args = ['/bin/sh', '-c', '/tmp/pytest-of-jenkins/pytest-30443/test_executor_creation_and_ven0/executors/venv/dac033c09055cd63...ython -m pip install pkg2 inmanta-core==15.0.0.dev0 --index-url /tmp/pytest-of-jenkins/pytest-30443/pip_index0/simple']
executable = b'/bin/sh', preexec_fn = None, close_fds = True, pass_fds = ()
cwd = '/'
env = {'ASYNC_TEST_TIMEOUT': '60', ...}
startupinfo = None, creationflags = 0, shell = True, p2cread = -1, p2cwrite = -1
c2pread = 27, c2pwrite = 106, errread = -1, errwrite = 106
restore_signals = True, gid = None, gids = None, uid = None, umask = -1
start_new_session = False, process_group = -1
def _execute_child(self, args, executable, preexec_fn, close_fds,
pass_fds, cwd, env,
startupinfo, creationflags, shell,
p2cread, p2cwrite,
c2pread, c2pwrite,
errread, errwrite,
restore_signals,
gid, gids, uid, umask,
start_new_session, process_group):
"""Execute program (POSIX version)"""
if isinstance(args, (str, bytes)):
args = [args]
elif isinstance(args, os.PathLike):
if shell:
raise TypeError('path-like args is not allowed when '
'shell is true')
args = [args]
else:
args = list(args)
if shell:
# On Android the default shell is at '/system/bin/sh'.
unix_shell = ('/system/bin/sh' if
hasattr(sys, 'getandroidapilevel') else '/bin/sh')
args = [unix_shell, "-c"] + args
if executable:
args[0] = executable
if executable is None:
executable = args[0]
sys.audit("subprocess.Popen", executable, args, cwd, env)
if (_USE_POSIX_SPAWN
and os.path.dirname(executable)
and preexec_fn is None
and not close_fds
and not pass_fds
and cwd is None
and (p2cread == -1 or p2cread > 2)
and (c2pwrite == -1 or c2pwrite > 2)
and (errwrite == -1 or errwrite > 2)
and not start_new_session
and process_group == -1
and gid is None
and gids is None
and uid is None
and umask < 0):
self._posix_spawn(args, executable, env, restore_signals,
p2cread, p2cwrite,
c2pread, c2pwrite,
errread, errwrite)
return
orig_executable = executable
# For transferring possible exec failure from child to parent.
# Data format: "exception name:hex errno:description"
# Pickle is not used; it is complex and involves memory allocation.
errpipe_read, errpipe_write = os.pipe()
# errpipe_write must not be in the standard io 0, 1, or 2 fd range.
low_fds_to_close = []
while errpipe_write < 3:
low_fds_to_close.append(errpipe_write)
errpipe_write = os.dup(errpipe_write)
for low_fd in low_fds_to_close:
os.close(low_fd)
try:
try:
# We must avoid complex work that could involve
# malloc or free in the child process to avoid
# potential deadlocks, thus we do all this here.
# and pass it to fork_exec()
if env is not None:
env_list = []
for k, v in env.items():
k = os.fsencode(k)
if b'=' in k:
raise ValueError("illegal environment variable name")
env_list.append(k + b'=' + os.fsencode(v))
else:
env_list = None # Use execv instead of execve.
executable = os.fsencode(executable)
if os.path.dirname(executable):
executable_list = (executable,)
else:
# This matches the behavior of os._execvpe().
executable_list = tuple(
os.path.join(os.fsencode(dir), executable)
for dir in os.get_exec_path(env))
fds_to_keep = set(pass_fds)
fds_to_keep.add(errpipe_write)
self.pid = _fork_exec(
args, executable_list,
close_fds, tuple(sorted(map(int, fds_to_keep))),
cwd, env_list,
p2cread, p2cwrite, c2pread, c2pwrite,
errread, errwrite,
errpipe_read, errpipe_write,
restore_signals, start_new_session,
process_group, gid, gids, uid, umask,
preexec_fn, _USE_VFORK)
self._child_created = True
finally:
# be sure the FD is closed no matter what
os.close(errpipe_write)
self._close_pipe_fds(p2cread, p2cwrite,
c2pread, c2pwrite,
errread, errwrite)
# Wait for exec to fail or succeed; possibly raising an
# exception (limited in size)
errpipe_data = bytearray()
while True:
> part = os.read(errpipe_read, 50000)
E KeyboardInterrupt
Bug report
Bug description:
I have a python program (let's call it my_prog.py) that starts multiple threads of periodic tasks in one process. Some periodic tasks involvs writing to a file and calls "setfacl" command after the file is written.
I recently observed one of the threads hangs at subprocess.Popen() when trying to calling setfacl command. I managed to print out the stacktrace of all the threads by using py-spy dump. Here are the detailed infos:
Thread 44167 (idle): "Thread-get-lldp-ip"
... some irrelevant stacktraces...
child process:
Thread 89047 (idle): "Thread-get-lldp-ip"
... some irrelevant stacktraces...
Thread 0x14F2A5F37640 (idle): "ComponentManager"
... some irrelevant stacktraces...