Open huonw opened 2 years ago
Mm... sorry for the trouble.
This is because our local process sandbox doesn't use graceful shutdown for processes, and instead kills them immediately with SIGKILL.
We do actually have an implementation of graceful shutdown: https://github.com/pantsbuild/pants/blob/db07a12dd55dd5d9bda5518ac91961720582f7c6/src/rust/engine/process_execution/src/children.rs#L12-L22 ... but it is only used for interactive processes currently.
So it sounds like we should use that code with local process sandbox cleanup, too?
Yea.
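For illustration, the "signal first, kill later" idea discussed above can be sketched in Python. This is not the Rust code linked earlier, and the names `shutdown_gracefully` and `grace_period` are hypothetical; it only assumes a POSIX system where a child can be sent SIGINT and, failing that, SIGKILL.

```python
import signal
import subprocess
import sys


def shutdown_gracefully(proc: subprocess.Popen, grace_period: float = 3.0) -> int:
    """Send SIGINT, wait up to `grace_period` seconds, then fall back to SIGKILL.

    Returns the child's exit code (negative signal number if it died from a signal).
    """
    if proc.poll() is None:
        # Give the child a chance to run its own cleanup.
        proc.send_signal(signal.SIGINT)
    try:
        return proc.wait(timeout=grace_period)
    except subprocess.TimeoutExpired:
        # The child did not exit in time: kill it unconditionally (SIGKILL on POSIX).
        proc.kill()
        return proc.wait()


def spawn(code: str) -> subprocess.Popen:
    """Start a Python child and block until it reports that it is ready."""
    child = subprocess.Popen([sys.executable, "-c", code], stdout=subprocess.PIPE)
    child.stdout.readline()
    return child


# A child that exits cleanly when it receives SIGINT.
COOPERATIVE = (
    "import signal, sys, time\n"
    "signal.signal(signal.SIGINT, lambda *args: sys.exit(0))\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)

# A child that ignores SIGINT, so only SIGKILL stops it.
STUBBORN = (
    "import signal, time\n"
    "signal.signal(signal.SIGINT, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)
```

With these two children, `shutdown_gracefully(spawn(COOPERATIVE))` returns as soon as the child handles the signal, while `shutdown_gracefully(spawn(STUBBORN), grace_period=0.5)` waits out the grace period and then kills the child.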
I've got another use case for graceful shutdown of test targets in particular. I stand up database fixtures using docker containers during test runs (using pytest). Under normal circumstances, these are cleaned up when the tests exit, but when the tests restart due to file system changes, it can leave lingering docker containers running.
Pytest naturally handles SIGINT and runs fixture cleanup by default, so sending SIGINT should resolve my issue without any additional code changes.
A useful additional feature would be an adjustable delay between the SIGINT and SIGKILL signals. Docker containers can take a while to tear down, so configuring the timeout in the BUILD files may be necessary to make this really work for my use case.
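To make the difference concrete, here is a small Python sketch of why SIGINT lets cleanup run while SIGKILL does not. It does not use pytest or docker: a `try`/`finally` in a child process stands in for fixture teardown, and a marker file stands in for "the container was removed". The names `CHILD` and `cleanup_ran_after` are made up for this example.

```python
import os
import signal
import subprocess
import sys
import tempfile

# Child script: the `finally` block plays the role of fixture teardown.
CHILD = """
import signal, sys, time
# Ensure SIGINT raises KeyboardInterrupt even if the parent ignores SIGINT.
signal.signal(signal.SIGINT, signal.default_int_handler)
marker = sys.argv[1]
try:
    print("ready", flush=True)
    time.sleep(60)
finally:
    open(marker, "w").close()  # "cleanup" leaves a marker file behind
"""


def cleanup_ran_after(sig: signal.Signals) -> bool:
    """Start the child, send it `sig`, and report whether its cleanup ran."""
    with tempfile.TemporaryDirectory() as tmp:
        marker = os.path.join(tmp, "cleaned_up")
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD, marker], stdout=subprocess.PIPE
        )
        child.stdout.readline()  # wait until the child is inside the try block
        child.send_signal(sig)
        child.wait()
        child.stdout.close()
        return os.path.exists(marker)
```

Under these assumptions, `cleanup_ran_after(signal.SIGINT)` reports that the cleanup ran, because the interrupt surfaces as a `KeyboardInterrupt` and the `finally` block executes. `cleanup_ran_after(signal.SIGKILL)` reports that it did not, because the process is terminated without running any of its own code.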
[Orig title: './pants fmt ::' with black leaves orphaned multiprocessing processes when retried due to filesystem changes: - Benjy]
Describe the bug
We use pants on a moderate Python mono-repo. It's large enough that ./pants fmt :: takes a bit of time to run black. If we touch a file during that time (even without making changes, just updating the mtime), black retries (yay), but seems to leave a handful of multiprocessing background workers around forever, orphaned and never exiting. These processes build up and can exhaust machine resource limits.
(This is exacerbated by #16727, since that issue means that a repo may have black running in the background constantly, and thus makes it incredibly likely that some of those invocations will be interrupted.)
Reproducer: https://gist.github.com/huonw/4718b6d146634a66fe9c5e906d76e501
That script creates a whole lot of files so that black runs for a while, and then touches a file while the ./pants fmt :: invocation runs. It runs pgrep -fl pants to show the pants processes running before/after. I ran pkill -fl pants first, and the output was something like:
Note how the 3 separate lists of processes (starting with 16572 pantsd ...) change:
./pants fmt :: with retry: there's a bunch of extra processes associated with a named cache
If the touch files/f000.py is commented out, all of the process lists are the same as the first (just pantsd).
Pants version 2.14.0rc0
OS macOS
Additional info
Thanks for pants!