mtreinish / stestr

A parallel Python test runner built around subunit
Apache License 2.0
28 stars 33 forks source link

Support for clean termination on receiving SIGNIT #315

Open arun-n2020 opened 3 years ago

arun-n2020 commented 3 years ago

Problem statement/Query In case of a Ctrl-C (what we are actually looking for is an "ABORT" to the Jenkins job running an stestr instance with concurrency), do the main stestr as well as the running workers terminate cleanly? Couldn't find any explicit code in stestr, cascading the signal to the back-end worker processes, but would appreciate suggestions of any reference material that show if this is not needed in the first place :smile:

Additional context Maybe also related to #194

mtreinish commented 3 years ago

Hmm. I think it sort of works today but definitely not cleanly, since sending a signal to terminate the parent process will close the pipe used for interprocess communication causes all the workers to raise an exception when they try to write to the pipe to report the current test status.

I do have have a neglected WIP PR https://github.com/mtreinish/stestr/pull/271 which adds an option to use stdlib multiprocessing instead of subprocess which at least for POSIX systems it should sort of work because it changes how children processes are created to use fork() which POSIX dictates that signals are propagated to children created via fork(). Although this probably will only always work on linux, because I know Python >=3.8 doesn't use fork() for multiprocessing by default on macOS and there might be other edge cases.

I think we probably do want to add some signal handling to stestr run to make the behavior explicit. I think the only way we can do it is register a signal handler in _run_tests after we get a list of processes from the test_processor fixture: https://github.com/mtreinish/stestr/blob/main/stestr/commands/run.py#L667-L713 and on receiving a signal we forward it to each child process. We might also want to add signal handling to the subunit runner here: https://github.com/mtreinish/stestr/blob/main/stestr/subunit_runner/run.py to enable a graceful exit (ie finish the current test or something) for the children processes.

The one potential wrinkle with this plan I think is windows because signals behave differently there. But, that's probably not a concern because if windows doesn't support what we end up with there is nothing we can do about it.

arun-n2020 commented 3 years ago

Thanks Matt, this was very helpful! I'll explore a bit more in this direction and get back to you👍