Currently, if a worker process is killed, whether by the OOM killer or by the OS for some other reason, the software silently hangs instead of failing with an error. This PR makes it fail with an explicit error so that the user knows to rerun with fewer multiprocessing workers and avoid OOM issues.
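For illustration only, here is a minimal sketch of one way to turn a dead worker into an error rather than a hang. It is not the code changed in this PR. It assumes the work can be submitted through Python's `concurrent.futures.ProcessPoolExecutor`, which raises `BrokenProcessPool` when a worker exits abruptly. The names `run_tasks`, `square`, and `crash` are hypothetical and exist only for this example.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def square(x):
    """Hypothetical task that finishes normally."""
    return x * x


def crash(x):
    """Hypothetical task that dies abruptly, standing in for an OOM kill."""
    os._exit(1)  # exit immediately, without raising or cleaning up


def run_tasks(fn, items, max_workers=2, mp_context=None):
    """Run fn over items in worker processes; raise if a worker dies."""
    with ProcessPoolExecutor(max_workers=max_workers, mp_context=mp_context) as pool:
        futures = [pool.submit(fn, item) for item in items]
        try:
            # result() raises BrokenProcessPool if a worker was terminated
            return [f.result() for f in futures]
        except BrokenProcessPool as exc:
            raise RuntimeError(
                "A worker process died unexpectedly (possibly killed by the OS "
                "for running out of memory). Try rerunning with fewer workers."
            ) from exc
```

With this sketch, `run_tasks(square, [1, 2, 3])` returns the results as usual, while `run_tasks(crash, [1, 2, 3])` raises a `RuntimeError` instead of blocking forever. On platforms that start workers with `spawn`, the calls should sit under an `if __name__ == "__main__":` guard.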
PR Checklist
[ ] My PR is less than 500 lines of code
[ ] I have added sufficient comment as docstrings in my code
[ ] I have made corresponding changes to the documentation
[ ] I have written unit-tests to test all of my code