The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark, and can be used from pure Python code.
The problem
If you are using Petastorm in process mode and the main process dies unexpectedly, the workers keep running until the user kills them manually. In some environments this can be quite tricky, especially if you don't have SSH access to the box.
The solution
I propose a solution where each worker runs a separate thread that periodically checks whether the main process is still running. If the worker sees that the main process has died, the worker exits.