MIC-DKFZ / batchgenerators

A framework for data augmentation for 2D and 3D image classification and segmentation
Apache License 2.0

Suggestion of multiprocess mechanism in MultiThreadedAugmenter #100

Open SeanCho1996 opened 2 years ago

SeanCho1996 commented 2 years ago

Hi, I noticed that the shutdown procedure in MultiThreadedAugmenter uses the terminate() method of Process to end the child processes by sending SIGTERM. https://github.com/MIC-DKFZ/batchgenerators/blob/01f225d843992eec5467c109875accd6ea955155/batchgenerators/dataloading/multi_threaded_augmenter.py#L273-L275

In my project, the main process has a SIGTERM handler installed that is meant to stop the process at the end of training, as shown below:

    def _sigterm_handler(_signo, _stack_frame):
        logger.warn("Terminal signal received: %s, %s" % (_signo, _stack_frame))
        stop_worker()
        exit(0)

However, this clashes with MultiThreadedAugmenter's terminate(): because the child processes are created by forking, they inherit the main process's signal handlers, including my SIGTERM handler. The SIGTERM that MultiThreadedAugmenter sends to shut down its workers is therefore caught by that handler, which ends up terminating my training.
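For anyone who wants to reproduce this outside of batchgenerators, here is a minimal standalone sketch of the mechanism (it assumes the default "fork" start method on Linux; the worker and handler names are made up for the demo and are not from the library):

    import multiprocessing as mp
    import os
    import signal
    import sys
    import time


    def _sigterm_handler(signo, frame):
        # In the real training script this would call stop_worker() and exit(0).
        print(f"SIGTERM handler fired in pid {os.getpid()} (parent is {os.getppid()})")
        sys.exit(0)


    def worker():
        # Stand-in for the producer() loop; it just waits to be terminated.
        time.sleep(60)


    if __name__ == "__main__":
        signal.signal(signal.SIGTERM, _sigterm_handler)

        p = mp.get_context("fork").Process(target=worker)
        p.start()
        time.sleep(0.5)
        p.terminate()   # sends SIGTERM to the child -> the inherited handler runs there
        p.join()

Running this prints the handler message from the child's pid, which is exactly the interference described above.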

A temporary workaround I came up with is to reset the child process's SIGTERM handler to the default (signal.SIG_DFL), so that the SIGTERM sent to a child no longer triggers the handler inherited from the main process. This means adding one line at the beginning of the producer() function:

    def producer(queue, data_loader, transform, thread_id, seed, abort_event, wait_time: float = 0.02):
        signal.signal(signal.SIGTERM, signal.SIG_DFL)  # requires `import signal` in this module
        ...
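For comparison, here is the same standalone sketch with the suggested reset applied inside the worker; with SIG_DFL the child simply dies from the SIGTERM (exit code -15) and the handler inherited from the parent never runs (again an illustrative sketch, not the library code):

    import multiprocessing as mp
    import os
    import signal
    import sys
    import time


    def _sigterm_handler(signo, frame):
        print(f"SIGTERM handler fired in pid {os.getpid()}")
        sys.exit(0)


    def worker():
        # Reset SIGTERM to the default action so that terminate() kills this
        # process without invoking the handler inherited from the parent.
        signal.signal(signal.SIGTERM, signal.SIG_DFL)
        time.sleep(60)


    if __name__ == "__main__":
        signal.signal(signal.SIGTERM, _sigterm_handler)

        p = mp.get_context("fork").Process(target=worker)
        p.start()
        time.sleep(0.5)
        p.terminate()   # child now exits via the default SIGTERM action
        p.join()
        print(f"child exit code: {p.exitcode}")  # typically -15 (killed by SIGTERM)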

Would it be possible to add a similar reset to the source code, so that signal handlers inherited from the main process do not interfere with the shutdown of the child processes?

Thank you