ImagingDataCommons / highdicom

High-level DICOM abstractions for the Python programming language
https://highdicom.readthedocs.io
MIT License

Multiprocessing for frame encoding in Segmentation constructor #245

Closed CPBridge closed 1 year ago

CPBridge commented 1 year ago

Add the ability to use concurrency to encode frames when creating segmentations.

When using encapsulated transfer syntaxes, the process of encoding frames takes a significant amount of the total time for creation of a Segmentation object. For large multiframe segmentations such as those of WSIs, this can take several minutes. Parallelising the encoding of frames is an obvious place for significant performance gains.

In this PR, I add the ability to use multiprocessing for frame encoding using the Python standard library's concurrent.futures module. The user specifies a workers parameter governing the number of worker processes to spawn (the default remains no worker processes). Alternatively, they may "bring their own" worker pool by providing an instance of any concurrent.futures.Executor. This has two potential benefits. First, it allows a worker pool to be re-used in programs where a large number of segmentations need to be created, saving the overhead of setting up a new pool each time. Second, it gives the user more control over the multiprocessing context if they want it. For example, in principle, they could pass a ThreadPoolExecutor to use multi-threading rather than multiprocessing.
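To illustrate the pattern described above, here is a minimal sketch of a workers argument that accepts either an integer or a caller-supplied Executor. It is not the code in this PR: encode_frame and encode_frames are hypothetical names, and zlib.compress stands in for a real frame encoder.

```python
import zlib
from concurrent.futures import Executor, ProcessPoolExecutor
from typing import List, Sequence, Union


def encode_frame(frame: bytes) -> bytes:
    """Hypothetical stand-in for a real frame encoder; uses zlib here."""
    return zlib.compress(frame)


def encode_frames(
    frames: Sequence[bytes],
    workers: Union[int, Executor] = 0,
) -> List[bytes]:
    """Encode frames serially, in a new process pool, or in a supplied pool."""
    if isinstance(workers, Executor):
        # Caller-supplied pool: use it, and leave shutdown to the caller
        # so the pool can be re-used across many calls.
        return list(workers.map(encode_frame, frames))
    if workers < 0:
        raise ValueError("workers must be a non-negative integer.")
    if workers == 0:
        # Default: no worker processes, encode in the calling process.
        return [encode_frame(f) for f in frames]
    # Spawn a temporary pool; Executor.map returns results in input order.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(encode_frame, frames))
```

Because the function only relies on Executor.map, a ThreadPoolExecutor can be passed in the same way as a ProcessPoolExecutor. Note that on platforms that start processes with the spawn method, code that creates a ProcessPoolExecutor needs to run under an `if __name__ == "__main__":` guard.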

I also made a few other tweaks to the segmentation array processing for further efficiency gains. Most notably, I adjusted the _get_segment_pixel_array method to return the array for a single frame and a single segment (as opposed to all frames and a single segment). This makes the looping logic a bit simpler, and it turns out to be slightly faster, presumably because it avoids the need to allocate memory for large arrays.

CPBridge commented 1 year ago

Yes, I have definitely thought about decoding with multiple workers too. One for the to-do list.