ScalableCytometryImageProcessing / SCIP

Scalable Cytometry Image Processing (SCIP) is an open-source tool that implements an image processing pipeline on top of Dask, a distributed computing framework written in Python. SCIP performs projection, illumination correction, image segmentation and masking, and feature extraction.
https://scalable-cytometry-image-processing.readthedocs.io/en/latest/
GNU General Public License v3.0

Benchmarking scalability of data loading + masking #17

Closed MaximLippeveld closed 3 years ago

MaximLippeveld commented 3 years ago

To show that horizontal scaling is useful, we want to measure runtime on a dataset under increasing parallelization. Concretely, we want to measure runtime in seconds as a function of the number of executors on the PBSCluster. The hypothesis is that runtime initially decreases as more executors are used, but starts increasing again once the overhead becomes significant.

The number of executors is governed by two parameters: n_workers and processes. The former defines how many jobs are spawned (one job = one prism node), the latter defines into how many processes each job is split. The number of executors then equals n_workers * processes.
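As a quick illustration of that relation, the sketch below enumerates a grid of configurations and the executor count each one yields. The function name `executor_grid` and the example values are assumptions made for illustration; they are not part of SCIP.

```python
from itertools import product


def executor_grid(n_workers_options, processes_options):
    """Return (n_workers, processes, executors) for every combination.

    executors = n_workers * processes, as defined in this issue.
    """
    return [
        (n_workers, processes, n_workers * processes)
        for n_workers, processes in product(n_workers_options, processes_options)
    ]


if __name__ == "__main__":
    # Example values only; the real ranges depend on the cluster.
    for n_workers, processes, executors in executor_grid([1, 2, 4], [1, 2]):
        print(f"n_workers={n_workers} processes={processes} executors={executors}")
```

Note that different configurations can give the same executor count (for example 2 x 1 and 1 x 2), so it may be worth recording both parameters alongside the product.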

We want to write a script that launches the sip command for varying configurations and records the runtime of each run.
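A minimal sketch of such a script could look like the following. It is only an outline under stated assumptions: the issue does not specify how n_workers and processes are passed to the sip command, so `build_command` is a hypothetical placeholder that runs a no-op Python process and must be replaced with the real command line. The names `run_benchmark` and `build_command` are likewise made up for this example.

```python
import subprocess
import sys
import time
from itertools import product


def build_command(n_workers, processes):
    """Hypothetical placeholder for the real command line.

    The actual sip invocation and its options are not given in this issue,
    so this returns a no-op Python process instead of guessing at flags.
    """
    return [sys.executable, "-c", "pass"]


def run_benchmark(n_workers_options, processes_options, build_command, repeats=1):
    """Run the command once per configuration and repeat; record wall-clock time."""
    rows = []
    for n_workers, processes in product(n_workers_options, processes_options):
        for repeat in range(repeats):
            start = time.perf_counter()
            completed = subprocess.run(build_command(n_workers, processes))
            elapsed = time.perf_counter() - start
            rows.append({
                "n_workers": n_workers,
                "processes": processes,
                "executors": n_workers * processes,
                "repeat": repeat,
                "returncode": completed.returncode,
                "runtime_s": elapsed,
            })
    return rows


if __name__ == "__main__":
    # Example grid only; choose values that match the cluster.
    for row in run_benchmark([1, 2], [1, 2], build_command):
        print(row)
```

Timing with `time.perf_counter` around `subprocess.run` measures the full wall-clock duration of each launch, including any job queueing and cluster start-up, which is the overhead the hypothesis is about. Using `repeats` greater than 1 would give a rough idea of run-to-run variance.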