Scalable Cytometry Image Processing (SCIP) is an open-source tool that implements an image processing pipeline on top of Dask, a distributed computing framework written in Python. SCIP performs projection, illumination correction, image segmentation and masking, and feature extraction.
To show that horizontal scaling is useful, we want to measure runtime on a dataset under increasing parallelization. Concretely, we want to measure runtime in seconds as a function of the number of executors on the PBSCluster. The hypothesis is that runtime initially decreases as more executors are used, but starts increasing again once parallelization overhead becomes significant.
The number of executors is governed by two parameters: n_workers and processes. The former defines how many jobs are spawned (one job = one prism node); the latter defines into how many processes each job is split. The number of executors then equals n_workers * processes.
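The relationship between the two parameters and the executor count can be sketched as follows. This is only an illustration: the function name executor_grid and the example values are assumptions, not the configurations used in the actual experiment.

```python
from itertools import product


def executor_grid(n_workers_options, processes_options):
    """Return (n_workers, processes, executors) tuples, sorted by executor count."""
    grid = [
        (n_workers, processes, n_workers * processes)
        for n_workers, processes in product(n_workers_options, processes_options)
    ]
    # Sort by total executors first, then by number of jobs, so that
    # configurations with equal parallelism end up next to each other.
    return sorted(grid, key=lambda cfg: (cfg[2], cfg[0]))


if __name__ == "__main__":
    # Illustrative values only.
    for n_workers, processes, executors in executor_grid([1, 2, 4], [1, 2, 4]):
        print(f"n_workers={n_workers} processes={processes} executors={executors}")
```

Note that different combinations can yield the same executor count (for example 2 jobs of 1 process and 1 job of 2 processes), so it may be worth recording both parameters rather than only their product.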
We want to write a script that launches the scip command for varying configurations and records the runtime.
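A minimal sketch of such a script is given below, assuming each configuration is run once and wall-clock time is the quantity of interest. The helper names (time_command, run_benchmark, placeholder_command) and the CSV layout are hypothetical. How n_workers and processes are passed to the pipeline is not specified here, so the command is built by a caller-supplied function, and the placeholder only starts a no-op Python process.

```python
import csv
import subprocess
import sys
import time


def time_command(command):
    """Run a command to completion; return (wall-clock seconds, exit code)."""
    start = time.perf_counter()
    completed = subprocess.run(command, check=False)
    return time.perf_counter() - start, completed.returncode


def run_benchmark(configs, build_command, out_path):
    """Time one run per (n_workers, processes) pair and write the results to CSV."""
    rows = []
    for n_workers, processes in configs:
        runtime, returncode = time_command(build_command(n_workers, processes))
        rows.append({
            "n_workers": n_workers,
            "processes": processes,
            "executors": n_workers * processes,
            "runtime_s": runtime,
            "returncode": returncode,
        })
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]) if rows else [])
        if rows:
            writer.writeheader()
            writer.writerows(rows)
    return rows


def placeholder_command(n_workers, processes):
    # Hypothetical stand-in: replace with the real pipeline invocation and
    # whatever options it uses to receive n_workers and processes.
    return [sys.executable, "-c", "pass"]


if __name__ == "__main__":
    run_benchmark([(1, 1), (2, 2)], placeholder_command, "runtimes.csv")
```

Recording the exit code alongside the runtime makes it possible to discard failed runs afterwards, since a crashed job would otherwise look like a very fast one.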