Running subimage registration tasks on a single workstation can take prohibitively long for massive, cloud-based image datasets. We would like to distribute registration tasks among a cluster of worker nodes so that they execute in parallel.
The itk_dreg framework is built with distributed registration in mind via streaming readers and dask.delayed tasks. However, output serialization is not fully supported in ITK v5.4rc2 or earlier.
ITK v5.4rc3 wheels will include support for unbuffered ITK images, introduced in https://github.com/InsightSoftwareConsortium/ITK/pull/4270. That support will allow us to serialize itk.Images describing the oriented bounding boxes over which piecewise itk.Transform results are valid, which is required for distributed processing.
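To illustrate why serialization matters here: Dask moves task results between worker processes by pickling them, so any object a registration task returns has to survive a pickle round trip. The sketch below shows that check using only the standard library. `BlockResult` and `roundtrip` are hypothetical names invented for this example; they are stand-ins and not part of itk_dreg or ITK.

```python
import pickle
from dataclasses import dataclass


@dataclass
class BlockResult:
    """Hypothetical stand-in for a per-block registration result.

    Not an itk_dreg or ITK type; it only mimics "a bounding box plus a
    transform" so the round-trip check has something to operate on.
    """

    origin: tuple       # physical origin of the block's bounding box
    size: tuple         # extent of the bounding box in voxels
    translation: tuple  # toy transform parameters for this block


def roundtrip(obj):
    """Serialize and deserialize obj, as happens when a worker returns it."""
    return pickle.loads(pickle.dumps(obj))


result = BlockResult(
    origin=(0.0, 0.0, 0.0),
    size=(64, 64, 64),
    translation=(1.5, -0.5, 0.0),
)
restored = roundtrip(result)

# The restored object is a distinct copy carrying the same field values.
print(restored == result, restored is not result)
```

The same kind of round-trip check, applied to the real result objects, is presumably what the serialize_pairwise_result test exercises once ITK v5.4rc3 is installed.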
Steps to Investigate
When ITK v5.4rc3 is available on PyPI:
Update pyproject.toml and CI workflows in itk-dreg to use the updated ITK version
Run the localcluster and serialize_pairwise_result tests locally and verify that both tests pass
Re-enable the localcluster and serialize_pairwise_result tests in CI and verify that automated tests pass
For further testing:
Use dask.distributed.LocalCluster to mock a distributed cluster on your local system. Run serialized registration in an example notebook on a LocalCluster and verify that tasks are visible in the accompanying Dask dashboard.
Set up access to a distributed cluster and test distributed registration on the cluster. (xref: Coiled, ACCESS)
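The LocalCluster check described above can be sketched roughly as follows. This is a minimal sketch, not the itk_dreg test itself: `register_block` and `run_on_local_cluster` are hypothetical placeholders for real registration tasks, and the cluster settings are assumptions chosen to keep the example small.

```python
import dask
from dask.distributed import Client, LocalCluster


def register_block(block_index: int) -> dict:
    # Hypothetical stand-in for one subimage registration task.
    # A real task would read a subregion, register it, and return a result.
    return {"block": block_index, "status": "ok"}


def run_on_local_cluster(n_blocks: int = 4, processes: bool = False) -> list:
    # processes=False runs workers as threads in the current process, which
    # keeps the sketch lightweight. Pass processes=True (under a
    # `if __name__ == "__main__":` guard) so that results cross process
    # boundaries and serialization is actually exercised.
    with LocalCluster(
        n_workers=2,
        threads_per_worker=1,
        processes=processes,
        dashboard_address=":0",  # let Dask pick a free port for the dashboard
    ) as cluster, Client(cluster) as client:
        # Open this URL in a browser to watch tasks appear in the dashboard.
        print("Dask dashboard:", client.dashboard_link)
        tasks = [dask.delayed(register_block)(i) for i in range(n_blocks)]
        # With a Client active, dask.compute submits to the cluster by default.
        return list(dask.compute(*tasks))


if __name__ == "__main__":
    print(run_on_local_cluster())
```

Replacing `register_block` with the actual registration call from an example notebook should make the same tasks visible in the dashboard while they run.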