saalfeldlab / n5-spark

Spark-driven processing utilities for N5 datasets.
BSD 2-Clause "Simplified" License

N5 connected components on Spark #18

Closed igorpisarev closed 4 years ago

igorpisarev commented 4 years ago

Adds distributed connected components utility for N5 datasets.

The algorithm is inspired by the synapse counting method and @davidackerman's connected components implementation for cosem. In the first pass, components are labeled within each block separately; touching components are then merged across block boundaries.
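For readers unfamiliar with the two-pass idea, here is a minimal single-process sketch in plain Python. It is not the code in this PR: the function names (`label_block`, `find`, `connected_components`), the 2D binary mask, the 4-connectivity, and the running label counter are all assumptions made for illustration. In a Spark job the first pass would presumably run per block in parallel, with some other scheme for keeping label IDs unique.

```python
from collections import deque
from itertools import product


def label_block(mask, y0, y1, x0, x1, labels, next_id):
    """Pass 1: flood-fill foreground pixels inside one block only (4-connectivity)."""
    for y, x in product(range(y0, y1), range(x0, x1)):
        if not mask[y][x] or labels[y][x]:
            continue
        labels[y][x] = next_id
        queue = deque([(y, x)])
        while queue:
            cy, cx = queue.popleft()
            for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                # Neighbors outside the current block are ignored in this pass.
                inside = y0 <= ny < y1 and x0 <= nx < x1
                if inside and mask[ny][nx] and not labels[ny][nx]:
                    labels[ny][nx] = next_id
                    queue.append((ny, nx))
        next_id += 1
    return next_id


def find(parent, a):
    """Union-find root lookup with path halving."""
    while parent[a] != a:
        parent[a] = parent[parent[a]]
        a = parent[a]
    return a


def connected_components(mask, block):
    """Label a 2D binary mask block by block, then merge labels across blocks."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_id = 1  # hypothetical scheme: one running counter keeps IDs unique
    for y0 in range(0, h, block):
        for x0 in range(0, w, block):
            next_id = label_block(
                mask, y0, min(y0 + block, h), x0, min(x0 + block, w), labels, next_id
            )

    # Pass 2: adjacent foreground pixels with different labels can only occur
    # across a block boundary, so union exactly those label pairs.
    parent = list(range(next_id))
    for y, x in product(range(h), range(w)):
        if not labels[y][x]:
            continue
        for ny, nx in ((y + 1, x), (y, x + 1)):
            if ny < h and nx < w and labels[ny][nx] and labels[ny][nx] != labels[y][x]:
                ra, rb = find(parent, labels[y][x]), find(parent, labels[ny][nx])
                if ra != rb:
                    parent[max(ra, rb)] = min(ra, rb)

    # Replace every label with its merged representative.
    return [[find(parent, v) if v else 0 for v in row] for row in labels]
```

Calling `connected_components(mask, block=2)` on a small nested list of 0/1 values returns a label image of the same shape, with 0 as background. Union-find keeps the merge step cheap because only label pairs, not pixels, are reconciled after the per-block pass.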

Startup scripts named n5-connected-components.py are available for both cluster and local use.

It can also be used together with paintera-conversion-helper to convert a raw dataset into a paintera-compatible multiscale label source.