seung-lab / connected-components-3d

Connected components on discrete and continuous multilabel 3D & 2D images. Handles 26, 18, and 6 connected variants; periodic boundaries (4, 8, & 6)
GNU Lesser General Public License v3.0

feat: parallel #73

Open william-silversmith opened 2 years ago

william-silversmith commented 2 years ago

Adds parallel flag.

Parallelizing the relabel pass was easy, but its performance impact is pretty marginal. The changes that need to be made to the core equivalence pass are kind of outrageous. It might be better to maintain two separate code paths, one single-threaded and one parallel, just so there's a sane version.

william-silversmith commented 2 years ago

Got the first pass not crashing. Seems slower at the moment. Will have to figure out why. Looks like this speeds the first pass up, but slows down relabeling. Probably has something to do with making the renumbering array bigger. That can probably be handled with offsets.
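One way to read "handled with offsets" is sketched below: each chunk numbers its provisional labels from 1, and an exclusive prefix sum over the per-chunk label counts gives the offset that moves a chunk's labels into a disjoint global range. This is only an illustration. The names `chunk_offsets` and `to_global` are hypothetical and are not taken from the PR.

```python
import numpy as np

def chunk_offsets(label_counts):
    """Exclusive prefix sum: the global offset for each chunk's local labels."""
    counts = np.asarray(label_counts, dtype=np.int64)
    return np.concatenate(([0], np.cumsum(counts)[:-1]))

def to_global(local_labels, offset):
    """Shift a chunk's local labels (1..n) into its global range; 0 stays background."""
    out = np.asarray(local_labels, dtype=np.int64).copy()
    out[out > 0] += offset
    return out

# Three chunks that produced 3, 2, and 4 provisional labels (made-up numbers).
counts = [3, 2, 4]
offsets = chunk_offsets(counts)
second_chunk = to_global(np.array([0, 1, 2, 1]), offsets[1])
```

With tight per-chunk counts, the shared table needs only `sum(counts) + 1` entries rather than a worst-case estimate per thread.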

william-silversmith commented 1 month ago

Managed to increase perf from 120 MVx/sec to ~280 MVx/sec with 8 cores. Something isn't right; it should scale much closer to 1:1.
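To put numbers on the gap, here is a small worked calculation. It assumes the 120 MVx/sec figure is the single-core baseline, which the comment does not state outright. It also assumes Amdahl's law is a fair model for this workload. Under those assumptions the speedup is about 2.3x, parallel efficiency is about 29%, and the implied serial fraction is about 35%.

```python
def speedup(base_rate, parallel_rate):
    """Ratio of parallel throughput to baseline throughput."""
    return parallel_rate / base_rate

def efficiency(s, cores):
    """Fraction of ideal linear scaling actually achieved."""
    return s / cores

def amdahl_serial_fraction(s, cores):
    """Solve S = 1 / (f + (1 - f) / N) for the serial fraction f."""
    return (1.0 / s - 1.0 / cores) / (1.0 - 1.0 / cores)

# Assumption: 120 MVx/sec is the single-core rate; 280 MVx/sec is with 8 cores.
s = speedup(120.0, 280.0)
e = efficiency(s, 8)
f = amdahl_serial_fraction(s, 8)
print(f"speedup={s:.2f}x efficiency={e:.0%} serial_fraction={f:.0%}")
```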

Some timing info from a 1GVx volume colored in with 8 copies of connectomics.npy.

Some things to observe:

  1. Allocate is big for parallel = 1 because the epl information was not used correctly (so disregard).
  2. Unify time climbs as parallel increases because more boundary slices need to be processed to stitch the regions together.
  3. The first pass is the most expensive stage, and for some reason its time is not declining linearly with the number of processors.
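The split, label, unify, and relabel structure that these observations refer to can be sketched in pure NumPy as follows. This is a toy illustration under several assumptions: binary input, 6-connectivity, slabs cut along z, and a sequential loop standing in for threads. It is not the code in this PR, and `label_chunk` and `chunked_label` are hypothetical names. It does show why unify work grows with parallel: there are `parallel - 1` boundary planes to stitch.

```python
import numpy as np
from collections import deque

NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))

def label_chunk(binary):
    """First pass on one chunk: 6-connected flood fill. Returns (labels, count)."""
    labels = np.zeros(binary.shape, dtype=np.int64)
    count = 0
    for start in zip(*np.nonzero(binary)):
        if labels[start]:
            continue
        count += 1
        labels[start] = count
        queue = deque([start])
        while queue:
            x, y, z = queue.popleft()
            for dx, dy, dz in NEIGHBORS:
                n = (x + dx, y + dy, z + dz)
                inside = all(0 <= n[i] < binary.shape[i] for i in range(3))
                if inside and binary[n] and not labels[n]:
                    labels[n] = count
                    queue.append(n)
    return labels, count

def find(parent, a):
    """Union-find root lookup with path halving."""
    while parent[a] != a:
        parent[a] = parent[parent[a]]
        a = parent[a]
    return a

def chunked_label(binary, parallel):
    """Label z-slabs independently, stitch across slab boundaries, then relabel."""
    assert 1 <= parallel <= binary.shape[2]
    bounds = np.linspace(0, binary.shape[2], parallel + 1).astype(int)
    out = np.zeros(binary.shape, dtype=np.int64)
    total = 0
    # First pass: each slab is independent, so each could run on its own thread.
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        labels, count = label_chunk(binary[:, :, lo:hi])
        labels[labels > 0] += total  # offset keeps label ranges disjoint
        out[:, :, lo:hi] = labels
        total += count
    # Unify: one boundary plane per cut, so parallel - 1 planes in total.
    parent = np.arange(total + 1)
    for z in bounds[1:-1]:
        a, b = out[:, :, z - 1], out[:, :, z]
        touching = (a > 0) & (b > 0)
        for la, lb in zip(a[touching], b[touching]):
            ra, rb = find(parent, la), find(parent, lb)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)
    # Relabel: map every provisional label to a compact final label.
    roots = np.array([find(parent, i) for i in range(total + 1)])
    _, compact = np.unique(roots, return_inverse=True)
    return compact[out]

rng = np.random.default_rng(0)
volume = rng.random((6, 6, 8)) > 0.6
serial = chunked_label(volume, 1)
stitched = chunked_label(volume, 4)
```

Both calls should find the same components, possibly numbered in a different order.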
[image: timing chart]