MannLabs / SPARCSpy


Accelerate segmentation postprocessing with sparse matrices #31

Open tatyana-perlova opened 4 months ago

tatyana-perlova commented 4 months ago

Using sparse instead of dense matrices for boolean indexing in the segmentation workflows should make them much faster, at least when the majority of image pixels are background rather than cells. E.g., in workflows.py, instead of

for nucleus_id in all_nucleus_ids:
    nucleus = (masks_nucleus == nucleus_id)

use

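from scipy.sparse import csr_matrix

# Convert once, outside the loop; only non-zero (non-background) pixels are stored.
masks_nucleus_sparse = csr_matrix(masks_nucleus)
for nucleus_id in all_nucleus_ids:
    # For a non-zero id the result is a sparse boolean matrix, not a dense array.
    nucleus = (masks_nucleus_sparse == nucleus_id)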

The same applies to other cases of boolean indexing with segmentation masks.
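To make the suggestion concrete, here is a minimal sketch on a toy label image. The array contents and the `pixel_counts` dictionary are made up for illustration and are not taken from workflows.py; the sketch only assumes that background is labelled 0 and that nucleus ids are positive integers.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical toy label image: 0 is background, positive integers are nucleus ids.
masks_nucleus = np.zeros((6, 8), dtype=np.int32)
masks_nucleus[1:3, 1:3] = 1
masks_nucleus[3:5, 4:7] = 2

masks_nucleus_sparse = csr_matrix(masks_nucleus)

# .data holds only the stored (non-zero) values, so background is skipped.
all_nucleus_ids = np.unique(masks_nucleus_sparse.data)

pixel_counts = {}
for nucleus_id in all_nucleus_ids:
    # Comparing with a non-zero scalar returns a sparse boolean matrix.
    nucleus = (masks_nucleus_sparse == nucleus_id)
    # Pixel coordinates of this nucleus, without building a dense mask.
    rows, cols = nucleus.nonzero()
    pixel_counts[int(nucleus_id)] = len(rows)
```

Note that comparing the sparse matrix with 0 is the inefficient case (SciPy emits a `SparseEfficiencyWarning`), so the background id should be left out of the loop. If a later step needs a dense mask, `nucleus.toarray()` converts it back.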

sophiamaedler commented 3 months ago

Hi :) This sounds like a great suggestion. Unfortunately, I have not had the time to test or benchmark it yet. I'd imagine there are a couple of places in the code where we would need to implement sparse matrix support, but it should not be too complicated. It could make sense to use sparse matrices only for the filtering step and save the final results back to a dense matrix, to ensure downstream compatibility with all other processing steps.
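A rough sketch of what "sparse only for the filtering step, dense afterwards" could look like is below. The function `filter_small_labels`, the `min_pixels` parameter, and the size-based criterion are all hypothetical stand-ins and not the actual SPARCSpy filtering logic; the sketch assumes a 2-D integer label image with background 0.

```python
import numpy as np
from scipy.sparse import csr_matrix


def filter_small_labels(masks, min_pixels):
    """Hypothetical filter: drop labels with fewer than min_pixels pixels.

    Works on a sparse copy internally and returns a dense array, so
    downstream steps see the same type they received before.
    """
    sparse = csr_matrix(masks)
    # Only non-background label values are stored in .data.
    labels = sparse.data.astype(np.int64)
    sizes = np.bincount(labels)          # pixel count per label id
    keep = sizes[labels] >= min_pixels   # per-pixel keep/drop decision
    sparse.data[~keep] = 0               # zero out pixels of dropped labels
    sparse.eliminate_zeros()
    return sparse.toarray()              # back to dense for compatibility


# Toy example with made-up values: label 3 covers a single pixel.
masks = np.zeros((5, 6), dtype=np.int32)
masks[0:2, 0:2] = 1
masks[2:4, 3:6] = 2
masks[4, 0] = 3
filtered = filter_small_labels(masks, min_pixels=4)
```

The input array is left unchanged, and the returned array has the same shape and dtype as the input, which is what keeps the other processing steps untouched.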

In case you'd like to make a PR, we'd love to have you contribute :) Feel free to reach out if there is anything specific you need help with.