unidesigner closed this issue 5 years ago
Hi Stephan,
The max_labels argument is a memory reduction hack that's not guaranteed to work well. Typically, I try to estimate it (for connectomics data) as a "representative volume" number of labels plus a large safety factor. In Kimimaro, right or wrong, it is set to 1/4th the number of voxels in the volume.
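To make the 1/4th rule concrete, here is a minimal sketch of that heuristic. The helper name `estimate_max_labels` and its `divisor` parameter are hypothetical and are not part of cc3d or Kimimaro. It only reproduces the arithmetic described above.

```python
from math import prod


def estimate_max_labels(shape, divisor=4):
    """Hypothetical helper: cap the label budget at 1/divisor of the voxel count."""
    voxels = prod(shape)  # total number of voxels in the volume
    # Never return less than 1, so that tiny volumes still get a usable budget.
    return max(1, voxels // divisor)


print(estimate_max_labels((512, 512, 512)))  # 33554432
```

The result would then be passed as the `max_labels` argument. If the estimate turns out to be too low for a given volume, the only remedy I know of is a larger safety factor.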
I might have a better way to reduce memory usage. I'm currently experimenting with removing the "union by size" feature from union-find. In some experiments, it reduces memory usage by half and improves performance ~10% on a set of connectomics labels and on random arrays. However, it doesn't make a ton of sense to me that it's faster.
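For reference, a rough sketch of what a union-find without "union by size" can look like. This is an illustration in plain Python, not the code in the library. The class name and the tie-breaking rule (attach the larger root id under the smaller one) are assumptions made for the example. The point is that union by size normally needs a second per-element array, while this variant stores only the parent array.

```python
class DisjointSet:
    """Union-find with path compression only (no size or rank array)."""

    def __init__(self, n):
        self.parent = list(range(n))  # the only per-element storage

    def find(self, x):
        # First locate the root of x's tree.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Then point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # Assumed rule for this sketch: the smaller root id wins,
        # so no size bookkeeping is needed to pick a direction.
        if ra < rb:
            self.parent[rb] = ra
        else:
            self.parent[ra] = rb


ds = DisjointSet(6)
ds.union(0, 1)
ds.union(4, 5)
ds.union(1, 5)
print(ds.find(5))  # 0
```

Without union by size the trees can in principle get deeper, so any speedup has to come from somewhere else, which is why a measured improvement is surprising.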
If I can find a theoretical justification for it I'd be happy to release that, as it's less code and more performant.
I added issue #16 for memory reduction discussion.
@unidesigner Check out this PR and let me know what you think. https://github.com/seung-lab/connected-components-3d/pull/17
@unidesigner I released v1.2.2, which can cut memory consumption by more than 2x relative to the previous version and is also ~40% faster.
@william-silversmith Fantastic news - thanks for getting back to this so quickly! I will test it tomorrow and report back asap.
It works for me too, is much faster, and uses less memory! The number of regions it finds is still the same, so that is one more data point confirming that things still work. I am closing this issue, as I don't think there is much else we can do about the max_labels option at the moment.
I'm a bit confused by the max_labels argument. If I run a connected_components call without the argument and then compute np.max(labels_out), I should get the number of components, at least since the recent 1.2.0 release. However, if I now use this number plus some margin to set max_labels, the procedure fails with an exception:
It seems that internally, the union-find algorithm requires a higher number, but it is not clear to me how to estimate this number.
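One possible explanation, sketched under the assumption that labeling follows the classic two-pass scheme: the first pass hands out provisional labels, and several of them can later be merged into one component. If max_labels bounds the provisional labels rather than the final ones (an assumption, not something confirmed from the library's source), the final component count would be too small as a limit. The function below is a plain-Python 2D illustration with 4-connectivity, not the library's implementation.

```python
def count_labels(grid):
    """Return (provisional, final) label counts for a binary 2D grid."""
    parent = [0]  # index 0 is reserved for background
    labels = [[0] * len(row) for row in grid]

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for y, row in enumerate(grid):
        for x, value in enumerate(row):
            if not value:
                continue
            up = labels[y - 1][x] if y > 0 else 0
            left = labels[y][x - 1] if x > 0 else 0
            if not up and not left:
                # No labeled neighbor yet: allocate a new provisional label.
                parent.append(len(parent))
                labels[y][x] = len(parent) - 1
            else:
                labels[y][x] = up or left
                if up and left:
                    # Two provisional labels meet: record their equivalence.
                    ra, rb = find(up), find(left)
                    parent[max(ra, rb)] = min(ra, rb)

    provisional = len(parent) - 1
    final = len({find(label) for label in range(1, len(parent))})
    return provisional, final


u_shape = [
    [1, 0, 1],
    [1, 0, 1],
    [1, 1, 1],
]
print(count_labels(u_shape))  # (2, 1)
```

Here a single U-shaped component needs two provisional labels, because the two arms are only discovered to be connected on the last row. Convoluted shapes in real volumes could push the gap much higher.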
It would be great to find a way to reduce the peak memory footprint of this very nice package. :)