seung-lab / connected-components-3d

Connected components on discrete and continuous multilabel 3D & 2D images. Handles 26, 18, and 6 connected variants; periodic boundaries (4, 8, & 6)
GNU Lesser General Public License v3.0

Statistics output #114

Closed ckolluru closed 9 months ago

ckolluru commented 9 months ago

In this example, I expect just one centroid near [149, 149, 149]. Could you let me know where the [64, 64, 64] comes from? Also, why are the pixel indices (~149) not the same in all three dimensions with this input?

import numpy as np
import cc3d

labels_in = np.zeros((512, 512, 512), dtype=np.int32) 
labels_in[100:200, 100:200, 100:200] = 1
labels_out, N = cc3d.largest_k(labels_in, k = 1, connectivity = 26, delta=0, return_N = True)

a = cc3d.statistics(labels_out)
a['centroids']

This is what I get:

array([[ 64.480415,  64.480415,  64.480415],
       [149.26369 , 149.49011 , 149.0991  ]], dtype=float32)

Thank you for sharing this software.

william-silversmith commented 9 months ago

Hi!

Thanks for writing in.

The first centroid is for label 0 and the second is for label 1.
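To make the row-to-label correspondence concrete, here is a minimal NumPy-only sketch on a small volume. It is an illustration, not cc3d's implementation: `centroids_by_label` is a hypothetical helper, it accumulates in float64, and it assumes every label from 0 to the maximum is present. Row 0 is the background and row 1 is the cube.

```python
import numpy as np

def centroids_by_label(labels):
    """Hypothetical helper: row i of the result is the centroid of label i."""
    flat = labels.ravel()
    counts = np.bincount(flat)  # number of voxels per label
    # One flattened coordinate array per axis, in the same order as `flat`.
    coords = np.indices(labels.shape).reshape(labels.ndim, -1)
    # np.bincount with weights accumulates the coordinate sums in float64.
    sums = np.stack(
        [np.bincount(flat, weights=c, minlength=counts.size) for c in coords],
        axis=1,
    )
    return sums / counts[:, None]

labels = np.zeros((32, 32, 32), dtype=np.int32)
labels[10:20, 10:20, 10:20] = 1
centroids = centroids_by_label(labels)
print(centroids[0])  # background (label 0)
print(centroids[1])  # cube (label 1): mean of indices 10..19 on each axis
```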

I took a look at your example, and it seems the reason is a loss of precision in float32 as it's summing across a large area. When I upgrade from float to double, the numbers come out right. Float32 can represent a large range, but it only represents integers exactly up to about 10^7 (2^24 = 16,777,216). The sum of the coordinates along one axis of a 512^3 volume is far larger than that:

512 * 511 / 2 * 512 * 512 ≈ 34e9
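The arithmetic can be checked directly. The short sketch below (plain NumPy, with variable names chosen for illustration) computes the coordinate sum for one axis and shows how coarse float32 is at that magnitude: once the accumulator is this large, adding a single coordinate no longer changes it.

```python
import numpy as np

n = 512
# Each index 0..n-1 occurs n*n times along one axis of an n^3 volume.
coord_sum = (n * (n - 1) // 2) * n * n
print(coord_sum)  # 34292629504

# Above 2**24, float32 can no longer represent every integer.
limit = np.float32(2**24)
print(limit + np.float32(1) == limit)  # True

# Near the full coordinate sum, adjacent float32 values are 2048 apart,
# so adding any single coordinate (at most 511) changes nothing.
total = np.float32(coord_sum)
print(np.spacing(total))  # 2048.0
print(total + np.float32(n - 1) == total)  # True
```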

I'll see if I can find a nice fix for this.

william-silversmith commented 9 months ago

It looks like this used to be float64 but there was a problem with memory consumption.

https://github.com/seung-lab/connected-components-3d/commit/13d7c4942d45ea612822649abc9ccda302c8799b

Hmm....

ckolluru commented 9 months ago

Thanks for looking into this.

If I change the test image to be ones only between [10:20, 10:20, 10:20] and zeros elsewhere the background centroid is still around [64, 64, 64]. That shouldn’t be the case right?

william-silversmith commented 9 months ago

That's also a loss of precision, and changing the foreground doesn't affect it much because the vast majority of voxels are background (black) in both scenarios. When I change it to double, it gives something like the right result (256.29568989, 256.29568989, 256.29568989), which is still a bit off...

I have to figure out some way to prevent the sum from blowing up so high without sacrificing accuracy and memory efficiency.
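One general-purpose technique for this kind of problem is compensated (Kahan) summation, sketched below. This is only an illustration of the accuracy side of the trade-off: the thread does not say which approach the library adopted, the function names are hypothetical, and because Kahan summation keeps a second float32 per accumulator it does not by itself use less memory than a single double.

```python
import numpy as np

def naive_sum_f32(values, start=0.0):
    """Plain running sum held in a single float32."""
    total = np.float32(start)
    for v in values:
        total = total + np.float32(v)
    return total

def kahan_sum_f32(values, start=0.0):
    """Kahan summation: a second float32 carries the rounding error forward."""
    total = np.float32(start)
    comp = np.float32(0.0)  # running estimate of the lost low-order part
    for v in values:
        y = np.float32(v) - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total

ones = [1.0] * 1000
start = 2.0**24  # the point where float32 stops resolving +1
print(naive_sum_f32(ones, start))  # stays at 2**24
print(kahan_sum_f32(ones, start))  # reaches 2**24 + 1000
```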

william-silversmith commented 9 months ago

It was kind of tricky to figure out a method for doing this that used less memory than simply bumping to double, so I'll release a new version with that once the build completes. The precision for double is actually pretty good; I forgot to account for the large "hole" in the data before.
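The effect of the "hole" can be checked by hand. The sketch below uses exact Python integers (`background_centroid` is a hypothetical helper, not part of cc3d) and subtracts the cube's contribution from the sums over the whole volume. For the original 100:200 cube it reproduces the 256.2957 obtained with double earlier in the thread, which is consistent with that result being correct once the missing cube is accounted for.

```python
def background_centroid(n, lo, hi):
    """Centroid coordinate (identical on every axis) of an n^3 volume
    with the cube [lo:hi]^3 removed."""
    side = hi - lo
    total_count = n**3
    total_sum = (n * (n - 1) // 2) * n * n       # one coordinate summed over the volume
    hole_count = side**3
    hole_sum = sum(range(lo, hi)) * side * side  # the same sum restricted to the cube
    return (total_sum - hole_sum) / (total_count - hole_count)

print(background_centroid(512, 100, 200))  # about 256.2957
print(background_centroid(512, 10, 20))    # about 255.5018
```

A volume with no hole would give exactly 255.5, so removing a cube whose centroid lies below the middle pulls the background centroid slightly upward.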

I'll let you know when the new version is released.

william-silversmith commented 9 months ago

Okay, the latest version is released and should solve the problem.

ckolluru commented 9 months ago

Thanks @william-silversmith!