MoritzWillig / pysnic

Python implementation of the SNIC superpixels algorithm
MIT License

How to work with images with more than 3 bands #3

Open purijs opened 3 years ago

purijs commented 3 years ago

I have images/arrays with multiple bands. Can the algorithm be modified to work with multiple bands, or is there an alternate way?

MoritzWillig commented 3 years ago

When looking into the superPixels.py example, you can find a call to snic with all parameters passed explicitly. https://github.com/MoritzWillig/pysnic/blob/f4239e82ee4d37ec2a90847c5295b0c993bab7ef/pysnic/examples/superPixels.py#L32

The lerp attribute of the nd_computations parameter is responsible for computing the average "color" of the centroids. You can simply pass nd_computations["nd"] instead of nd_computations["3"].

The image_distance parameter is responsible for computing the distance between a candidate pixel and the centroids. It uses the (squared) distance metric described in the paper and already works with nd-data. See create_augmented_snic_distance. https://github.com/MoritzWillig/pysnic/blob/f4239e82ee4d37ec2a90847c5295b0c993bab7ef/pysnic/metric/snic.py#L24

For both functions you can, of course, pass custom lerp and distance functions if you need to adjust the weighting of the different channels or image distances.
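
Putting this together, a multi-band call might look like the following. This is a minimal sketch based on the superPixels.py example; the 10-band random image, the segment count, and the compactness value are placeholder assumptions, and the import paths follow the example file:

import numpy as np
from pysnic.algorithms.snic import snic
from pysnic.ndim.operations_collections import nd_computations
from pysnic.metric.snic import create_augmented_snic_distance

# placeholder 10-band image; any height x width x bands data works
multiband_image = np.random.rand(100, 120, 10)
image_size = multiband_image.shape[:2]
number_of_segments = 100
compactness = 0.01

# squared SNIC distance from the paper; already handles n-d colors
distance_metric = create_augmented_snic_distance(image_size, number_of_segments, compactness)

segmentation, distances, num_segments = snic(
    multiband_image.tolist(),      # plain nested lists, not a numpy array
    number_of_segments,
    compactness,
    nd_computations["nd"],         # n-dimensional lerp instead of the 3d one
    distance_metric)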

purijs commented 3 years ago

Okay, understood.

My question was regarding this line: if I have an image with 10 bands, this function will fail. I cannot convert a multi-band image to the CIELAB color space, because the conversion only works on 3 bands.

skimage.color.rgb2lab(color_image).tolist()

What's my alternative here?

MoritzWillig commented 3 years ago

The transformation into the CIELAB color space is done in order to perform the super-pixelation in a perceptually uniform space. The Euclidean distance in CIELAB roughly corresponds to the perceived color difference and should therefore yield better results than passing the plain RGB image.

If you have multiple bands of data, you have to find a metric that correctly represents the distance between two pixels. This can either be done by transforming the input (e.g. transforming RGB to CIELAB, normalizing the individual bands, ...) and using the Euclidean distance, or by writing a custom metric that handles the different kinds of bands accordingly (e.g. one that measures the angle between vectors if a band contains rotational data).
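
For the first option, a minimal normalization sketch (the helper name is illustrative, not part of pysnic):

import numpy as np

def normalize_bands(image):
    # scale each band to [0..1] so the Euclidean distance weights all bands equally;
    # `image` is a height x width x bands array
    image = np.asarray(image, dtype=np.float64)
    mins = image.min(axis=(0, 1), keepdims=True)
    maxs = image.max(axis=(0, 1), keepdims=True)
    return (image - mins) / np.maximum(maxs - mins, 1e-12)

The normalized result can then be passed to snic via normalize_bands(image).tolist().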

If you do not need to transform the data any further, you can just leave out the rgb2lab transform. Just make sure that you pass in the data as a plain Python list. The algorithm performs large amounts of single-pixel reads, which are much slower when using numpy arrays.

purijs commented 3 years ago

Thanks for this explanation. As mentioned in the paper, K is the number of centroids. In your script, is numSegments doing the same thing, with the number of seeds calculated according to the shape of the image? The total number of segments comes out a bit higher than what is given as input to numSegments.

MoritzWillig commented 3 years ago

Yeah, right. If you just pass an int for the seeds parameter, it tries to compute an equidistant grid of seed positions. The resulting number of super-pixels can turn out a bit higher or lower than the given number. If you need more control over the seeds, you can also pass a list of positions into the seeds parameter, which then guarantees numSegments := len(seeds).
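
A minimal sketch of the explicit variant (the coordinates are made up, and the remaining arguments are as in the example above):

# explicit seed positions; the resulting number of super-pixels equals len(seeds)
seeds = [[10, 10], [50, 10], [10, 50], [50, 50]]
segmentation, distances, num_segments = snic(
    image, seeds, compactness, nd_computations["nd"], distance_metric)
assert num_segments == len(seeds)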

purijs commented 3 years ago

Okay, understood! But this doesn't work well on TIFs; it works fine on PNG/JPG. I feel it is not taking a 100% stretch, or maybe not using the values of the multiple bands, and hence the segmentation is not happening properly as on PNGs. Is there some pre-processing to be done for TIFs? I'm downloading TIFs from Earth Engine and passing the RGB bands here into SNIC:

segmentation, distances, numSegments = snic(
    color_image.tolist(),
    seeds,
    compactness, nd_computations["nd"], distance_metric,
    update_func=lambda num_pixels: print("processed %05.2f%%" % (num_pixels * 100 / number_of_pixels)))
MoritzWillig commented 3 years ago

The image format should not matter, as the function expects a 3d array of shape height * width * channels. You may want to make sure that all data is presented as float or int.

However, I quickly looked at the channel ranges in the demos and found the value range to be [0..100, -128..128, -128..128]. From the demo image I observed [0..92, -31..80, -58..63]. With an image size of 400x600 and a compactness of 10.0, you get a distance of 0.02 * pixel_distance + 0.1 * color_distance.

When normalizing the channels to [0..1], the color component does not have enough weight and the spatial distance term takes over. With normalized values you have to adjust the compactness; I think a compactness between 0.01 and 0.001 should work fine. Thanks for pointing out that problem, I will add it to the documentation.
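
To see how the two terms trade off, the weighting factors can be computed directly. This is a sketch assuming the paper's seed spacing s = sqrt(N/K) and m = compactness; it reproduces the 0.02 and 0.1 mentioned above:

import math

num_pixels = 400 * 600
num_seeds = 100          # assumed seed count for the demo image
compactness = 10.0

s = math.sqrt(num_pixels / num_seeds)  # expected seed spacing, ~49 px here
spatial_weight = 1 / s                 # ~0.02 on the pixel-distance term
color_weight = 1 / compactness         # 0.1 on the color-distance term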

purijs commented 3 years ago

Hmm, I'm still confused. What do I have to change at my end? The compactness?

MoritzWillig commented 3 years ago

Yeah, you have to adjust the compactness term. Internally, the default metric is composed of the distance between the centroid and a candidate pixel, and the difference in color between the centroid and the candidate pixel.

If you have a look at snic_distance_mod, you can see that both the image distance and the color difference terms are weighted by the factors si and mi. Maybe it is better for your case to adjust these factors directly, instead of using create_augmented_snic_distance.

If we only used the color term, a single cell could end up covering the whole image. To prevent this from happening, we want to penalize the inclusion of pixels that are too far away from the centroid. You have to try out (or calculate) some compactness values, so that after a certain distance (maybe 1-3 times the initial seed distance) the distance penalization takes over.
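
A minimal sketch of such a directly weighted metric follows; the si/mi values and the argument order are assumptions on my part, so check the actual signature of snic_distance_mod in pysnic/metric/snic.py before using it:

si = 0.02   # assumed weight on the squared spatial distance
mi = 100.0  # assumed weight on the squared color distance

def weighted_snic_distance(position_a, color_a, position_b, color_b):
    # squared Euclidean distances, weighted per term
    spatial = sum((a - b) ** 2 for a, b in zip(position_a, position_b))
    color = sum((a - b) ** 2 for a, b in zip(color_a, color_b))
    return si * spatial + mi * color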

purijs commented 3 years ago

So I was able to run something recently with a 14-band image, numSegments = 300 and compactness = 0.01.

However, SNIC doesn't seem to run until 100%:

processed 98.28%
processed 98.81%
processed 99.34%
processed 99.87%

What could be the reason? Also, in my final image, I'm not getting many closed polygons. What could be a possible fix here? This doesn't happen for a simple RGB PNG file.

MoritzWillig commented 3 years ago

The "processed 100%" line missing, is a bug with logging. I just created an issue for that - #4 . If the function returns, all pixels got processed.

However, I'm not quite sure about the "many closed polygons" part. Every given seed results in a super-pixel of at least 1 pixel extent. The super-pixels should always be a single compact area in the image. If you see some large super-pixels and many single-pixel super-pixels, this may be an indication that the distance penalization is not strong enough; maybe a higher compactness is needed (0.1, 1.0?). To see if there are single-pixel super-pixels, it may be better to plot the resulting segmentation directly using one of the qualitative color maps of matplotlib.

Edit: The 'prism'-cmap worked quite well for visualization.

import matplotlib.pyplot as plt

plt.figure("segmentations")
plt.imshow(segmentation, cmap="prism")
plt.show()
purijs commented 3 years ago

Ignoring the color space conversion, is it possible to just implement a spectral Euclidean distance in your script?

MoritzWillig commented 3 years ago

As said before, the algorithm is sensitive to the magnitudes of the input data, so the spectral Euclidean distance would be the Euclidean distance with the correct weighting applied to the different bands. If you could provide a sample of your data as a *.npy file, I can try to work out the needed parameters for your setting or adjust the code accordingly.

purijs commented 3 years ago

Sure. However, while I was checking the seed inputs, I noticed that whether I give an int seed or an iterable, compute_grid is called in both cases: in one case it is called inside the snic function, in the other case I'm extracting the seeds from the compute_grid function myself. Just trying to understand, how do these differ?

MoritzWillig commented 3 years ago

If you look at the shape of the seeds, you see that the grids do not differ. In the example I just wanted to show that the snic function also accepts a list of positions. The other lines just flatten out the array, as compute_grid creates a w*h*2 array, while we need an n*2 array afterwards.
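
A minimal sketch of that flattening (the import path and image size are assumptions based on the example):

from itertools import chain
from pysnic.algorithms.snic import compute_grid

image_size = [400, 600]                   # assumed [height, width]
grid = compute_grid(image_size, 300)      # rows x cols x 2 grid of seed positions
seeds = list(chain.from_iterable(grid))   # flatten to an n x 2 list of positions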

purijs commented 3 years ago

In your recent description, you mentioned that performance can be optimized by using a raw array before passing it to SNIC. However, we're passing image.tolist(), and apparently the array module doesn't take sub-lists as in our case.

What's the best way to get around it? The intent is to optimize the performance; is there a possible way to implement this using Dask or Spark?

MoritzWillig commented 3 years ago

In your recent description, you mentioned that performance can be optimized by using a raw array before passing it to SNIC. However, we're passing image.tolist(), and apparently the array module doesn't take sub-lists as in our case. What's the best way to get around it?

Oh, that's bad phrasing from my side: with "raw" array I meant plain Python lists (x = [1, 2, 3, 4]) and not the array module, in contrast to other containers like numpy.ndarray or PIL.Image. So passing the array obtained by image.tolist() is the recommended way.

The intent is to optimize the performance; is there a possible way to implement this using Dask or Spark?

I guess that using a C++ implementation of the algorithm and passing in the data from Python would already give a significant speedup. The standard SNIC algorithm itself uses a priority queue, which makes it hard to parallelize. Dropping the global priority queue and splitting it into multiple local ones should speed up the algorithm, but this may result in a degradation of the segmentation quality. Either way, one needs to find a way to efficiently resolve contested pixels between the local queues.

purijs commented 3 years ago

Is there a way I can convert this into a GPU implementation? Which functions could possibly be converted to use CUDA under the hood for quicker inference?

MoritzWillig commented 3 years ago

In my opinion, the primary problems with parallelizing the standard SNIC algorithm are the centroid update after each queue pop and the global queue in general. SLIC takes the opposite approach, where none of the associations are interlocked and the pixel labels are only assigned at the end of an iteration.

1.) We may want to make the assumption that we only need the priority queue to resolve pixels near segment borders. So it might be an interesting approach to first apply one or two iterations of SLIC and only afterwards use a priority queue to refine the segment borders.

2.) Assuming that far-apart centroids will not compete over the same pixels, we can split up the global priority queue and use a per-kernel priority queue. We would then only need to resolve conflicts between neighbouring queues.

In case you are also using the polygonization: The segmentation to graph conversion can be done with a simple reduction.
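
A minimal sketch of that reduction (the helper name is illustrative; it collects label adjacencies from 4-neighbouring pixels):

def segmentation_to_graph(segmentation):
    # `segmentation` is a height x width list of integer labels;
    # returns the set of edges between adjacent super-pixels
    edges = set()
    height, width = len(segmentation), len(segmentation[0])
    for y in range(height):
        for x in range(width):
            label = segmentation[y][x]
            if x + 1 < width and segmentation[y][x + 1] != label:
                edges.add(frozenset((label, segmentation[y][x + 1])))
            if y + 1 < height and segmentation[y + 1][x] != label:
                edges.add(frozenset((label, segmentation[y + 1][x])))
    return edges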

I also found a new paper, "NICE: Superpixel Segmentation Using Non-Iterative Clustering with Efficiency" by Cheng Li et al., which builds on SNIC and seems to reduce the runtime. However, I haven't looked into the optimizations used or the quality of the results yet.