MoritzWillig / pysnic

Python implementation of the SNIC superpixels algorithm
MIT License
54 stars, 5 forks

How to process a very large image matrix, e.g. 100000x100000x3 #9

Open Atumyugi opened 2 years ago

Atumyugi commented 2 years ago

I am trying to process a TIFF image for GEE, but it does not work. If I want to modify the source code to use multiple processes, which parts need to be modified?

MoritzWillig commented 2 years ago

One can assume that two sufficiently distant superpixels do not affect each other (depending on the value range of the image, the distance term of the metric will always dominate beyond some point). So I think a reasonable approach is to split the image into multiple patches and process them independently. Afterwards, you would need to handle the superpixels at the patch borders. Depending on your application, it may be okay to just leave them as they are.
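The patch split described above could be sketched roughly as follows. This is only an illustration: `iter_patches`, `Patch`, and the patch size are hypothetical names and values, not part of pysnic, and the call that runs SNIC on each patch is deliberately left out.

```python
from typing import Iterator, Tuple

# (row_start, row_stop, col_start, col_stop), half-open like Python slices.
Patch = Tuple[int, int, int, int]


def iter_patches(height: int, width: int, patch_size: int) -> Iterator[Patch]:
    """Yield patch bounds that tile a height x width image without overlap.

    Hypothetical helper, not part of pysnic. Patches on the bottom and
    right edges are smaller when the size is not a multiple of patch_size.
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    for row_start in range(0, height, patch_size):
        row_stop = min(row_start + patch_size, height)
        for col_start in range(0, width, patch_size):
            col_stop = min(col_start + patch_size, width)
            yield row_start, row_stop, col_start, col_stop


# Example: a 100000 x 100000 image cut into 10000 x 10000 patches.
patches = list(iter_patches(100_000, 100_000, 10_000))
print(len(patches))  # 10 rows of patches x 10 columns of patches = 100
```

Each tuple can then be used to read only that window of the TIFF and hand it to a separate worker process, so the full image never has to be held in memory at once.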

Otherwise, you would need to remove all superpixels that touch the patch borders and reapply the algorithm:
1.) Select a border between two patches.
2.) snic.py, line 95: Initialize the label_map with the existing superpixel labels, but set the labels of all superpixels that touch the border to "-1".
3.) snic.py, line 129: Start the new seed indices at NUM_SEEDS_ALREADY_USED+k instead of k.
4.) Make sure to place the starting seeds within the cleared area.
After processing the borders, you want to repeat these steps for all corners where four patches meet.
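Steps 2 and 3 could look roughly like the sketch below. It does not touch pysnic itself. It assumes the per-patch results were already merged into one label map with globally unique labels, stores that map as nested lists for simplicity, and handles only one vertical border. The function names and `border_col` are hypothetical.

```python
from typing import List, Set

UNLABELED = -1  # marker for pixels that still need to be assigned


def clear_border_superpixels(label_map: List[List[int]], border_col: int) -> Set[int]:
    """Reset every superpixel touching the border left of column border_col.

    Hypothetical helper: modifies label_map in place and returns the set
    of labels that were cleared.
    """
    # Collect labels found in the two pixel columns adjacent to the border.
    touching = set()
    for row in label_map:
        touching.add(row[border_col - 1])
        touching.add(row[border_col])
    touching.discard(UNLABELED)

    # Clear every pixel of those superpixels, not only the border pixels.
    for row in label_map:
        for x, label in enumerate(row):
            if label in touching:
                row[x] = UNLABELED
    return touching


def new_seed_labels(num_seeds_already_used: int, num_new_seeds: int) -> List[int]:
    """Labels for the re-run: NUM_SEEDS_ALREADY_USED + k instead of k."""
    return [num_seeds_already_used + k for k in range(num_new_seeds)]


# Two 2x3 patches side by side; the border lies between columns 2 and 3.
label_map = [
    [0, 0, 1, 2, 3, 3],
    [0, 1, 1, 2, 2, 3],
]
cleared = clear_border_superpixels(label_map, border_col=3)
print(label_map)              # [[0, 0, -1, -1, 3, 3], [0, -1, -1, -1, -1, 3]]
print(sorted(cleared))        # [1, 2]
print(new_seed_labels(4, 2))  # [4, 5]
```

The new seeds would then be placed only on pixels that are still `-1`, as step 4 requires.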

Due to its sequential nature (it uses a priority queue internally), SNIC is a bit tricky to adapt to parallel processing, and the things I described are just a quick workaround. Depending on your application, you may want to have a look at other superpixel algorithms such as SLIC, which should be easier to parallelize. If you are planning to process large amounts of data, you may also want to consider using the original C++ code, which is significantly faster (https://www.epfl.ch/labs/ivrl/research/snic-superpixels/).

Atumyugi commented 2 years ago

Thank you very much for your reply.