Might be worth trying DBSCAN or OPTICS for the clustering approach. This could be an alternative to the segmentation-based approach we currently use for finding sites.
One thing that might be tricky is how to deal with the boundary conditions.
I had a look at this today and found several bottlenecks:
Both are distance-based, so we need to define the maximum distance between two samples for them to be considered in each other's neighbourhood. This needs a function that calculates the correct distance between fractional coords.
Slow: scales as O(n^2), already taking several minutes on my PC for <=2000 samples.
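For the distance function mentioned above, here is a minimal sketch of what it could look like. It assumes the lattice is given as a 3x3 nested list with the cell vectors as rows, and it applies the minimum-image wrap in fractional space, which is only exact for cells that are not strongly skewed. The name `periodic_distance` is made up for illustration.

```python
import math


def periodic_distance(frac_a, frac_b, lattice):
    """Distance between two fractional coordinates under periodic boundaries.

    `lattice` is assumed to be a 3x3 nested list with the cell vectors as rows.
    """
    # Difference in fractional space, wrapped into [-0.5, 0.5] per axis
    # so that the nearest periodic image is used.
    delta = [a - b for a, b in zip(frac_a, frac_b)]
    delta = [d - round(d) for d in delta]
    # Convert the wrapped fractional difference to Cartesian (delta @ lattice).
    cart = [sum(delta[i] * lattice[i][j] for i in range(3)) for j in range(3)]
    return math.sqrt(sum(c * c for c in cart))


# Two points on opposite faces of a 10 x 10 x 10 cubic cell.
cubic = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
d = periodic_distance((0.05, 0.0, 0.0), (0.95, 0.0, 0.0), cubic)
```

Without the wrap these two points would be 9 length units apart; with it they come out as close neighbours across the cell boundary.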
Opportunities:
Distance-based, so we can supply a metric function that takes the boundary conditions into account.
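A rough sketch of how this could be wired up with scikit-learn, assuming fractional coords in an (n, 3) array and a 3x3 lattice with the cell vectors as rows. Instead of passing a Python callable as the metric (which is called once per pair and is likely slow), this version builds the periodic distance matrix with NumPy and passes it via `metric="precomputed"`. The helper name, the test points, and the `eps` / `min_samples` values are placeholders, not tuned settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN


def periodic_distance_matrix(frac_coords, lattice):
    """Pairwise distances between fractional coords under periodic boundaries.

    Assumes `lattice` rows are the cell vectors; the wrap is done in
    fractional space, which is only exact for cells that are not strongly skewed.
    """
    # (n, n, 3) array of pairwise fractional differences.
    delta = frac_coords[:, None, :] - frac_coords[None, :, :]
    # Minimum-image wrap into [-0.5, 0.5] along each axis.
    delta -= np.round(delta)
    # Convert to Cartesian and take the norm.
    cart = delta @ lattice
    return np.linalg.norm(cart, axis=-1)


lattice = np.eye(3) * 10.0
frac_coords = np.array([
    [0.02, 0.50, 0.50],  # these three straddle the cell boundary along x
    [0.98, 0.50, 0.50],
    [0.99, 0.51, 0.50],
    [0.50, 0.50, 0.50],  # these three sit in the middle of the cell
    [0.51, 0.50, 0.50],
    [0.50, 0.51, 0.50],
])

dist = periodic_distance_matrix(frac_coords, lattice)
labels = DBSCAN(eps=0.5, min_samples=2, metric="precomputed").fit_predict(dist)
```

The precomputed matrix still needs O(n^2) memory, so this does not remove the scaling problem, but it should at least avoid the per-pair Python overhead. `OPTICS` also accepts `metric="precomputed"`, so the same matrix could be reused there.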