panosz / alpha_shapes

A Python library for working with alpha shapes
MIT License
45 stars 9 forks source link

Bug in alpha optimization #1

Closed JensLuebbe closed 1 year ago

JensLuebbe commented 1 year ago

Hey Panosz,

first of all, thank you for sharing your code. This is really helpful. However, I found a small bug in the optimization of the parameter alpha. Your current code for the function "optimize":

def optimize(self):

At least N//3 triangles are needed to connect N points.

    simplices = self._sorted_simplices()
    n_start = len(self)//3
    n_finish = len(self)+1
    uncovered_vertices = self._uncovered_vertices(simplices[:n_start])
    for n in range(n_start, n_finish):
        if not uncovered_vertices:
            alpha_opt = 1/np.sqrt(self._sorted_circumradii_sw()[n])   ########### HERE OCCURS THE PROBLEM #############
            shape = self._shape_from_simplices(simplices[:n])
            return alpha_opt, shape
        for vertices in simplices[n]:
            uncovered_vertices.discard(vertices)
    raise OptimizationFailure()

In alpha_shapes.alpha_shapes.py in line 105, it says: alpha_opt = 1/np.sqrt(self._sorted_circumradii_sw()[n])

When n takes on the last value of the for-loop, n takes on the value of the number of simplices. Thus, n is greater by 1 than the last index from the list "simplices". Thus, we get the error n is out of range. For example, we have 7 simplices, then n=7. However, the last index of the list "simplices" is 6.

You can produce this error by optimizing alpha for the following list of coordinates: [(363820.32, 5771887.69), (363837.36, 5771916.33), (363870.03, 5771951.57), (363859.3, 5771943.9), (363829.7, 5771861.92), (363821.03, 5771850.18), (363844.05, 5771928.69), (363828.75, 5771906.28)]

Furthermore, you correctly stated that we need at least N//3 triangles to connect N Points. However, when calculating the minimum number triangles, you do not take the number of points but the number of triangles as N:

n_start = len(self)//3

This does not produce an error, but might lead to a larger number of iterations. Thought I might mention it.

Here is my suggestion for fixing the two issues:

def optimize(self): simplices = self._sorted_simplices()

Fix for the minimum number of triangle issue

    n_points = self.x.size # number of x_coordinates
    n_start = n_points//3 
    n_finish = len(self)+1 
    uncovered_vertices = self._uncovered_vertices(simplices[:n_start])  
    for n in range(n_start, n_finish):
        if not uncovered_vertices:
            # Fix for the out-of-range-issue
            if n == len(self): 
                alpha_opt = 1/np.sqrt(self._sorted_circumradii_sw()[n-1]) 
            else:
                 alpha_opt = 1/np.sqrt(self._sorted_circumradii_sw()[n])
            shape = self._shape_from_simplices(simplices[:n]) 
            return alpha_opt, shape
        for vertices in simplices[n]:
            uncovered_vertices.discard(vertices)
    raise OptimizationFailure()

Best regards

due23 commented 1 year ago

I tried your fix and it still does not work and instead generates a new error:

File ~/Documents/ALA/galah-env/lib/python3.10/site-packages/alpha_shapes/alpha_shapes.py:112, in Alpha_Shaper_Base.optimize(self)
    109     self.set_mask_at_alpha(alpha_opt)
    110     return alpha_opt, shape
--> 112 simplex = simplices[n]
    113 for vertices in simplex:
    114     uncovered_vertices.discard(vertices)

IndexError: index 4783 is out of bounds for axis 0 with size 4783

Also next time could you paste the code snippets in the code blocks with the correct indentation as your solution was quite difficult to interpret.

panosz commented 1 year ago

@JensLuebbe , @due23 Thank you for participating in this issue. I have just released a new version with lots of updates. Can you verify if updating to the new version solves your problem??

Unfortunately this bug persists.

panosz commented 1 year ago

This has to do with strongly shaped data, where the original triangulation is minimally covering and no simplex can be removed without leaving at least one vertex uncovered.

This is now fixed by #2.

due23 commented 1 year ago

Hi!

Thanks for attempting to fix it! Not sure why but it works for some sets of points but not others for me. I have a large data frame of species occurrences with their latitude and longitude coordinates and basically want to plot the alpha shapes of all the coordinates belonging to different decades.

for i, d in reversed(list(enumerate(decades))):
    decade_points = cane_toads[["decimalLongitude", "decimalLatitude"]] [cane_toads["decade"] == d]
    decade_points.drop_duplicates(subset=['decimalLongitude', 'decimalLatitude'], inplace=True)
    shaper = Alpha_Shaper(decade_points)
    alpha_opt, alpha_shape = shaper.optimize()
    print(alpha_opt)

Depending on the decade, some sets of points appear to work fine and produce an alpha shape while others result in the error below. I am being careful to remove duplicate coordinates from my dataframe but still get this error.

---------------------------------------------------------------------------
OptimizationFailure                       Traceback (most recent call last)
Cell In[11], line 5
      3 decade_points.drop_duplicates(subset=['decimalLongitude', 'decimalLatitude'], inplace=True)
      4 shaper = Alpha_Shaper(decade_points)
----> 5 alpha_opt, alpha_shape = shaper.optimize()
      6 print(alpha_opt)

File ~/Documents/ALA/galah-env/lib/python3.10/site-packages/alpha_shapes/alpha_shapes.py:153, in Alpha_Shaper.optimize(self)
    151 def optimize(self):
    152     # At least N//3 triangles are needed to connect N points.
--> 153     n_min = self._get_minimum_fully_covering_index_of_simplices()
    154     alpha_opt = 1 / np.sqrt(self._sorted_circumradii_sw()[n_min]) - 1e-10
    155     simplices = self._sorted_simplices()

File ~/Documents/ALA/galah-env/lib/python3.10/site-packages/alpha_shapes/alpha_shapes.py:149, in Alpha_Shaper._get_minimum_fully_covering_index_of_simplices(self)
    146         return n
    148 if uncovered_vertices:
--> 149     raise OptimizationFailure("Maybe there are duplicate points?")

OptimizationFailure: Maybe there are duplicate points?
panosz commented 1 year ago

@due23 Can you upload the dataFrame decade_points that is giving you trouble, so that I can reproduce the problem?

due23 commented 1 year ago

Hi @panosz ! I've attached a CSV file of the coordinates from the Pandas dataframe that generated the error above. Let me know how it goes! decade_points_2020.csv

panosz commented 1 year ago

Thanks a lot @due23. I will give it a try as soon as possible. For now, I am moving this to a new issue #3

panosz commented 1 year ago

@due23 Please see the comments in #3

Can you check if this works for you?

I have not released a new version in pypi yet, so you will need to clone locally and use

pip install -e .

or something similar.