pysal / esda

statistics and classes for exploratory spatial data analysis
https://pysal.org/esda
BSD 3-Clause "New" or "Revised" License
215 stars 55 forks source link

Use more precise calculation of minimum bounding circle area #257

Closed martinfleis closed 1 year ago

martinfleis commented 1 year ago

We now compute an area of a minimum bounding circle from its polygonal representation, which is not as precise as it could be. I am replacing it with shapely.minimum_bounding_radius and classic pi times the radius squared formula to avoid that.

The actual numerical difference is negligible in any useful case but there's no reason to not be more precise. And it will likely be faster this way.

ljwolf commented 1 year ago

Looks good to me!

Just to characterise the numerical difference for the future:

import geopandas, geodatasets, shapely

gdf = geopandas.read_file(geodatasets.get_path("geoda ncovr"))

def reock_radius(geom):
    area = shapely.minimum_bounding_radius(geom)**2 * numpy.pi
    return shapely.area(geom) / area

def reock_polygon(geom):
    circ = shapely.minimum_bounding_circle(geom)
    return shapely.area(geom)/shapely.area(circ)

r1 = reock_radius(gdf.geometry)
r2 = reock_polygon(gdf.geometry)

from matplotlib import pyplot as plt 

f,ax = plt.subplots(1,3, figsize=(8,4))
ax[0].hist(r1, histtype='step', density=True)
ax[2].hist(r2, histtype='step', density=True, color='forestgreen')
(r1-r2).hist(ax=ax[1], density=True, bins=100)
ax[0].set_title("radius")
ax[2].set_title("polygon")
ax[1].set_title("differences")
f.tight_layout()
plt.show()

example

Using the returned polygon overstates the index consistently by .6% for US counties ((r2-r1)/r1), or about an average of .003 in magnitude.

ljwolf commented 1 year ago

test failures are due to adbscan, and the shape tests pass locally! I would like to merge this.

martinfleis commented 1 year ago

@ljwolf Go ahead and merge!