Closed martinfleis closed 1 year ago
Looks good to me!
Just to characterise the numerical difference for the future:
import geopandas, geodatasets, shapely
gdf = geopandas.read_file(geodatasets.get_path("geoda ncovr"))
def reock_radius(geom):
area = shapely.minimum_bounding_radius(geom)**2 * numpy.pi
return shapely.area(geom) / area
def reock_polygon(geom):
circ = shapely.minimum_bounding_circle(geom)
return shapely.area(geom)/shapely.area(circ)
r1 = reock_radius(gdf.geometry)
r2 = reock_polygon(gdf.geometry)
from matplotlib import pyplot as plt
f,ax = plt.subplots(1,3, figsize=(8,4))
ax[0].hist(r1, histtype='step', density=True)
ax[2].hist(r2, histtype='step', density=True, color='forestgreen')
(r1-r2).hist(ax=ax[1], density=True, bins=100)
ax[0].set_title("radius")
ax[2].set_title("polygon")
ax[1].set_title("differences")
f.tight_layout()
plt.show()
Using the returned polygon overstates the index consistently by .6% for US counties ((r2-r1)/r1
), or about an average of .003 in magnitude.
test failures are due to adbscan, and the shape tests pass locally! I would like to merge this.
@ljwolf Go ahead and merge!
We now compute an area of a minimum bounding circle from its polygonal representation, which is not as precise as it could be. I am replacing it with
shapely.minimum_bounding_radius
and classic pi times the radius squared formula to avoid that.The actual numerical difference is negligible in any useful case but there's no reason to not be more precise. And it will likely be faster this way.