fitodic / centerline

Calculate the polygon's centerline
https://centerline.readthedocs.io
MIT License
266 stars 55 forks source link

A ~30 time faster implementation using shaplely 2.0 and Geopandas ? #42

Open Turtle6665 opened 1 year ago

Turtle6665 commented 1 year ago

Hi @fitodic !

I think I found a (much) faster way to perform those calculations. Or at least, it is with the case with the data I tested with. This is the python script I used to compare them:

from centerline.geometry import Centerline
import shapely
import geopandas as gpd
from timeit import default_timer as timer
def construct_centerline(input_geometry, interpolation_distance=0.5):
    #find the voronoi verticies (equivalent to Centerline._get_voronoi_vertices_and_ridges())
    borders = input_geometry.segmentize(interpolation_distance) #To have smaler verticies (equivalent to Centerline._get_densified_borders())
    voronoied = shapely.voronoi_polygons(borders,only_edges=True) #equivalent to the scipy.spatial.Voronoi

    #to select only the linestring within the input geometry (equivalent to Centerline._linestring_is_within_input_geometry)
    centerlines = gpd.GeoDataFrame(geometry=gpd.GeoSeries(voronoied.geoms)).sjoin(gpd.GeoDataFrame(geometry=gpd.GeoSeries(input_geometry)),predicate="within")
    return centerlines.unary_union

start = timer()
open_area_lines = Centerline(open_area_Road[0],interpolation_distance=2)
end = timer()
centerlines = construct_centerline(open_area_Road[0],interpolation_distance=2)
end2 = timer()
Print(f" Duration current implementation : {end-start:0.3f}s \n Duration proposed implementation : {end2-end:0.3f}s")

In my testing, the proposed version takes 8.580s and the current version of Centerline takes 278.823s. Both versions give fairly similar results in therm of centerline network (see figure). I suppose the difference comes mainly from the segmenting algorithm, but I haven't explored this much. Note that the polygone.segmentize() is only available since version 2.0.0 of the shapely library.

Centerlines

I suppose the main speed-up comes from the way the inside vertices are calculated, as using the spatial join from geopandas is really fast.

I know this way of measuring calculation speed is not perfect and multiple times should at least be averaged out and with different initial polygons. But given the speed difference in my testing, I thought the difference was clear enough to not bother with more extensive testing for now.

To stay constant with the current behavior of the library, I transformed the geoDataFrame back to a shapely multiline. But I figured that is you implement those changes, the Centerline could stay a geoDataFrame as those are less difficult to work with (That's my opinion obviously).

Have a nice day

Alive59 commented 8 months ago

Hi @fitodic !

I think I found a (much) faster way to perform those calculations. Or at least, it is with the case with the data I tested with. This is the python script I used to compare them:

from centerline.geometry import Centerline
import shapely
import geopandas as gpd
from timeit import default_timer as timer
def construct_centerline(input_geometry, interpolation_distance=0.5):
   #find the voronoi verticies (equivalent to Centerline._get_voronoi_vertices_and_ridges())
   borders = input_geometry.segmentize(interpolation_distance) #To have smaler verticies (equivalent to Centerline._get_densified_borders())
   voronoied = shapely.voronoi_polygons(borders,only_edges=True) #equivalent to the scipy.spatial.Voronoi

   #to select only the linestring within the input geometry (equivalent to Centerline._linestring_is_within_input_geometry)
   centerlines = gpd.GeoDataFrame(geometry=gpd.GeoSeries(voronoied.geoms)).sjoin(gpd.GeoDataFrame(geometry=gpd.GeoSeries(input_geometry)),predicate="within")
   return centerlines.unary_union

start = timer()
open_area_lines = Centerline(open_area_Road[0],interpolation_distance=2)
end = timer()
centerlines = construct_centerline(open_area_Road[0],interpolation_distance=2)
end2 = timer()
Print(f" Duration current implementation : {end-start:0.3f}s \n Duration proposed implementation : {end2-end:0.3f}s")

In my testing, the proposed version takes 8.580s and the current version of Centerline takes 278.823s. Both versions give fairly similar results in therm of centerline network (see figure). I suppose the difference comes mainly from the segmenting algorithm, but I haven't explored this much. Note that the polygone.segmentize() is only available since version 2.0.0 of the shapely library.

Centerlines

I suppose the main speed-up comes from the way the inside vertices are calculated, as using the spatial join from geopandas is really fast.

I know this way of measuring calculation speed is not perfect and multiple times should at least be averaged out and with different initial polygons. But given the speed difference in my testing, I thought the difference was clear enough to not bother with more extensive testing for now.

To stay constant with the current behavior of the library, I transformed the geoDataFrame back to a shapely multiline. But I figured that is you implement those changes, the Centerline could stay a geoDataFrame as those are less difficult to work with (That's my opinion obviously).

Have a nice day

Thanks for the brilliant solution, but the function should be modified to fit the latest version of GeoPandas:

def construct_centerline(input_geometry, interpolation_distance=0.5):
    borders = input_geometry.segmentize(interpolation_distance)
    voronoied = shapely.voronoi_polygons(borders, only_edges=True)

    centerlines = gpd.sjoin(gpd.GeoDataFrame(geometry=gpd.GeoSeries(voronoied.geoms)), gpd.GeoDataFrame(geometry=gpd.GeoSeries(input_geometry)), op="within")
    return centerlines.unary_union

Have a nice day!