WZBSocialScienceCenter / geovoronoi

a package to create and plot Voronoi regions within geographic boundaries
https://datascience.blog.wzb.eu/2018/03/16/creating-and-plotting-voronoi-regions-for-geographic-data-with-geovoronoi/
Apache License 2.0
135 stars 28 forks source link

Strategy for connecting an input GeoPandas dataframe to a geovoronoi output polygons #13

Closed malmorrow closed 3 years ago

malmorrow commented 3 years ago

I'm trying to generate geovoronoi regions using a GeoDataFrame. The input points come from the dataframe's geometry. Once I have the resulting polygons, I need a mapping of the rows in the dataframe to the regions generated by the library. I don't find any way to do this using the outputs from voronoi_regions_from_coords. Here follows an example program using cities in South Africa. The idea would be to get the right city name back for each Voronoi region. It's not correct at the moment. What am I missing?

import logging

import matplotlib.pyplot as plt
import geopandas as gpd

from geovoronoi import coords_to_points, voronoi_regions_from_coords
from geovoronoi.plotting import subplot_for_map, plot_voronoi_polys_with_points_in_area

logging.basicConfig(level=logging.INFO)
geovoronoi_log = logging.getLogger('geovoronoi')
geovoronoi_log.setLevel(logging.INFO)
geovoronoi_log.propagate = True

mercator = 3395
wsg84 = 4326
hartebeesthoek94 = 4148 # Frequently used for southern Africa
crs = hartebeesthoek94

country = 'South Africa'

print(f'loading country {country} from naturalearth_lowres')
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))

area = world[world.name == country]
assert(len(area) == 1)

print('CRS:', area.crs)

area = area.to_crs(epsg=crs)
area_shape = area.iloc[0].geometry

print('loading local cities from naturalearth_cities' )
cities = gpd.read_file(gpd.datasets.get_path('naturalearth_cities'))
cities = cities.to_crs(epsg=crs)
local_cities = cities.loc[cities['geometry'].within(area_shape)]
assert(len(local_cities) >= 1) # This doesn't work for all countries!

# Calculate the `per_geom=False` voronoi diagram for the country, using the GeoDataFrame's `geometry` column
# to provide the coordinates.
voronoi_regions, voronoi_points = voronoi_regions_from_coords(local_cities.geometry, area_shape, per_geom=False)

fig, ax = subplot_for_map()

plot_voronoi_polys_with_points_in_area(ax, area_shape, voronoi_regions, local_cities.geometry, voronoi_points)

ax.set_title(f'Cities and their Voronoi regions in {country} (with per_geom=False)')

plt.tight_layout()
plt.savefig('national_city_regions.png')
plt.show()

city = 0
assert(city < len(local_cities)-1) # No point looking for cities with higher indexes

print('Goal is to map the original GeoPandas df to the voronoi_regions output')
print('\nvoronoi_points showing dict index:')
pprint(voronoi_points)
print('\nvoronoi_regions showing dict index:')
pprint(voronoi_regions)

print('\nlocal_cities showing index after the filter out of the global cities originally loaded:')
print(local_cities)

print(f'\nlocal_cities.iloc[city]["name"]: {local_cities.iloc[city]["name"]}')

print(f'\nvoronoi_points[city]: {voronoi_points[city]}')
print(f'\nvoronoi_points[city][0]: {voronoi_points[city][0]}')

print(f'\nvoronoi_regions[voronoi_points[city][0]]:')
voronoi_regions[voronoi_points[city][0]]

The region polygon in voronoi_regions[voronoi_points][0]] is Cape Town, not Bloemfontein.

internaut commented 3 years ago

The mapping works. I think you mixed how the mapping in voronoi_points works. It's a dict that maps a region ID to (possibly multiple) point IDs. The point IDs are indices into local_cities.geometry. You should also be aware that this uses array indices 0 to N and not the pandas indices (which are, after filtering, 57, 58, 171, 189 in your case). So to find out the region ID for a given city it would be:

cities_list = local_cities.name.to_list()
city = 0

for reg_id, reg_pts in voronoi_points.items():
    if city in reg_pts:
        print(f'city #{city} {cities_list[city]} is in region #{reg_id}')

# returns city #0 Bloemfontein is in region #3

The whole hassle is because you could theoretically have multiple points in one region (if they're duplicates points or extremely close together). Btw. there's only a function points_to_region to inverse the mapping which makes the whole thing easier:

from geovoronoi import points_to_region

pts_region = points_to_region(voronoi_points)
pts_region[city]

# returns 3
malmorrow commented 3 years ago

Thank you. I can go from here. I appreciate the help!