ClimateCompatibleGrowth / GeoH2

Geospatial analysis of green hydrogen production costs.
MIT License
13 stars 4 forks source link

assign_country.py hexagon duplication #17

Open scotiow opened 4 months ago

scotiow commented 4 months ago

The current method of assigning country names by doing a spatial join using the intersects operation in the following line hexagons_with_country = gpd.sjoin(hexagons, countries, op='intersects') # changed from "within" will lead to duplicated hexagons at borders where a hexagon intersects more than one country.

Perhaps the behaviour is intentional but if user is not aware may cause an issue when plotting data as a choropleth with fill alpha <1 as the duplicated hexagons that overlap increase apparent alpha or change the color.

Instead, I use the hexagon centroid to assign country names using within join operation which will give a 1-1 relationship in almost all cases (unless the centroid is exactly on a country border) along the lines of the code below: hexagon_centroids = gpd.GeoDataFrame(geometry=hexagons.centroid, index=hexagons.index, crs=hexagons.crs) hexagons["country"] = gpd.sjoin(hexagon_centroids, countries, predicate='within').country