The current method of assigning country names by doing a spatial join using the intersects operation in the following line
hexagons_with_country = gpd.sjoin(hexagons, countries, op='intersects') # changed from "within"
will lead to duplicated hexagons at borders where a hexagon intersects more than one country.
Perhaps the behaviour is intentional but if user is not aware may cause an issue when plotting data as a choropleth with fill alpha <1 as the duplicated hexagons that overlap increase apparent alpha or change the color.
Instead, I use the hexagon centroid to assign country names using within join operation which will give a 1-1 relationship in almost all cases (unless the centroid is exactly on a country border) along the lines of the code below:
hexagon_centroids = gpd.GeoDataFrame(geometry=hexagons.centroid, index=hexagons.index, crs=hexagons.crs)hexagons["country"] = gpd.sjoin(hexagon_centroids, countries, predicate='within').country
The current method of assigning country names by doing a spatial join using the intersects operation in the following line
hexagons_with_country = gpd.sjoin(hexagons, countries, op='intersects') # changed from "within"
will lead to duplicated hexagons at borders where a hexagon intersects more than one country.Perhaps the behaviour is intentional but if user is not aware may cause an issue when plotting data as a choropleth with fill alpha <1 as the duplicated hexagons that overlap increase apparent alpha or change the color.
Instead, I use the hexagon centroid to assign country names using within join operation which will give a 1-1 relationship in almost all cases (unless the centroid is exactly on a country border) along the lines of the code below:
hexagon_centroids = gpd.GeoDataFrame(geometry=hexagons.centroid, index=hexagons.index, crs=hexagons.crs)
hexagons["country"] = gpd.sjoin(hexagon_centroids, countries, predicate='within').country