Open ChrisFairless opened 3 months ago
@tovogt you understand globes, if you have time I'd appreciate any insight! N.B: It's not urgent (for me)
All polygons in Natural Earth are contained in [-180, 180] longitude. If your extent is not contained within that, the code needs to split the extent up and intersect with two separate bounding boxes, and then translate the lon-coordinates to the lon-range consistent with your original extent. Here is how I would fix it:
def get_country_geometries(country_names=None, extent=None, resolution=10):
"""Natural Earth country boundaries within given extent
If no arguments are given, simply returns the whole natural earth dataset.
Take heed: we assume WGS84 as the CRS unless the Natural Earth download utility from cartopy
starts including the projection information. (They are saving a whopping 147 bytes by omitting
it.) Same goes for UTF.
If extent is provided, longitude values in 'geom' will all lie within 'extent' longitude
range. Therefore setting extent to e.g. [160, 200, -20, 20] will provide longitude values
between 160 and 200 degrees.
Parameters
----------
country_names : list, optional
list with ISO 3166 alpha-3 codes of countries, e.g ['ZWE', 'GBR', 'VNM', 'UZB']
extent : tuple, optional
(min_lon, max_lon, min_lat, max_lat)
Extent, assumed to be in the same CRS as the natural earth data.
resolution : float, optional
10, 50 or 110. Resolution in m. Default: 10m
Returns
-------
geom : GeoDataFrame
Natural Earth multipolygons of the specified countries, resp. the countries that lie
within the specified extent.
"""
resolution = nat_earth_resolution(resolution)
shp_file = shapereader.natural_earth(resolution=resolution,
category='cultural',
name='admin_0_countries')
nat_earth = gpd.read_file(shp_file, encoding='UTF-8')
if not nat_earth.crs:
nat_earth.crs = NE_CRS
# fill gaps in nat_earth
gap_mask = (nat_earth['ISO_A3'] == '-99')
nat_earth.loc[gap_mask, 'ISO_A3'] = nat_earth.loc[gap_mask, 'ADM0_A3']
gap_mask = (nat_earth['ISO_N3'] == '-99')
for idx, country in nat_earth[gap_mask].iterrows():
nat_earth.loc[idx, "ISO_N3"] = f"{natearth_country_to_int(country):03d}"
out = nat_earth
if country_names:
if isinstance(country_names, str):
country_names = [country_names]
country_mask = np.isin(
nat_earth[['ISO_A3', 'WB_A3', 'ADM0_A3']].values,
country_names,
).any(axis=1)
out = out[country_mask]
if extent:
if extent[1] - extent[0] > 360:
raise ValueError(
f"longitude extent range is greater than 360: {extent[0]} to {extent[1]}"
)
if extent[1] < extent[0]:
raise ValueError(
f"longitude extent at the left ({extent[0]}) is larger "
f"than longitude extent at the right ({extent[1]})"
)
# translate the lon range to start in [-180, 180] (but it might exceed to the right)
lon_left = extent[0]
if lon_left != -180:
[lon_left] = lon_normalize(np.array(extent[:1]))
lon_right = lon_left + extent[1] - extent[0]
# split the extent box unless longitude extent is contained within [-180, +180]
extents = [(lon_left, lon_right, extent[2], extent[3])]
if lon_left < -180 or lon_right > 180:
extents = [
(lon_left, 180, extent[2], extent[3]),
(-180, lon_right, extent[2], extent[3]),
]
bboxes = [box(*toggle_extent_bounds(e)) for e in extents]
intersects = []
for bbox in bboxes:
bbox = gpd.GeoSeries(bbox, crs=DEF_CRS)
bbox = gpd.GeoDataFrame({'geometry': bbox}, crs=DEF_CRS)
intersects.append(gpd.overlay(out, bbox, how="intersection"))
# take union of the parts if necessary
if len(intersects) == 1:
[out] = intersects
else:
out_left, out_right = intersects
out = gpd.overlay(out_left, out_right[["geometry"]], how="union")
# rewrap so that all lon coordinates actually lie within specified extent
if extent[0] < -180 or extent[1] > 180:
lon_mid = 0.5 * (extent[0] + extent[1])
# reset the CRS attribute after rewrapping (we don't really change the CRS)
out = (
out
.to_crs({"proj": "longlat", "lon_wrap": lon_mid})
.set_crs(DEF_CRS, allow_override=True)
)
return out
Note however that this might still produce invalid polygons in edge cases. Fixing those would probably require further testing...
@ChrisFairless How to continue here? The fix by @tovogt seems rather complicated. Maybe it would suffice to update the documentation of get_land_geometry
, and even throw an error if the extents exceed the supported range?
I've run into an issue with invalid geometries from Natural Earth when antimeridians are involved.
While this works fine:
all of the following fail:
with
ValueError: No Shapely geometry can be created from null value
I haven't tested exhaustively but I think it fails when
The error occurs at line 687:
The country geometries in
geom
are invalid. Running tests on them explains this:giving
i.e. every country that crosses the 0 degrees line.
I haven't investigated in enough detail to figure out exactly meridian causes problems based in the input extent.
The issue comes from lines 846-850 in
climada.util.coordinates.get_country_geometries
. It's dealing with the case when the requested countries crossed the antimeridian. For some reason it needs to reset the CRS, and this invalidates the geometries.Fixes that didn't work:
Failure 1: force to the (-180, 180) range
I thought about adjusting the longitude
extent
to always be in the(-180, 180)
range. But since the method is called in a lot of places and theextent
parameter is often generated automatically (e.g. during TC Track generation), it's possible that sometimes the returned geometries' longitudes will be offset by 360 degrees from the source data that theextent
was calculated from.Failure 2: fix the geometries
I tried making the geometries valid after the CRS changed (line 850), but it didn't work:
Fixes that work but aren't good
Option 1: nuclear
Throw an error if the
extent
longitudes aren't in the (-180, 180) range. Then slowly update any other parts of CLIMADA that fall over because of this.That's as far as I've got!