Closed sgoodm closed 3 years ago
Not sure if the MultiPolygon info above is actually driving this issue. A check with ogrinfo
results in:
ERROR 1: GeoJSON object too complex, please see the OGR_GEOJSON_MAX_OBJ_SIZE environment option
The following code works, indicating the file is in fact valid, but it requires raising the OGR_GEOJSON_MAX_OBJ_SIZE environment option above its default, which most applications leave unchanged.
>>> with fiona.Env(OGR_GEOJSON_MAX_OBJ_SIZE="2000MB"):
...     p4 = gpd.read_file('/home/userv/Downloads/geoBoundaries-JPN-ADM0-all/geoBoundaries-JPN-ADM0.geojson')
...
>>> p4
  shapeName Level shapeISO                     shapeID shapeGroup shapeType                                           geometry
0     Japan  ADM0      JPN  JPN-ADM0-39117424B90657763        JPN      ADM0  MULTIPOLYGON (((123.78700 24.07181, 123.78694 ...
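To see in advance whether a file is likely to hit this limit, one option is to measure the largest feature in the raw JSON. The sketch below uses only the standard library. It assumes the limit applies per feature and that the default is 200 MB (my understanding of the GDAL default, worth verifying for your version). The function names are hypothetical, and the size is an approximation because the feature is re-serialised rather than measured on disk.

```python
import json

# Assumption: GDAL's default for OGR_GEOJSON_MAX_OBJ_SIZE, in MB.
DEFAULT_LIMIT_MB = 200


def largest_feature_mb(path):
    """Return the approximate size (MB) of the largest feature in a GeoJSON file."""
    with open(path, encoding="utf-8") as f:
        collection = json.load(f)
    features = collection.get("features", [])
    # Re-serialise each feature compactly to approximate its size in the file.
    sizes = [
        len(json.dumps(feature, separators=(",", ":")).encode("utf-8"))
        for feature in features
    ]
    return max(sizes, default=0) / (1024 * 1024)


def needs_larger_limit(path, limit_mb=DEFAULT_LIMIT_MB):
    """Return True if the largest feature appears to exceed limit_mb."""
    return largest_feature_mb(path) > limit_mb
```

If needs_larger_limit(path) returns True, wrapping the read in fiona.Env(OGR_GEOJSON_MAX_OBJ_SIZE=...) as shown above is the workaround reported in this thread.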
@DanRunfola I suggest just adding a note about dealing with this issue when using GeoJSONs. It doesn't seem to be relevant for other file formats, but will impact most folks using these few boundaries as GeoJSONs.
Example snippet:
import fiona
import geopandas as gpd

path = '/home/userv/Downloads/geoBoundaries-JPN-ADM0-all/geoBoundaries-JPN-ADM0.geojson'

# fails silently: fiona still shows correct bounds but has no features;
# with geopandas the bounds are NaN
with fiona.open(path, 'r') as vector:
    print(vector.bounds)
    print(len(vector))

with fiona.Env():
    vector = gpd.read_file(path)
    print(vector.total_bounds)
    print(len(vector))

# succeeds
with fiona.Env(OGR_GEOJSON_MAX_OBJ_SIZE="100000MB"):
    with fiona.open(path, 'r') as vector:
        print(vector.bounds)
        print(len(vector))

with fiona.Env(OGR_GEOJSON_MAX_OBJ_SIZE="2000MB"):
    vector = gpd.read_file(path)
    print(vector.total_bounds)
    print(len(vector))
It may be worth opening an issue with fiona, since this probably should not fail silently.
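Until the silent failure is addressed upstream, one way to catch it is to compare the feature count a reader reports with the count in the raw JSON. This is a minimal sketch using only the standard library; raw_feature_count and check_feature_count are hypothetical helpers, not part of fiona or geopandas.

```python
import json


def raw_feature_count(path):
    """Count features by parsing the file as plain JSON."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if data.get("type") == "FeatureCollection":
        return len(data.get("features", []))
    # A bare Feature object counts as one; anything else as zero.
    return 1 if data.get("type") == "Feature" else 0


def check_feature_count(path, reported_count):
    """Raise ValueError if a reader's count disagrees with the raw JSON."""
    expected = raw_feature_count(path)
    if reported_count != expected:
        raise ValueError(
            f"{path}: reader reported {reported_count} features, "
            f"raw JSON contains {expected}"
        )
    return reported_count
```

For example, calling check_feature_count(path, len(vector)) right after fiona.open or gpd.read_file would turn the empty result described above into an explicit error.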
Closing this with an addition to our wiki: https://github.com/wmgeolab/geoBoundaries/wiki/1.-Technical-Usage-Notes#technical-challenges
Several GeoJSONs from v4 data are broken: JPN_ADM0, NOR_ADM0, NZL_ADM0, PHL_ADM1
Seems to be tied to the format of the MultiPolygon geometries - I believe the Polygons within the MultiPolygon in the broken GeoJSONs are wrapped in one too many lists.
Reading these GeoJSONs with spatial software (fiona, geopandas, QGIS) fails due to the bad geometry, but you can read the file as raw JSON to debug.
p2 is the JSON object in the Python code/output below. Trying to recreate a MultiPolygon from the GeoJSON geometry field fails:
And if you try to build a MultiPolygon from the underlying elements it also fails:
Yet if we remove the outermost list of each element it works:
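A minimal sketch of checking and removing such an extra list level, working on the raw JSON geometry with only the standard library. This is not the code from the original comment: nesting_depth and unwrap_polygons are hypothetical helpers, and the sketch assumes the hypothesis stated here (each Polygon wrapped in exactly one extra single-element list), which may not be the actual cause.

```python
def nesting_depth(coords):
    """Count the list levels wrapping the innermost numbers."""
    depth = 0
    while isinstance(coords, (list, tuple)) and coords:
        depth += 1
        coords = coords[0]
    return depth


def unwrap_polygons(geometry):
    """Return a MultiPolygon geometry with one extra list level removed
    from each element that has it; other geometries are returned as-is."""
    if geometry.get("type") != "MultiPolygon":
        return geometry
    fixed = []
    for element in geometry["coordinates"]:
        # A well-formed Polygon (rings -> positions -> numbers) has depth 3;
        # a Polygon wrapped in one extra single-element list has depth 4.
        if nesting_depth(element) == 4 and len(element) == 1:
            fixed.append(element[0])
        else:
            fixed.append(element)
    return {**geometry, "coordinates": fixed}
```

A well-formed MultiPolygon has a coordinates depth of 4, so nesting_depth(geometry["coordinates"]) returning 5 would be consistent with the extra wrapping described here.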
Likely the most relevant line of code: https://github.com/wmgeolab/geoBoundaryBot/blob/main/gbBuild.py#L414
However it is not clear why this is not impacting more boundaries. That likely depends more on the source data and other boundary prep that I am not familiar with.
An additional symptom is that the metadata spatial stats are broken. See the nan values below, as well as the unit count being 0 (yet, oddly, still showing a non-zero vertex count).