OSGeo / gdal

GDAL is an open source MIT licensed translator library for raster and vector geospatial data formats.
https://gdal.org

ERROR 1: fieldDesc[13].nIdx < 0 not expected #9644

Closed smnorris closed 6 months ago

smnorris commented 7 months ago

What is the bug?

When loading a GeoParquet file into Postgres with ogr2ogr, the error in the title is emitted. The translation itself seems to work fine: the output feature count in the target db matches the source file and the schema is as expected.

https://github.com/OSGeo/gdal/blob/801b1c8e737f624cd17c23c728ad9cf624b758dd/ogr/ogrsf_frmts/arrow_common/ograrrowlayer.hpp#L4371

Steps to reproduce the issue

$ ogr2ogr -f PostgreSQL "PG:$DATABASE_URL" -nln whse_basemapping.transport_line -overwrite /vsis3/bcfishpass/whse_basemapping.transport_line.parquet whse_basemapping.transport_line --debug ON
HTTP: libcurl/8.4.0 (SecureTransport) LibreSSL/3.3.6 zlib/1.2.12 nghttp2/1.58.0
S3: Downloading 0-16383 (https://bcfishpass.s3.amazonaws.com/whse_basemapping.transport_line.parquet)...
S3: Got response_code=206
S3: Downloading 655458304-655460280 (https://bcfishpass.s3.amazonaws.com/whse_basemapping.transport_line.parquet)...
S3: Got response_code=206
S3: Downloading 655392768-655458303 (https://bcfishpass.s3.amazonaws.com/whse_basemapping.transport_line.parquet)...
S3: Got response_code=206
S3: Downloading 655360000-655392767 (https://bcfishpass.s3.amazonaws.com/whse_basemapping.transport_line.parquet)...
S3: Got response_code=206
PARQUET: geo = {"primary_column": "geometry", "columns": {"geometry": {"encoding": "WKB", "crs": {"$schema": "https://proj.org/schemas/v0.7/projjson.schema.json", "type": "ProjectedCRS", "name": "NAD83 / BC Albers", "base_crs": {"name": "NAD83", "datum": {"type": "GeodeticReferenceFrame", "name": "North American Datum 1983", "ellipsoid": {"name": "GRS 1980", "semi_major_axis": 6378137, "inverse_flattening": 298.257222101}}, "coordinate_system": {"subtype": "ellipsoidal", "axis": [{"name": "Geodetic latitude", "abbreviation": "Lat", "direction": "north", "unit": "degree"}, {"name": "Geodetic longitude", "abbreviation": "Lon", "direction": "east", "unit": "degree"}]}, "id": {"authority": "EPSG", "code": 4269}}, "conversion": {"name": "British Columbia Albers", "method": {"name": "Albers Equal Area", "id": {"authority": "EPSG", "code": 9822}}, "parameters": [{"name": "Latitude of false origin", "value": 45, "unit": "degree", "id": {"authority": "EPSG", "code": 8821}}, {"name": "Longitude of false origin", "value": -126, "unit": "degree", "id": {"authority": "EPSG", "code": 8822}}, {"name": "Latitude of 1st standard parallel", "value": 50, "unit": "degree", "id": {"authority": "EPSG", "code": 8823}}, {"name": "Latitude of 2nd standard parallel", "value": 58.5, "unit": "degree", "id": {"authority": "EPSG", "code": 8824}}, {"name": "Easting at false origin", "value": 1000000, "unit": "metre", "id": {"authority": "EPSG", "code": 8826}}, {"name": "Northing at false origin", "value": 0, "unit": "metre", "id": {"authority": "EPSG", "code": 8827}}]}, "coordinate_system": {"subtype": "Cartesian", "axis": [{"name": "Easting", "abbreviation": "E", "direction": "east", "unit": "metre"}, {"name": "Northing", "abbreviation": "N", "direction": "north", "unit": "metre"}]}, "scope": "Province-wide spatial data management.", "area": "Canada - British Columbia.", "bbox": {"south_latitude": 48.25, "west_longitude": -139.04, "north_latitude": 60.01, "east_longitude": -114.08}, "id": 
{"authority": "EPSG", "code": 3005}}, "geometry_types": ["LineString", "MultiLineString"], "bbox": [336279.1480168255, 369185.7570022164, 1874557.6820168253, 1717288.8370022164]}}, "version": "1.0.0", "creator": {"library": "geopandas", "version": "0.14.3"}}
PARQUET: Compression (of first column): snappy
GDAL: GDALOpen(/vsis3/bcfishpass/whse_basemapping.transport_line.parquet, this=0x13c737360) succeeds as Parquet.
PG: Client encoding: 'UTF8'
PG: PostGIS schema: 'public'
PG: Modifying search_path from "$user", public to '',"$user", public
PG: PostgreSQL version string : 'PostgreSQL 14.10 (Homebrew) on aarch64-apple-darwin23.0.0, compiled by Apple clang version 15.0.0 (clang-1500.0.40.1), 64-bit'
PG: PostGIS version string : '3.4 USE_GEOS=1 USE_PROJ=1 USE_STATS=1'
GDAL: GDALOpen(PG:postgresql://postgres@localhost:5432/postgis, this=0x13d023200) succeeds as PostgreSQL.
PG: Could not retrieve table oid for transport_line
PG: Could not retrieve table oid for transport_line
ERROR 1: fieldDesc[13].nIdx < 0 not expected
PG: LaunderName('OBJECTID') -> 'objectid'

etc

Versions and provenance

GDAL is from Homebrew.

$ ogrinfo --version
GDAL 3.8.4, released 2024/02/08

The GeoParquet file was created with GeoPandas v0.14.3 using the to_parquet function, run on a Docker image based on ghcr.io/osgeo/gdal:ubuntu-full-3.8.4.

Additional context

No response

rouault commented 7 months ago

The error message was caused by a Parquet column of type NULL. I don't want to know why such horrible stuff is permitted... Anyway, this will be fixed per #9647. You may also add --config OGR2OGR_USE_ARROW_API NO to avoid the error message.
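
For anyone hitting this before the fix lands, the workaround above could be wrapped roughly as follows. This is only a sketch: `ogr2ogr_no_arrow` and the `DRY_RUN` variable are hypothetical names made up for illustration, not part of GDAL. The only thing taken from the thread is the `--config OGR2OGR_USE_ARROW_API NO` option, which presumably makes ogr2ogr skip its Arrow-based transfer path.

```shell
# Hypothetical helper (not part of GDAL): runs ogr2ogr with the config option
# suggested in this thread prepended to whatever arguments are passed in.
# With DRY_RUN=1 it only prints the command instead of executing it.
ogr2ogr_no_arrow() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "ogr2ogr --config OGR2OGR_USE_ARROW_API NO $*"
  else
    ogr2ogr --config OGR2OGR_USE_ARROW_API NO "$@"
  fi
}

# Usage, mirroring the command from the report (left commented out so that
# sourcing this snippet does not touch any database):
# ogr2ogr_no_arrow -f PostgreSQL "PG:$DATABASE_URL" \
#   -nln whse_basemapping.transport_line -overwrite \
#   /vsis3/bcfishpass/whse_basemapping.transport_line.parquet \
#   whse_basemapping.transport_line
```

Since the load itself reportedly succeeds either way, this only silences the error message; it does not change the source file, and the NULL-typed column stays in the Parquet schema.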