pramsey / pgsql-ogr-fdw

PostgreSQL foreign data wrapper for OGR
MIT License
237 stars 34 forks source link

Unrecognized node type error on select #177

Closed apjoseph closed 3 years ago

apjoseph commented 4 years ago

Unrecognized node type error is thrown when a simple = operator is passed down to gdal. If you force the query not to pushdown by using coalesce(field,null), the operation works fine.

This can be verified by a simple two column excel table:

id name
test_id1 test_name1
test_id2 test_name2

And running following sql:

DROP SCHEMA IF EXISTS fdw_test CASCADE;
DROP SERVER IF EXISTS  fdw_test CASCADE;
CREATE SERVER fdw_test
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (
    datasource '/tmp/test_ogr_fdw.xlsx',
    format 'XLSX',
    config_options 'OGR_XLSX_HEADERS=FORCE CPL_DEBUG=ON'
  );
CREATE SCHEMA fdw_test;
IMPORT FOREIGN SCHEMA ogr_all FROM SERVER fdw_test
INTO fdw_test;
SELECT * FROM fdw_test.test WHERE id = 'test_id1'

Yields

[2019-11-22 14:53:18] [XX000] ERROR: unrecognized node type: 543451432
[2019-11-22 14:53:18] [00000] parse <unnamed>: SELECT * FROM fdw_test.test WHERE id = 'test_id1'
[2019-11-22 14:53:18] [00000] bind C_86 to <unnamed>
[2019-11-22 14:53:18] [00000] GDAL config option 'OGR_XLSX_HEADERS' set to 'FORCE'
[2019-11-22 14:53:18] [00000] GDAL config option 'CPL_DEBUG' set to 'ON'
[2019-11-22 14:53:18] [00000] GDAL None [0] GDAL: GDALOpen(/tmp/test_ogr_fdw.xlsx, this=0x55e9a1a657a0) succeeds as XLSX.
[2019-11-22 14:53:18] [00000] GDAL None [0] XLSX: Init(test)
[2019-11-22 14:53:18] [00000] GDAL None [0] XLSX: test has header line
[2019-11-22 14:53:18] [00000] ogrReadColumnData matched 1 FID, 0 GEOM, 2 FIELDS out of 3 PGSQL COLUMNS
[2019-11-22 14:53:18] [00000] ogrOperatorIsSupported got operator '='
[2019-11-22 14:53:18] [00000] OGR SQL: (id = 'test_id1')
[2019-11-22 14:53:18] [00000] GDAL None [0] GDAL: GDALClose(/tmp/test_ogr_fdw.xlsx, this=0x55e9a1a657a0)

Postgis Full Version: POSTGIS="3.0.0 r17983" [EXTENSION] PGSQL="120" GEOS="3.7.3-CAPI-1.11.3 b50468f9" PROJ="6.2.1" LIBXML="2.9.4" LIBJSON="0.12.1" LIBPROTOBUF="1.3.1" WAGYU="0.4.3 (Internal)" TOPOLOGY

pramsey commented 4 years ago

Works for me (tm)... GDAL 2.6

fdw=# DROP SCHEMA IF EXISTS fdw_test CASCADE;
NOTICE:  schema "fdw_test" does not exist, skipping
DROP SCHEMA
fdw=# DROP SERVER IF EXISTS  fdw_test CASCADE;
NOTICE:  server "fdw_test" does not exist, skipping
DROP SERVER
fdw=# CREATE SERVER fdw_test
fdw-#   FOREIGN DATA WRAPPER ogr_fdw
fdw-#   OPTIONS (
fdw(#     datasource '/tmp/test_ogr_fdw.xlsx',
fdw(#     format 'XLSX',
fdw(#     config_options 'OGR_XLSX_HEADERS=FORCE CPL_DEBUG=ON'
fdw(#   );
CREATE SERVER
fdw=# CREATE SCHEMA fdw_test;
CREATE SCHEMA
fdw=# IMPORT FOREIGN SCHEMA ogr_all FROM SERVER fdw_test
fdw-# INTO fdw_test;
NOTICE:  Number of tables to be created 1
IMPORT FOREIGN SCHEMA
fdw=# \d fdw_test.sheet1 
                      Foreign table "fdw_test.sheet1"
 Column |       Type        | Collation | Nullable | Default | FDW options 
--------+-------------------+-----------+----------+---------+-------------
 fid    | bigint            |           |          |         | 
 id     | character varying |           |          |         | 
 name   | character varying |           |          |         | 
Server: fdw_test
FDW options: (layer 'Sheet1')

fdw=# select * from fdw_test.sheet1 where id = 'test_id1';
 fid |    id    |    name    
-----+----------+------------
   2 | test_id1 | test_name1
(1 row)
pramsey commented 4 years ago

One interesting thing about your log is that the node error you're seeing is happening before any of the GDAL stuff even kicks in. Here's my log for the same query:

fdw=# set client_min_messages=debug3;
SET
fdw=# select * from fdw_test.sheet1 where id = 'test_id1';
DEBUG:  GDAL config option 'OGR_XLSX_HEADERS' set to 'FORCE'
DEBUG:  GDAL config option 'CPL_DEBUG' set to 'ON'
DEBUG:  GDAL None [0] GDAL: GDALOpen(/tmp/test_ogr_fdw.xlsx, this=0x7f90d7702860) succeeds as XLSX.
DEBUG:  GDAL None [0] XLSX: Init(Sheet1)
DEBUG:  GDAL None [0] XLSX: Sheet1 has header line
DEBUG:  ogrReadColumnData matched 1 FID, 0 GEOM, 2 FIELDS out of 3 PGSQL COLUMNS
DEBUG:  ogrOperatorIsSupported got operator '='
DEBUG:  OGR SQL: (id = 'test_id1')
DEBUG:  GDAL None [0] GDAL: GDALClose(/tmp/test_ogr_fdw.xlsx, this=0x7f90d7702860)
DEBUG:  GDAL config option 'OGR_XLSX_HEADERS' set to 'FORCE'
DEBUG:  GDAL config option 'CPL_DEBUG' set to 'ON'
DEBUG:  GDAL None [0] GDAL: GDALOpen(/tmp/test_ogr_fdw.xlsx, this=0x7f90d7702860) succeeds as XLSX.
DEBUG:  GDAL None [0] XLSX: Init(Sheet1)
DEBUG:  GDAL None [0] XLSX: Sheet1 has header line
DEBUG:  ogrReadColumnData matched 1 FID, 0 GEOM, 2 FIELDS out of 3 PGSQL COLUMNS
DEBUG:  OGR FDW processed 1 rows from OGR
DEBUG:  GDAL None [0] Mem: 1 features read on layer 'Sheet1'.
DEBUG:  GDAL None [0] GDAL: GDALClose(/tmp/test_ogr_fdw.xlsx, this=0x7f90d7702860)
apjoseph commented 4 years ago

I'm using gdal 3.0.2. I tested downgrading ogr_fdw to 1.0.9 and it went through fine. The bug is only triggered when the restriction is pushed down to gdal.

pramsey commented 4 years ago

Your own build? Platform?

pramsey commented 4 years ago

Note, my clause was pushed down too:

OGR SQL: (id = 'test_id1')
apjoseph commented 4 years ago

I built gdal/postgis/geos/proj from source. Postgres was installed using standard docker base image for 12.1 which is based on debian buster (slim). I purged the build cache and rebuilt a few times last week and the issue recurred. Perhaps it was a temporary issue with postgres packages. I'll build again now and see if the issue is still there.

apjoseph commented 4 years ago

Ok just rebuilt this again and the issue is still showing up unfortunately.

apjoseph commented 4 years ago

@pramsey when you tested this, what version of Postgres were you using? I tested with GDAL 2.4.3 and still had the same issue.

pramsey commented 4 years ago

I'm usually against pg12 on my dev box.

robe2 commented 3 years ago

FWIW it works for me. As an extra data point, it, testing with ogr_fdw 1.0.12 on PostgreSQL 13 windows with GDAL 3.2.0 built

POSTGIS="3.0.2 3.0.2" [EXTENSION] PGSQL="130" GEOS="3.8.1-CAPI-1.13.3" SFCGAL="1.3.8" PROJ="6.3.2" GDAL="GDAL 3.2.0, released 2020/10/26 GDAL_DATA not found" LIBXML="2.9.9" LIBJSON="0.12" LIBPROTOBUF="1.2.1" WAGYU="0.4.3 (Internal)" TOPOLOGY RASTER PostgreSQL 13.0, compiled by Visual C++ build 1914, 64-bit

with expat 2.2.9 (also tried with expat 2.2.10 (that is what the XLSX driver uses I believe) and worked fine there too). My output looks similar to @pramsey .

@apjoseph could you by chance provide the exact test excel file you are using, to rule out Excel version differences. I tried creating the file on an old version of excel 2013 and Office 365. The file sizes are slightly different (on older it clocked in at 7.47 kb and on newer (8.48kb) but both worked.. I've had crazy issues with XLSX files where things would get scrambled and then copying and pasting to a new sheet or saving down to .XLS made it work fine with the ogr driver. Admittedly they are different drivers (XLSX/XLS) but I feel the XLS ogr driver is much more robust than the XLSX one or maybe it's just XLSX is a moving target.

DEBUG:  GDAL config option 'OGR_XLSX_HEADERS' set to 'FORCE'
DEBUG:  GDAL config option 'CPL_DEBUG' set to 'ON'
DEBUG:  GDAL None [0] GDAL: GDALOpen(c:/fdw_data/test_ogr_fdw.xlsx, this=0000000001A25B30) succeeds as XLSX.
DEBUG:  GDAL None [0] XLSX: Init(Sheet1)
DEBUG:  GDAL None [0] XLSX: Sheet1 has header line
DEBUG:  ogrReadColumnData matched 1 FID, 0 GEOM, 2 FIELDS out of 3 PGSQL COLUMNS
DEBUG:  ogrOperatorIsSupported got operator '='
DEBUG:  OGR SQL: (id = 'test_id1')
DEBUG:  GDAL None [0] GDAL: GDALClose(c:/fdw_data/test_ogr_fdw.xlsx, this=0000000001A25B30)
DEBUG:  GDAL config option 'OGR_XLSX_HEADERS' set to 'FORCE'
DEBUG:  GDAL config option 'CPL_DEBUG' set to 'ON'
DEBUG:  GDAL None [0] GDAL: GDALOpen(c:/fdw_data/test_ogr_fdw.xlsx, this=0000000001A25B30) succeeds as XLSX.
DEBUG:  GDAL None [0] XLSX: Init(Sheet1)
DEBUG:  GDAL None [0] XLSX: Sheet1 has header line
DEBUG:  ogrReadColumnData matched 1 FID, 0 GEOM, 2 FIELDS out of 3 PGSQL COLUMNS
DEBUG:  OGR FDW processed 1 rows from OGR
DEBUG:  GDAL None [0] Mem: 1 features read on layer 'Sheet1'.
DEBUG:  GDAL None [0] GDAL: GDALClose(c:/fdw_data/test_ogr_fdw.xlsx, this=0000000001A25B30)