pramsey / pgsql-ogr-fdw

PostgreSQL foreign data wrapper for OGR
MIT License
237 stars 34 forks source link

unable to connect to data source when querying information_schema.columns is_updateable column #163

Closed robe2 closed 5 years ago

robe2 commented 5 years ago

This is not an issue with all GDAL drivers, but I ran into the problem with excel spreadsheets and also see the Querying.gdb (in your Querying.zip file) has the same issue.

Steps to reproduce:

CREATE SERVER svr_gdb
    FOREIGN DATA WRAPPER ogr_fdw
    OPTIONS (datasource '/projects/sources/pgsql_ogr_fdw/data/Querying.gdb', format 'OpenFileGDB');

IMPORT FOREIGN SCHEMA ogr_all FROM SERVER svr_gdb INTO public;;

SELECT * FROM information_schema.columns WHERE table_schema = 'public';

Yields error:

unable to connect to data source "/projects/sources/pgsql_ogr_fdw/data/Querying.gdb"

However if you do:

SELECT column_name FROM information_schema.columns WHERE table_schema = 'public';

That works fine. I isolated the issue to the last column in informaton_schema.columns called is_updateable, which calls the function:

pg_catalog.pg_column_is_updatable(regclass, smallint, boolean)

This issue doesn't seem to occur for all drivers. For example when I use the ESRI Shapefile one, I don't get this issue.

robe2 commented 5 years ago

Forgot to mention I tested this on GDAL 2.4.0 and GDAL 2.2.4, latest release version of ogr_fdw.

robe2 commented 5 years ago

Okay I tried to work around the problem by setting updateable to false

DROP SERVER svr_gdb CASCADE;
CREATE SERVER svr_gdb
    FOREIGN DATA WRAPPER ogr_fdw
    OPTIONS (datasource '/projects/sources/pgsql_ogr_fdw/data/Querying.gdb', format 'OpenFileGDB', updateable 'false');

IMPORT FOREIGN SCHEMA ogr_all FROM SERVER svr_gdb INTO public;

SELECT  * FROM information_schema.columns WHERE table_schema = 'public';

Gives error:

ERROR: updates are not allowed on foreign server 'svr_gdb' HINT: ALTER FOREIGN SERVER svr_gdb OPTIONS (SET updatable 'true') SQL state: HV000

I forgot to mention I am testing on PostgreSQL 11.2 if that makes a difference.

robe2 commented 5 years ago

Other interesting oddities, if I try to set updateable to true, I immediately get the error:

DROP SERVER svr_gdb CASCADE;
CREATE SERVER svr_gdb
    FOREIGN DATA WRAPPER ogr_fdw
    OPTIONS (datasource '/projects/sources/pgsql_ogr_fdw/data/Querying.gdb', format 'OpenFileGDB', updateable 'true');

ERROR: unable to connect to data source "/projects/sources/pgsql_ogr_fdw/data/Querying.gdb"

robe2 commented 5 years ago

Here is the offending query reduced to it's most basic:

CREATE SERVER svr_gdb
    FOREIGN DATA WRAPPER ogr_fdw
    OPTIONS (datasource '/projects/sources/pgsql_ogr_fdw/data/Querying.gdb', format 'OpenFileGDB', updateable 'false');

IMPORT FOREIGN SCHEMA ogr_all FROM SERVER svr_gdb INTO public;;

 SELECT pg_catalog.pg_column_is_updatable('public.cities'::regclass, 1::smallint, false)  AS is_updatable;

If updateable is set to false, get:

ERROR: updates are not allowed on foreign server 'svr_gdb' HINT: ALTER FOREIGN SERVER svr_gdb OPTIONS (SET updatable 'true') SQL state: HV000

As mentioned with the updateable not set or set to true, it gives the:

unable to connect to data source "/projects/sources/pgsql_ogr_fdw/data/Querying.gdb"

bbarany commented 5 years ago

A similar problem:

This fairly standard metadata query is used by PostgREST at startup.

SELECT
    n.nspname AS table_schema,
    c.relname AS table_name,
    NULL AS table_description,
    c.relkind = 'r' OR (c.relkind IN ('v','f'))
        AND (pg_relation_is_updatable(c.oid::regclass, FALSE) & 8) = 8
    AS insertable
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind IN ('v','r','m','f')
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
GROUP BY table_schema, table_name, insertable
ORDER BY table_schema, table_name 

It throws an error, so PostgREST fails to start if any ogr_fdw foreign table exists in the current database. Setting the tables to upgradeable 'false' only throws a different error.

The query works without the _pg_relation_isupdatable() function call.

pramsey commented 5 years ago

I'm feeling a little painted into the corner by OGR here. @rouault maybe can help...

If you open a read/write data source with GDAL_OF_UPDATE you can use OGR_L_TestCapability with OLCRandomWrite and friends to see if it is writeable. If you open that same data source with GDAL_OF_READONLY then calls to OGR_L_TestCapability will return false for writing. Which kind of makes sense.

Except, if you open a read-only data source, like an "OpenFileGDB" source, with GDAL_OF_UPDATE then the open will fail, and that's the game there. In particular, it'll call the error handler with a CE_Failure or CE_Fatal state, which in my error handler causes a PgSQL ERROR to raise, and all processing to stop. So I cannot even do some ugly fall-back logic to back off a failed attempt to open a read-only driver with the GDAL_OF_UPDATE flag set.

Failing in this way is probably OK for operations like INSERT/UPDATE/DELETE: they aren't going to work to where precisely they fail in the flow of things isn't so important. But unfortunately the test of "is this thing writeable at all" which is happening here requires a conditional attempt to open with GDAL_OF_UPDATE.

Failing some magic solution from @rouault that I'm not seeing this will involve some kind of major ugliness and refactoring of how I'm opening OGR connections, which as you can imagine is rather central to this whole extension. :(

rouault commented 5 years ago

So I cannot even do some ugly fall-back logic to back off a failed attempt to open a read-only driver with the GDAL_OF_UPDATE flag set.

If you open in update mode a dataset that can only be supported in read-only mode, GDALOpenEx() will silently return a NULL pointer and not emit an error. You can just retry in read-only mode then.

Demo (in Python for conveniency, but behaviour will be identical with C API

$ OGR_SKIP=FileGDB python
>>> from osgeo import gdal
>>> ds = gdal.OpenEx("poly.gdb", gdal.OF_VECTOR | gdal.OF_UPDATE)
>>> print(ds)
None
>>> ds = gdal.OpenEx("poly.gdb", gdal.OF_VECTOR)
>>> print(ds)
<osgeo.gdal.Dataset; proxy of <Swig Object of type 'GDALDatasetShadow *' at 0x7f52b1878c90> >
pramsey commented 5 years ago

I'm seeing a trip to the error handler when I make that call.

Target 0: (postgres) stopped.
(lldb) p err_no
(int) $0 = 4
(lldb) p eErrClass
(CPLErr) $1 = CE_Failure
(lldb) 

So a CE_Failure is not a stopping condition?

pramsey commented 5 years ago
  * frame #0: 0x000000010da925b2 ogr_fdw.so`ogrErrorHandler(eErrClass=CE_Failure, err_no=4, msg="/tmp/Querying.gdb: Is a directory") at ogr_fdw.c:191:42
    frame #1: 0x000000010edf44c0 libgdal.20.dylib`ApplyErrorHandler(psCtx=0x00007f9f41510730, eErrClass=CE_Failure, err_no=4, pszMessage="/tmp/Querying.gdb: Is a directory") at cpl_error.cpp:259:17
    frame #2: 0x000000010edf3d1f libgdal.20.dylib`::CPLErrorV(eErrClass=CE_Failure, err_no=4, fmt="%s", args=0x00007ffee3256020) at cpl_error.cpp:464:5
    frame #3: 0x000000010edf35a4 libgdal.20.dylib`::CPLError(eErrClass=CE_Failure, err_no=4, fmt="%s") at cpl_error.cpp:316:5
    frame #4: 0x000000010ee41bb8 libgdal.20.dylib`::VSIToCPLError(eErrClass=CE_Failure, eDefaultErrorNo=4) at cpl_vsi_error.cpp:280:13
    frame #5: 0x000000010e5949b4 libgdal.20.dylib`::GDALOpenEx(pszFilename="/tmp/Querying.gdb", nOpenFlags=69, papszAllowedDrivers=0x00007f9f41601600, papszOpenOptions=0x0000000000000000, papszSiblingFiles=0x0000000000000000) at gdaldataset.cpp:3445:13
    frame #6: 0x000000010da946c3 ogr_fdw.so`ogrGetDataSource(source="/tmp/Querying.gdb", driver="OpenFileGDB", updateable=true, config_options=0x0000000000000000, open_options=0x0000000000000000) at ogr_fdw.c:354:13
    frame #7: 0x000000010da95064 ogr_fdw.so`ogrGetConnectionFromServer(foreignserverid=154354, updateable=true) at ogr_fdw.c:500:11
    frame #8: 0x000000010da94a70 ogr_fdw.so`ogrGetConnectionFromTable(foreigntableid=155351, updateable=true) at ogr_fdw.c:525:8
    frame #9: 0x000000010da9380b ogr_fdw.so`ogrIsForeignRelUpdatable(rel=0x000000010da03d88) at ogr_fdw.c:2542:8
rouault commented 5 years ago

Hum, nOpenFlags=69 = 1 + 4 + 64 = GDAL_OF_UPDATE + GDAL_OF_VECTOR + GDAL_OF_VERBOSE_ERROR The CPLError(CE_Failure, ...) is emitted because of GDAL_OF_VERBOSE_ERROR (see https://github.com/OSGeo/gdal/blob/master/gdal/gcore/gdaldataset.cpp#L3444), but I can't see where in the codebase( https://github.com/pramsey/pgsql-ogr-fdw/search?q=GDAL_OF_VERBOSE_ERROR&unscoped_q=GDAL_OF_VERBOSE_ERROR ) it would be used. Maybe you added it in your working tree ?

So a CE_Failure is not a stopping condition?

Depends... sometimes they can be recovered. It might happen in very rare cases (partially corrupted datasets), that GDALOpenEx() returns a non-NULL handle, but that along the road, deep in the guts of drivers, a CPLError(CE_Failure, ...) was emitted. What you could do is wrap the call to GDALOpenEx() with CPLPushErorrHandler(CPLQuietErrorHandler) / CPLPopErrorHandler() to avoid your global error handler to catch such errors. But for the plain attempt to open in update mode, when the driver cannot, normally just using GDAL_OF_VECTOR | GDAL_OF_UPDATE should not emit anything.

(side note: CE_Fatal crashes the process, and I think we have eliminated all of them from the code base, years ago)

pramsey commented 5 years ago

OK, yes, I had put in GDAL_OF_VERBOSE_ERROR to my working tree early on to try and get some extra visibility about what was going on. So, I can see that there is at least a way out of this mess, it'll still be ugly but it's doable.

bbarany commented 5 years ago

Should setting the foreign server as updateable = 'false' avoid the situation where querying updateability throws an error? Because that's what happens in my case. I imported a foreign schema from a WFS server, obviously read only. Metadata queries fail without updateable being set, and both with it set to true and false.

pramsey commented 5 years ago

Maybe it should but as it stands it doesn't, and fixing that is just one case in the overall schmozzle ;)

pramsey commented 5 years ago

I haven't tested all the potential combinations of readonly / readable / updateable yet, but master as of 5bad13a is closer to what you want, I think.

pramsey commented 5 years ago

Seems to work for a few use cases, and for regression.

bbarany commented 5 years ago

Both example queries mentioned here work if the foreign server is set to updateable = 'false'.

Thank you!

lixfelix123 commented 4 years ago

Did u solve this sir? can you help me with this code sir? thanks..

CREATE SERVER access_server FOREIGN DATA WRAPPER ogr_fdw OPTIONS ( datasource 'C:\dbPIB.mdb', format 'ODBC' );

ERROR: GDAL AppDefined [1] GDALOpen() called on C:\dbPIB.mdb recursively SQL state: XX000

I'm really confused