pramsey / pgsql-ogr-fdw

PostgreSQL foreign data wrapper for OGR
MIT License
237 stars 34 forks source link

Functionality to transform geometry to specified EPSG code (-t_srs in ogr2ogr) #131

Closed apjoseph closed 7 years ago

apjoseph commented 7 years ago

Recently, it was made aware to me that it is possible to transform geometries using GDAL without knowing the input EPSG code that corresponds to a .prj file, but by instead using the text of the projection itself (which should have occurred to me earlier). Given that GDAL's EPSG detection mechanism is broken beyond any hope of repair, and that having data properly projected is fundamental to basically any usage in Postgis, I am wondering if it would be possible to implement functionality similar to -t_srs in ogr2ogr with syntax like:

IMPORT FOREIGN SCHEMA ogr_all FROM server INTO data OPTIONS (t_srs 'epsg:4326')

and which is essentially equivalent to:

ogr2ogr -f postgis -s_srs myshp.prj -t_srs epsg:4326...

Such functionality would allow users of ogr_fdw to avoid needing to copy and paste .prj text files into the prj2epsg website -or designing a script that accomplishes the same thing -for every feature class, and then altering each table with the correct srid. If you are loading a directory of several heterogeneously projected shapefiles, such functionality would be extremely convenient and save hours of life (especially if you guess that they all have the same projection as the first file you see and are wrong)!

I asked a question on gis.stackexchange about whether there was some other functionality that could be leveraged to avoid SRID Hell with ogr_fdw (and with gdal in general), but as far as I can tell there is no solution that doesn't involve creating a re projected dataset for each file -unless a change is made to ogr_fdw itself.

pramsey commented 7 years ago

Rather than aping ogr2ogr's syntax (which is abysmal) an option like (srid 4326) would make sense, and (probably?) would be a simple switch to avoid srid autodetection in favor of a forced srid. Involves another FDW option, of course, which is yuck.

apjoseph commented 7 years ago

That makes sense to me -I'm not going to defend ogr2ogr syntax. Alternatively to avoid adding the option, perhaps simplify forcing 4326 on all shapefiles (and other similarly problematic drivers) by default with no fdw option could work. In my experience GDAL gets the SRID wrong in > 90% of shapefiles -so you can't really trust you have the correct source SRID without manually checking it.

pramsey commented 7 years ago

Actually, no, I'm going to close this one out: you can force an SRID on your data by declaring the type of the geometry column with appropriate typemod: geometry(point, 4326) for example. That will ensure that all geometry coming out of OGR is flagged as 4326 before you get to see it, so you can apply ST_Transform to it on the PostGIS side.

apjoseph commented 7 years ago

I understand. The issue came up for me because I was trying to circumvent OGR's inability to take an arbitrary standard WKT projection string and return correct the EPSG code. This can be an issue when you are eager to start working with a new directory of shapefiles -and you forget (or are unaware) that OGR generates the wrong SRID with IMPORT FOREIGN SCHEMA by default unless you specifically overwrite it. When this happens, all of the functions of Postgis still work -and you usually don't realize something went wrong until you see that your final calcs/table is completely incorrect. I suppose it would be better to file a bug with GDAL/OGR -since broken wkt-EPSG conversion is a fairly major problem for programatically importing/transforming data -but there seems to be complications with regard to fixing it.

I solved my issue in a somewhat hacky way by just writing a script leveraging geotools (which DOES manage to get the correct EPSG code for a WKT string) the GDAL Java Bindings. Using those libraries together, you can introspect a datasource and auto generate the correct foreign table definition in the database via JDBC.