qgis / QGIS-Enhancement-Proposals

QEPs (QGIS Enhancement Proposals) are used in the process of creating and discussing new enhancements for QGIS

FileGeodatabase raster read support (QGIS Grant 2020 program) #173

Open rouault opened 4 years ago

rouault commented 4 years ago

FileGeodatabase raster read support

Date 2020/05/12

Author Even Rouault (@rouault)

Contact even dot rouault at spatialys dot com

maintainer @rouault

Version GDAL 3.2 / QGIS 3.18

Summary

Outside of the open-source geospatial realm, ESRI is the dominant vendor in the geospatial industry. Regardless of how open-source friendly a particular organisation or user is, the reality is that they will need to interact with ESRI data formats on a regular basis. Many official government data portals provide spatial data only in ESRI formats, and some customers will only supply source data in these formats. It is critical to the success of open-source geospatial software that this software has stable and performant capabilities to read these proprietary ESRI formats and provide a means to convert this data into standard, open-source friendly formats.

Since all of QGIS' support for reading and writing disk based files is provided by the underlying GDAL library, it is natural that support for these proprietary ESRI formats be added or extended in GDAL itself. While QGIS will directly benefit from this work, investment in GDAL also directly benefits many other open-source geospatial tools, including GRASS GIS, PostGIS, R, rasterio / fiona, MapServer, etc.

This proposal covers extending the existing OpenFileGDB driver, which currently handles only vector layers, so that it can read raster layers stored within an ESRI geodatabase (gdb) file. There is currently NO way to read these raster files within QGIS (or with any other widely available open-source software), which entirely prevents users from either viewing these datasets or converting them to alternative standard raster formats. This presents a critical roadblock to open-source adoption by users who have previously used ESRI software and have existing raster GDB files, OR by users who can only access official raster data in these formats. Access to all data formats from open-source software is a priority one requirement, and the inability to read these raster formats from any tool presents a critical risk to the open-source geospatial community.

Proposed Solution

The GDB format is a proprietary, closed-source format, and no official specifications from the vendor have been released describing this file format. Accordingly, this work will consist of finalizing the reverse-engineering of the file format, implementing GDAL driver support for raster tables, and making the changes required to expose this format to QGIS users.

The work will build upon my own previous work (general reverse-engineering of FileGDB tables), further analysis done by James Ramm specifically on raster tables, and a proof-of-concept conversion utility done by Richard Barnes (https://github.com/r-barnes/ArcRasterRescue).

The developed driver will:

Any potentially needed improvements in QGIS browser, so that vector and raster layers of a FileGDB dataset are correctly listed, will also be done as part of this work.

Note: handling of compressed FileGDB datasets (.cdf), out of scope of the OpenFileGDB driver, will remain out of scope.
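
To make the intended use concrete, here is a minimal sketch of how a user might convert the rasters of a FileGDB to GeoTIFF once the driver can read them. It uses existing GDAL Python calls (gdal.OpenEx, Dataset.GetSubDatasets, gdal.Translate), but it assumes that rasters in a multi-raster geodatabase would be exposed as GDAL subdatasets, which this proposal does not specify. The function names and output file names are hypothetical.

```python
import os


def list_raster_sources(dataset, path):
    """Return the names GDAL could open as rasters for an opened dataset.

    Assumption: several rasters in one .gdb show up as subdatasets;
    a single raster is carried by the dataset itself.
    """
    subdatasets = dataset.GetSubDatasets()  # list of (name, description) tuples
    if subdatasets:
        return [name for name, _description in subdatasets]
    return [path] if dataset.RasterCount > 0 else []


def convert_rasters(gdb_path, out_dir):
    """Convert every raster found in gdb_path to GeoTIFF; return output paths."""
    from osgeo import gdal  # lazy import: requires the GDAL Python bindings

    gdal.UseExceptions()  # raise Python exceptions instead of returning None
    dataset = gdal.OpenEx(gdb_path, gdal.OF_RASTER)
    outputs = []
    for index, source in enumerate(list_raster_sources(dataset, gdb_path)):
        target = os.path.join(out_dir, f"raster_{index}.tif")  # hypothetical naming
        gdal.Translate(target, source, format="GTiff")
        outputs.append(target)
    return outputs
```

A call such as convert_rasters("data.gdb", "out") (both paths hypothetical) would then write one GeoTIFF per raster found.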

Backwards Compatibility

New functionality with no impact on backwards compatibility

Further Considerations/Improvements

Once this work is completed, it would open the possibility of adding raster write capabilities to the OpenFileGDB driver.

Issue Tracking ID(s)

None

Votes

(required)

nyalldawson commented 4 years ago

I'm very excited to see this progress -- Australian users are greatly affected by government portals which distribute rasters ONLY in this format.

Saijin-Naib commented 4 years ago

Echoing what I said in #172, this is yet another massive pain point removed when small/local government are considering a possible FOSS GIS stack.

I worked around it in my instance by using ArcGIS to convert all of our FGDB Rasters to COGs, but that was a very time and labor-intensive task that had a ton of ripple effects, impacting innumerable private maps and projects people have been accumulating over the years.

Having the ability to directly interface with existing FGDB Rasters means that a SLYR conversion of an ArcGIS Project is even closer to truly 1:1, with no pain in transition. It also means that we might be able to share these data in the FGDB container, if required by a contact/client. Huge win.

Saijin-Naib commented 4 years ago

I had been thinking about what our organization would want/need in GDAL FGDB Raster support, and more widely, what others would need for strong interoperability, and this is what I've come up with so far.

PRs welcome to edit it.

https://github.com/Saijin-Naib/QGIS-ESRI-Testfiles/tree/master/FileGeoDataBases/Raster

rouault commented 4 years ago

https://github.com/Saijin-Naib/QGIS-ESRI-Testfiles/tree/master/FileGeoDataBases/Raster

This is very helpful! Could you possibly also add the corresponding exports to GeoTIFF of the sample files? Note that this proposal only covers FileGeodatabase, not Personal GeoDatabase.

Saijin-Naib commented 4 years ago

Could you possibly also add the corresponding exports to GeoTIFF of the sample files ?

Sure. No idea why I forgot to include the "source" rasters.

This proposal only covers FileGeodatabase, not Personal GeoDatabase

That's okay, the above samples are FGDBs (File GeoDatabases), not personal MDBs. Unless I misunderstood you, and you're requesting MDBs as well, which I should be able to cook up.

rouault commented 4 years ago

Unless I misunderstood you, and you're requesting MDBs as well, which I should be able to cook up.

No, MDBs can be left aside. I meant that they are outside the intended scope of work of this QEP. (MDBs are a completely different beast, at least at the storage level, with their own share of pain.)

Saijin-Naib commented 4 years ago

Could you possibly also add the corresponding exports to GeoTIFF of the sample files ?

In the ReadMe.MD at that link, I link directly back to the sample data I pulled from the QGIS repo to use as the input GeoTIFFs to the File GeoDatabases.

Are you asking me to export out to a certain GeoTIFF profile the current data stored within the File GeoDatabases to see if they're different than the input data?

baswein commented 4 years ago

Forgive my intrusion, but I want to express my excitement for this proposal. I see the mention of Raster Attribute Tables. I would be doubly excited if this moved QGIS towards getting those: https://github.com/qgis/QGIS/issues/22427 I am also excited because the NRCS Soil lab seems to be moving towards a raster based gridded file format that is in a .gdb that is not compatible with QGIS. https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/home/?cid=nrcs142p2_053628

nyalldawson commented 4 years ago

@baswein

I am also excited because the NRCS Soil lab seems to be moving towards a raster based gridded file format that is in a .gdb that is not compatible with QGIS.

Thanks for the link -- this is a prime example of the motivation behind this proposal. Without having an open source driver for this format we are at increasing risk of being pushed "out of the market", simply because users NEED to access data which is only available in these formats.

rouault commented 4 years ago

@baswein

I see the mention of Raster Attribute Tables

Hmm, not explicitly in this proposal. I haven't investigated yet if/how they are encoded in FileGDB. Depending on the level of effort needed, this might be implemented as part of this work (in the GDAL driver; the QGIS side would be another task), but no promises.

baswein commented 4 years ago

Thanks Even, I completely understand. One step at a time! In the case of the soils data it would be very helpful, because the raster values are just a join key (MUKEY) to connect to the actual relevant data contained in the other tables. Here's hoping that they encoded it like the other tables.
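
The join described here can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the grid, the table, its field names and all values are invented, and a real workflow would read the cell values and the table through GDAL rather than from literals.

```python
# Hypothetical 3x3 grid of map-unit keys (MUKEY), as a raster band would store them.
mukey_grid = [
    [101, 101, 102],
    [101, 103, 102],
    [103, 103, 102],
]

# Hypothetical attribute table keyed by MUKEY (values invented for illustration).
soil_table = {
    101: {"name": "Loam", "drainage": "well"},
    102: {"name": "Clay", "drainage": "poor"},
    103: {"name": "Sand", "drainage": "excessive"},
}


def join_attribute(grid, table, field, nodata=None):
    """Replace each cell's key with the chosen field from the table.

    Cells whose key is missing from the table get the nodata value.
    """
    return [
        [table[key][field] if key in table else nodata for key in row]
        for row in grid
    ]


names = join_attribute(mukey_grid, soil_table, "name")
for row in names:
    print(row)
```

The raster itself stays small (one integer per cell), while any number of attributes can be looked up through the key.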

Saijin-Naib commented 4 years ago

@rouault , I've added the GeoTIFFs used as inputs for the various Raster FileGeoDatabases into the repository. Is this what you needed?

rouault commented 4 years ago

I've added the GeoTIFFs used as inputs for the various Raster FileGeoDatabases into the repository. Is this what you needed?

At first sight yes, thanks. I may have more precise requests once the work starts.

roya0045 commented 3 years ago

@rouault if this is applicable for the grant, do you intend to suggest it for the 2021 round?

Saijin-Naib commented 3 years ago

I'm still available to provide any and all test data that I can to assist.

roya0045 commented 3 years ago

@Saijin-Naib there is already a standalone version; I would assume that Even could leverage the work done there rather than use raw data at this point.

rouault commented 3 years ago

@rouault if this is applicable for the grant, do you intend to suggest it for the 2021 round?

It is not applicable for the same reason as 2020: "This year, we will not accept proposals for the development of new features."

@roya0045 ther is already a standalone version,

What are you referring to? ArcRasterRescue? This is a starting point indeed, but it may not cover every case. So input data will be of interest if this work is done.

roya0045 commented 3 years ago

@rouault Thanks Even, yes I was referring to ArcRasterRescue. I'm not sure how many cases there are with raster but I would guess that the work done should cover a good deal. The remainder could be integrated when issues arise.

The author seemed to want to transfer his work to GDAL but was not sure how to implement the driver in GDAL style. Maybe a crowdfund could be done to cover the transition cost.