tl;dr: 100% of TNRIS Lidar DEM tiles with GDAL-incompatible metadata were successfully "corrected" so these tiles could be included in scalable pre-processing with GDAL. With the corrections, 100% of the tiles successfully ran against common GDAL routines.
To prepare source terrain imagery tiles for use at scale, terrain_aggregator gathers all desired terrain tiles into a central PostgreSQL database and records basic but necessary metadata from each tile. Current TNRIS best practices require that DEM tile metadata be FGDC-compliant, but they do not require this metadata to be produced in a way that supports essential DEM processing libraries such as GDAL. At least 10% of TNRIS's ~350,000 DEM tiles cannot be used with GDAL by default, usually for one of three reasons:
GDAL-incompatible # 1 because GDAL cannot detect included projection information (impacts ~10% or ~35,000 tiles)
GDAL-incompatible # 2 because the original tiles had incorrectly stated projection information (impacts ~1% or ~4,000 tiles)
GDAL-incompatible # 3 because the original tile is corrupted (impacts ~0.001% or exactly 3 tiles)
GDAL-incompatible # 1 usually occurs in newer TNRIS Lidar DEM tilesets, because more highly detailed projection information is provided, recording the provenance of the projection using a BOUNDCRS WKT2 key. Common GDAL operations do not yet support the BOUNDCRS WKT2 key, and so these tiles cannot be processed at scale using GDAL except by explicitly naming the correct projection code. terrain_aggregator stores the "corrected" projection code for each of these tiles as an attribute in its PostgreSQL database, so that bulk processing can include them.
[ ] This is a candidate for automation when looking towards including future TNRIS Lidar DEM tilesets, replicating this work, or expanding to other states' Lidar DEM tilesets.
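As a rough illustration of "explicitly naming the correct projection code", the sketch below decides whether a tile needs an override and builds a `gdalwarp` argument list that passes the stored code through `-s_srs`. The function names, the way the WKT and corrected code are passed in, and the example codes are assumptions made for illustration; they are not taken from terrain_aggregator itself.

```python
from typing import List, Optional


def needs_explicit_srs(wkt: Optional[str]) -> bool:
    """Return True when the recorded WKT is missing or wrapped in BOUNDCRS."""
    if wkt is None or not wkt.strip():
        return True  # no projection information at all
    # A WKT2 bound CRS starts with the BOUNDCRS keyword.
    return wkt.lstrip().upper().startswith("BOUNDCRS")


def gdalwarp_command(src: str, dst: str, wkt: Optional[str],
                     corrected_srs: Optional[str], target_srs: str) -> List[str]:
    """Build a gdalwarp argument list, overriding the source SRS when needed."""
    cmd = ["gdalwarp"]
    if needs_explicit_srs(wkt):
        if corrected_srs is None:
            # Nothing recorded yet: this tile still needs manual intervention.
            raise ValueError(f"{src}: no corrected projection code recorded")
        cmd += ["-s_srs", corrected_srs]  # explicitly name the source projection
    cmd += ["-t_srs", target_srs, src, dst]
    return cmd
```

For example, `gdalwarp_command("tile.tif", "out.tif", "BOUNDCRS[...]", "EPSG:26914", "EPSG:4326")` yields a command that names the source projection explicitly, while a tile whose WKT starts with `PROJCRS` is passed through without an override. The resulting list could be handed to `subprocess.run`.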
A handful of tiles are impacted by GDAL-incompatible # 1 because no projection information has been included whatsoever. Currently, these tiles or tilesets containing these tiles require manual intervention in order to determine and assign the correct projection code.
[ ] Since the number of likely projections is small, a guess-and-check projection finding routine could be implemented to simplify this.
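One way such a guess-and-check routine might look is sketched below: each candidate projection is used to convert the tile's centre to longitude/latitude, and only candidates that land inside the area the tile is expected to cover are kept. The function name, the `to_lonlat` callable, and the bounding-box convention are all hypothetical; the actual reprojection is deliberately left pluggable.

```python
from typing import Callable, Iterable, List, Tuple

# (min_lon, min_lat, max_lon, max_lat) in degrees
BBox = Tuple[float, float, float, float]
# Hypothetical signature: (candidate_code, x, y) -> (lon, lat)
ToLonLat = Callable[[str, float, float], Tuple[float, float]]


def plausible_projections(center_xy: Tuple[float, float],
                          candidates: Iterable[str],
                          expected: BBox,
                          to_lonlat: ToLonLat) -> List[str]:
    """Return the candidate codes that place the tile centre inside `expected`."""
    x, y = center_xy
    min_lon, min_lat, max_lon, max_lat = expected
    matches = []
    for code in candidates:
        try:
            lon, lat = to_lonlat(code, x, y)
        except Exception:
            continue  # this candidate cannot represent the coordinate
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
            matches.append(code)
    return matches
```

Assuming pyproj is available, `to_lonlat` could wrap `Transformer.from_crs(code, "EPSG:4326", always_xy=True).transform(x, y)`. A single match could be assigned automatically, while zero or several matches would still be flagged for manual review.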
GDAL-incompatible # 2 usually occurs in older tiles and tilesets. In the vast majority of these cases, the tiles are labelled with a UTM zone adjacent to the one they actually represent. Currently these tiles require manual intervention to correct their projections.
[ ] However, this could also be automated for the vast majority of the tiles, by applying a guess-and-check projection finding routine.
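Because the mislabelled tiles are usually off by one UTM zone, the candidate list for such a check can be very short. The sketch below derives the neighbouring-zone codes from a labelled EPSG code, relying on the EPSG numbering for northern-hemisphere UTM zones (NAD83: 26900 + zone, WGS 84: 32600 + zone). Which datum the affected tiles actually use is not stated here, so both families are included as an assumption, and the function name is hypothetical.

```python
from typing import List

# EPSG numbering for northern-hemisphere UTM zones: base + zone number.
_UTM_NORTH_BASES = {
    26900: range(1, 24),  # NAD83 / UTM zone 1N..23N
    32600: range(1, 61),  # WGS 84 / UTM zone 1N..60N
}


def adjacent_utm_candidates(epsg: int) -> List[int]:
    """Return EPSG codes of the UTM zones directly west and east of `epsg`.

    Zones at the edge of a family have one neighbour only; wrap-around
    between zones 60 and 1 is not handled.
    """
    for base, zones in _UTM_NORTH_BASES.items():
        zone = epsg - base
        if zone in zones:
            return [base + z for z in (zone - 1, zone + 1) if z in zones]
    raise ValueError(f"EPSG:{epsg} is not a recognised northern UTM zone code")
```

For a tile labelled EPSG:26914 this returns the codes for zones 13N and 15N. Each candidate could then be tested by reprojecting the tile extent and comparing it against where the tile is expected to lie.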
GDAL-incompatible # 3 refers to a few tiles with integer pixel data type and palette color interpretation. Some common GDAL routines will break if either the data type or the color interpretation is not consistent throughout. Reference to these tiles is maintained in the terrain_aggregator PostgreSQL DB, but these tiles are dropped from any further processing.
Beyond the fact that these tiles have highly suspect elevation data, we can safely drop them:
because their paletting guarantees at least 1 m of vertical inaccuracy, and
because there exist alternate statewide seamless terrain datasets with at least 1 m of vertical inaccuracy,
so these tiles can be safely disregarded in favor of the alternate terrain data available.
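The drop rule described above can be expressed as a simple filter over the metadata recorded for each tile. This is a minimal sketch: the `TileRecord` fields and function names are hypothetical, and the type and color-interpretation strings assume the names GDAL reports (for example `Byte`, `Int16`, `Palette`).

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Integer pixel types, using GDAL's data type names.
INTEGER_TYPES = {"Byte", "Int8", "UInt16", "Int16",
                 "UInt32", "Int32", "UInt64", "Int64"}


@dataclass
class TileRecord:
    """Hypothetical subset of the metadata recorded per tile."""
    path: str
    data_type: str     # e.g. "Float32" or "Byte"
    color_interp: str  # e.g. "Gray" or "Palette"


def should_drop(tile: TileRecord) -> bool:
    """Flag tiles with an integer pixel type and palette color interpretation."""
    return tile.data_type in INTEGER_TYPES and tile.color_interp == "Palette"


def partition_tiles(tiles: Iterable[TileRecord]) -> Tuple[List[TileRecord], List[TileRecord]]:
    """Split tiles into (kept, dropped); dropped tiles remain referenced."""
    kept: List[TileRecord] = []
    dropped: List[TileRecord] = []
    for tile in tiles:
        (dropped if should_drop(tile) else kept).append(tile)
    return kept, dropped
```

Returning the dropped tiles rather than discarding them mirrors the approach of keeping a reference to them in the database while excluding them from further processing.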
TNRIS high resolution terrain database details
terrain_aggregator provides a back-to-front approach to aggregating and serving source Lidar DEM tiles from a high-performance computing environment.
Context
Processing terrain data at scale requires relying on GDAL, TauDEM, and GeoFlood tools.
DB preparation