This tool provides a raster of a Digital Elevation Model (DEM) over an area of interest utilizing global or continental, publicly available tile sets such as the Global Copernicus Digital Elevation Model at 30 meter resolution. See the Datasets section below for all the tiles supported and their shortnames. This tool also performs some standard transformations for processing such as:
Area
tag) or center of the pixel (Point
tag).We rely on the GIS formats from rasterio
. The API can be summarized as
from dem_stitcher import stitch_dem
# as xmin, ymin, xmax, ymax in epsg:4326
bounds = [-119.085, 33.402, -118.984, 35.435]
X, p = stitch_dem(bounds,
dem_name='glo_30', # Global Copernicus 30 meter resolution DEM
dst_ellipsoidal_height=False,
dst_area_or_point='Point')
# X is an m x n numpy array
# p is a dictionary (or a rasterio profile) including relevant GIS metadata; CRS is epsg:4326
Then, to save the DEM raster to disk:
import rasterio
with rasterio.open('dem.tif', 'w', **p) as ds:
ds.write(X, 1)
ds.update_tags(AREA_OR_POINT='Point')
The rasters are returned in the global lat/lon projection epsg:4326
and the API assumes that bounds are supplied in this format.
In order to easily manage dependencies, we recommend using dedicated project environments via Anaconda/Miniconda. or Python virtual environments.
dem_stitcher
can be installed into a conda environment with
conda install -c conda-forge dem_stitcher
or into a virtual environment with
python -m pip install dem_stitcher
Currently, python 3.9+ is supported.
Although the thrust of using this package is for staging DEMs for InSAR (particularly ISCE2), testing and maintaining suitable environments to use with InSAR processors is beyond the scope of what we are attempting to accomplish here. We provide an example notebook here that demonstrates how to stage a DEM for ISCE2, which requires additional packages than required for the package on its own. For the notebook, we use the environment found in environment.yml
of the Dockerized TopsApp repository, used to generate interferograms (GUNWs) in the cloud.
The creation metadata unrelated to georeferencing (e.g. the compress
key or various other options here) returned in the dictionary profile
from the stitch_dem
API is copied directly from the source tiles being used if they are GeoTiff formatted (such as glo_30
) else the creation metadata are copied from the GeoTiff Default Profile in rasterio
(see here excluding nodata
and dtype
). Such metadata creation options are beyond the scope of this library.
The accessing of NASADEM and SRTM require earthdata login credentials to be put into the ~/.netrc
file. If these are not present, the stitcher will
fail with ValueError
asking you to update the ~/.netrc
. The appropriate entry appears as:
machine urs.earthdata.nasa.gov
login <username>
password <password>
We have notebooks to demonstrate common usage:
conda-forge
We also demonstrate how the tiles used to organize the urls for the DEMs were generated for this tool were generated in this notebook.
The DEMs that are currently supported are:
In [1]: from dem_stitcher.datasets import DATASETS; DATASETS
Out[1]: ['srtm_v3', 'nasadem', 'glo_90_missing', 'glo_30', '3dep', 'glo_90']
The shortnames aboves are the strings required to use stitch_dem
. Below, we expound upon these DEM shortnames and link to their respective data repositories.
glo_30
/glo_90
: Copernicus GLO-30/GLO-90 DEM. The tile sets are the 30 and 90 meter resolution, respectively [link].3dep
: 3Dep 1/3 arc-second over North America - we are storing the ~10 meter resolution dataset. There are many more as noted here. The files for these DEMs are heresrtm_v3
: SRTM v3 [link]nasadem
: Nasadem [link]glo_90_missing
: these are tiles that are in glo_90
but not in glo_30
. They are over the countries Armenia and Azerbaijan. Used internally to help fill in gaps in coverage of glo_30
.
All the tiles are given in lat/lon CRS (i.e. epsg:4326
for global tiles or epsg:4269
for USGS tiles in North America). A notable omission to the tile sets is the Artic DEM here, which is suitable for DEMs merged at the north pole of the globe due to lat/lon distortion.
If there are issues with obtaining dem tiles from urls embedded within the geojson tiles (e.g. a 404
error as here), please see the Development section below and/or open an issue ticket.
Wherever possible, we do not resample the original DEMs unless specified by the user to do so. When extents are specified, we obtain the the minimum pixel extent within the merged tile DEMs that contain that extent. Any required resampling (e.g. updating the CRS or updating the resolution because the tiles have non-square resolution at high latitudes) is done after these required translations. We importantly note that order in which these transformations are done is crucial, i.e. first translating the DEM (to either pixel- or area-center coordinates) and then resampling is different than first resampling and then translation (as affine transformations are not commutative). Here are some notes/discussions:
epsg:4326
. Most DEMs are already in this CRS except the USGS DEMs over North America, which are in epsg:4269
, whose xy
projection is also lon/lat
but has different vertical data. For our purposes, these two CRSs are almost identical. The nuanced differences between these CRS's is noted here.Point
and Area
tags in gdal
, respectively, and seen as {'AREA_OR_POINT: 'Point'}
. Note that tying a pixel to the upper-left cortner (i.e. Area
tag) is the default pixel reference for gdal
as indicated here. Some helpful resources in reference to DEMs about this book-keeping are below.
{'AREA_OR_POINT: 'Point'}
.{'AREA_OR_POINT: 'Area'}
.float32
and have nodata np.nan
. Although this can increase data size of certain rasters (SRTM is distributed in which pixels are recorded as integers), this ensures (a) easy comparison across DEMs and (b) no side-effects of the stitcher due to dtypes and/or nodata values. There is one caveat: the user can ensure that DEM nodata pixels are set to 0
using merge_nodata_value
in stitch_dem
, in which case 0
is filled in where np.nan
was. We note specifying this "fill value" via merge_nodata_value
does not change the nodata value of output DEM dataset (i.e. nodata
in the rasterio profile will remain np.nan
). When transforming to ellipsoidal heights and setting 0
as merge_nodata_value
, the geoid values are filled in the DEMs nodata areas; if the geoid has nodata in the bounding box, this will be the source of subsequent no data. For reference, this datatype and nodata is specified in merge_tile_datasets
in merge.py
. Other nodata values can be specified outside the stitcher for the application of choice (e.g. ISCE2 requires nodata to be filled as 0
).There are some notebooks that illustrate how tiles are merged by comparing the output of our stitcher with the original tiles.
As a performance note, when merging DEM tiles, we merge the needed tiles within the extent in memory and this process has an associated overhead. An alternative approach would be to download the tiles to disk and use virtual warping or utilize VRTs more effectively. Ultimately, the accuracy of the final DEM is our prime focus and these minor performance tradeoffs are sidelined.
We assume that the supplied bounds overlap the standard lat/lon CRS grid i.e. longitudes between -/+ 180 longitude and are within -/+ 90 latitude. If there is a single dateline crossing by the supplied bounds, then the tiles are wrapped the dateline and individually translated to a particular hemisphere dicated by the bounds provided to generate a continuous raster over the area provided. We assume a maximum of one dateline crossing in the bounds you specified (if you have multiple dateline crossings, then stitch_dem
will run out of memory). Similar wrapping tiles around the North and South poles (i.e. at -/+ 90 latitude) is not supported (a different CRS is what's required) and an exception will be raised.
This is almost identical to normal installation:
git clone https://github.com/ACCESS-Cloud-Based-InSAR/dem-stitcher.git
conda env update --file environment.yml
(or use mamba
to speed the install up)python -m pip install -e .
If urls or readers need to be updated (they consistently do) or you want to add a new global or large DEM, then there are two points of contact:
The former is the more likely. When re-generating tiles, make sure to run all tests including integration tests (i.e. pytest tests
). For example, if regenerating glo
tiles, glo-30
requires both resolution parameters (30 meters and 90 meters) and an additional notebook for filling in missing 30 meter tiles. These should be clearly spelled out in the notebook linked above.
For the test suite:
pytest
via conda-forge
pytest tests
There are two category of tests: unit tests and integration tests. The former can be run using pytest tests -m 'not integration'
and similarly the latter with pytest tests -m 'integration'
. Our unit tests are those marked without the integration
tag (via pytest
) that use synthetic data or data within the library to verify correct outputs of the library (e.g. that a small input raster is modified correctly). Integration tests ensure the dem-stitcher
API works as expected, downloading the DEM tiles from their respective servers to ensure the stitcher runs to completion - the integration tests only make very basic checks to ensure the format of the ouptut data is correct (e.g. checking the output raster has a particular shape or that nodata is np.nan
). Our integration tests also include tests that run the notebooks that serve as documentation via papermill
(such tests have an additional tag notebook
). Integration tests will require the ~/.netrc
setup above and working internet. Our testing workflow via Github actions currently runs the entire test suite except those tagged with notebook
, as these tests take considerably longer to run.
We welcome contributions to this open-source package. To do so:
We use flake8
and associated linting packages to ensure some basic code quality (see the environment.yml
). These will be checked for each commit in a PR. Try to write tests wherever possible.
This tool was developed to support cloud SAR processing using ISCE2 and various research projects at JPL. The early work of this repository was done by Charlie Marshak, David Bekaert, Michael Denbina, and Marc Simard. Since the utilization of this package for GUNW generation (see this repo), a subset of the ACCESS team, including Joseph (Joe) H. Kennedy, Simran Sangha, Grace Bato, Andrew Johnston, and Charlie Marshak, have improved this repository greatly. In particular, Joe Kennedy has lead the inclusion/development of actions, tests, packaging, distribution (including PyPI and conda-forge
) and all the things to make this package more reliable, accessible, readable, etc. Simran Sangha has helped make sure output rasters are compatible with ISCE2 and other important bug-fixes.