Closed cmarshak closed 2 years ago
Check out this pull request on
See visual diffs & provide feedback on Jupyter Notebooks.
Powered by ReviewNB
I am going to write one more test for the geoid reader.
@asjohnston-asf - when it makes sense, can you help me understand why one of my tests is failing here (i.e. tests/test_rio_tools.py
? I thought it might be because data was correctly uploaded to the repo, then I successfully tried the tests by cloning it on mac and then on a linux server. Honestly, not totally sure how to debug. It looks like the first array is all 0's but it shouldn't be...
The data is here: https://github.com/ACCESS-Cloud-Based-InSAR/dem-stitcher/tree/reproject-and-test/tests/data/rio_tools/update_resolution
Here is one failed action. https://github.com/ACCESS-Cloud-Based-InSAR/dem-stitcher/runs/6564288126?check_suite_focus=true
I temporarily marked it as an integration test so it's ignored.
Maybe it's an action issue?
This is way too large of a pull request for a time-strapped team to review carefully (40 files changed) and so I understand there is going to be some element of trust. I am putting some of the higher level elements of the review.
Also, as apart of this pull request - is the related one here in DockerizedTopsApp. In the pull request, note we pull this branch in the environment file. I ran the code on 4 different areas (their selection was motivated by previous issue tickets) and the resulting GUNWs can be downloaded there. Ultimatley, the 90 meter GUNW doesn't show much of a difference using this new dem.
Most important for @sssangha and @gracebato to take a look at the datasets and just confirm they look reasonable and I am not missing something. The new Dem-stitcher does not extend the origin of DEM tiles if they are not available (this is a feature, not a bug). The resulting GUNW over La Palma is truncated - but its georeferencing is fine as can be inspected.
Within this pull request and now the readme, I tried to explain how the tiles are stitched and transformed.
I also included two notebooks to demonstrate that (1) if no geoid is removed the tiles are subset but not resampled and (2) to compare ISCE2 dem.py to our stitcher.
@jhkennedy and @sssangha - I would like you to provide feedback.
dst_resolution
so that we can specify a target square resolution or (x_res, y_res)
resolution. We make sure that we do not adjust the origin as is frequently done in rasterio
.(X, p)
. This will greatly support ACCESS's "virtual data processing" (processing in memory) since we can pass the arrays to subsequent workflows directly.stitch
function.There are ton of tests that make sure that we are correctly stitching the tiles together. Here are some high level descriptions:
There is also an integration test (which is ignored by github actions) which does reach onto the servers to make a stitched DEM, but makes sure the main stitch
function goes until completion.
Unfortunatley, there were packaging issues (see [1] and [2]) that cropped up due to the changes related to gdal
and rasterio
since I worked with them last that made this pull request harder. It was essential that the DEM stitcher still worked with DockerizedTopsApp which uses conda-forge
packages specifically its gdal
. Long story short, we found that gdal
had to be version <3.4
. We also discovered that the PyPI rasterio
is not installing correctly for python 3.10+ as in [2] (we have an environment that installs a beta release but it is not tested here).
These issues were first exposed after writing all the tests and passing in an environment with an older python worked fine but they would fail on GitHub actions. After some help from @jhkennedy, it was found that all resampling was defaulting to nearest
in certain environments and therefore the above test_rio_tools.py
was failing; this was no good to say the least (wanted our geoid to be resampled smoothly to the DEM tiles!). After some investigations, we have settled on the current solution. I hope that @jhkennedy and @asjohnston-asf might help make this cleaner in the future. Currently, I have two environment files (one for PyPi and one for conda) and they are both tested against the new suite - the key is gdal
(and maybe proj
).
[1] https://github.com/conda-forge/rasterio-feedstock/issues/243 [2] https://github.com/rasterio/rasterio/issues/2473
Please provide feedback if there is time!
To illustrate that the outputs preserve the georeferencing of the original DEM tiles, here is an example output over a small .1 degree area overlapping 4 tiles in Central CA.
Consider the following:
from dem_stitcher.stitcher import stitch_dem, get_dem_tiles
import rasterio
dst_area_or_point = 'Point'
dst_ellipsoidal_height = False
dem_name = 'glo_30'
# Central coast of California
# xmin, ymin, xmax, ymax
bounds = [-120.05, 34.95, -119.95, 35.05]
X, p = stitch_dem(bounds,
dem_name=dem_name,
dst_ellipsoidal_height=dst_ellipsoidal_height,
dst_area_or_point=dst_area_or_point)
height_type = 'ellipsoidal' if dst_ellipsoidal_height else 'geoid'
with rasterio.open(f'{dem_name}_{height_type}_{dst_area_or_point}_CA.tif', 'w', **p) as ds:
ds.write(X, 1)
ds.update_tags(AREA_OR_POINT=dst_area_or_point)
The output generates this.
The source tiles are here
https://copernicus-dem-30m.s3.amazonaws.com/Copernicus_DSM_COG_10_N34_00_W121_00_DEM/Copernicus_DSM_COG_10_N34_00_W121_00_DEM.tif
https://copernicus-dem-30m.s3.amazonaws.com/Copernicus_DSM_COG_10_N34_00_W120_00_DEM/Copernicus_DSM_COG_10_N34_00_W120_00_DEM.tif
https://copernicus-dem-30m.s3.amazonaws.com/Copernicus_DSM_COG_10_N35_00_W121_00_DEM/Copernicus_DSM_COG_10_N35_00_W121_00_DEM.tif
https://copernicus-dem-30m.s3.amazonaws.com/Copernicus_DSM_COG_10_N35_00_W120_00_DEM/Copernicus_DSM_COG_10_N35_00_W120_00_DEM.tif
If you look at the tiles with the output in QGIS, you should see no shift with the "stitched-dem" and the source tiles.
@jhkennedy
environment.yml
gdal
)I hope at some point we can add python 3.10 using conda - but won't worry about it now. Also, we can add conda install instructions once we get it onto conda-forge.
Addresses #32.
(X, p)
whereX
is array andp
is profile