ACCESS-Cloud-Based-InSAR / dem-stitcher

Download and merge DEM tiles
Apache License 2.0
39 stars 15 forks source link

Left, bottom boundaries of the DEM are getting rounded to an integer #62

Closed scottstanie closed 10 months ago

scottstanie commented 10 months ago

The bug

While the right/top boundaries are exactly as specified, the left/bottm (western/southern) boundaries seem to get rounded.

To Reproduce

(following the README)

import rasterio
bounds = [176.678207, 50.908962, 179.697601, 52.754662]
X, p = dem_stitcher.stitch_dem(bounds, dem_name='glo_30')
with rasterio.open('dem.tif', 'w', **p) as ds:
    ds.write(X2, 1)
    ds.update_tags(AREA_OR_POINT='Point')
$ rio bounds --bbox dem.tif
[177.0, 51.0, 179.69791666666666, 52.75472222222222]

Additional context

In [36]: rio.show_versions()
rasterio info:
  rasterio: 1.3.8
      GDAL: 3.7.1
      PROJ: 9.2.1
      GEOS: 3.12.0
 PROJ DATA: /Users/staniewi/miniconda3/envs/dem-env/share/proj
 GDAL DATA: /Users/staniewi/miniconda3/envs/dem-env/share/gdal

System:
    python: 3.11.5 | packaged by conda-forge | (main, Aug 27 2023, 03:33:12) [Clang 15.0.7 ]
executable: /Users/staniewi/miniconda3/envs/dem-env/bin/python3.11
   machine: macOS-13.5.1-arm64-arm-64bit

Python deps:
    affine: 2.4.0
     attrs: 23.1.0
   certifi: 2023.07.22
     click: 8.1.7
     cligj: 0.7.2
    cython: None
     numpy: 1.25.2
    snuggs: 1.4.7
click-plugins: None
setuptools: 68.1.2
cmarshak commented 10 months ago

So - I would definitely say this is a shortcoming of this library. That said, this result is currently a feature, not a bug, in that it's output is expected (by me), though it certainly could yield unexpected results for users, particularly if you are expecting a raster covering the requested extent, which many are! The issue in this case is that the bounds you requested fall outside of the what is available with glo-30.

Globally glo-30 tiles look like this:

image

For the requested area (zoomed in), the yellow lines are the glo-30 tiles available and the black is the area you selected:

image

The latter was generated using:

from dem_stitcher import get_overlapping_dem_tiles
from rasterio.crs import CRS
import matplotlib.pyplot as plt
from shapely.geometry import box
import geopandas as gpd

bounds = [176.678207, 50.908962, 179.697601, 52.754662]
df_tiles = get_overlapping_dem_tiles(bounds=bounds,
                                     dem_name='glo_30')
df_bbox = gpd.GeoDataFrame(geometry=[box(*bounds)], crs=CRS.from_epsg(4326))

fig, ax = plt.subplots()

df_bbox.exterior.plot(ax=ax, color='black')
df_tiles.exterior.plot(ax=ax, color='yellow')

ax.set_xlabel('Lon')
ax.set_ylabel('Lat')

So, the rounding is done because the origin and extent is selected on what's available via rasterio. It's based on window functionality (maybe inherited from gdal?): https://rasterio.readthedocs.io/en/stable/topics/windowed-rw.html.

Here is a hypothetical sample:

from shapely.geometry import box

bounds=[<BOUNDS>]
geo = box(*bounds)
with rasterio.open(<IMAGE_PATH>) as src:
    out_image, out_transform = rasterio.mask.mask(src, geo, crop=True)
    out_meta = src.meta

out_meta.update({"driver": "GTiff",
         "height": out_image.shape[1],
         "width": out_image.shape[2],
         "transform": out_transform,
         "compress": "lzw"})

with rasterio.open("<OUT_PATH>", "w", **out_meta) as dest:
    dest.write(out_image) 

I don't have sample data - but basically origin will be shifted if the requested bounds are to the left of source image similar to your example.

Hope this makes sense. At some point, it might be helpful to change the origin depending on requested data and fill with nodata, but currently, left things without additional modification because it's easier (to avoid any resampling would have to figure out origin so that it contained extent and then write the appropriate tests).

Glad you spotted this and hope this can be used in the future to highlight this deficiency of this library.