lauraduncanson / icesat2_boreal

Biomass modeling and mapping of forest biomass in the boreal using NASA's ICESat-2

Need to correctly handle topo stack calcs when using `build_stack.py`, `write_cog`, and `make_topo_stack_cog` #64

Open pahbs opened 4 months ago

pahbs commented 4 months ago

@wildintellect build_stack.py is a tool to [1] build raster stacks for the polygon extents and grids that we want, and [2] save them as COGs. There are a million ways to do this. This tool has worked well to incorporate s3 land cover, tree cover, and DEM datasets into our boreal biomass mapping workflow. We need to make sure it works correctly for all of them, which means we need to fix how topo stacks are handled while maintaining current functionality for non-topo datasets.

issue:
we need to re-configure how topo stacks are handled in build_stack - because topography layers like slope and aspect should be calc'd on north-up data (also, we want bilinear or cubic resampling - easy to update).
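
To illustrate why north-up matters for aspect, here is a minimal pure-Python sketch (not code from this repo). It assumes aspect is measured clockwise from grid "up", and that the reprojected grid is rotated relative to true north by a hypothetical 30 degrees. `aspect_deg` and `gradient_in_rotated_grid` are names made up for this example.

```python
import math


def aspect_deg(dz_dx, dz_dy):
    """Downslope direction in degrees clockwise from grid 'up' (+y), in [0, 360)."""
    # The downslope vector is the negative gradient; azimuth = atan2(east, north).
    return math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0


def gradient_in_rotated_grid(dz_dx, dz_dy, rotation_deg):
    """Express a north-up gradient in a grid rotated counter-clockwise by rotation_deg."""
    t = math.radians(rotation_deg)
    gx = dz_dx * math.cos(t) + dz_dy * math.sin(t)
    gy = -dz_dx * math.sin(t) + dz_dy * math.cos(t)
    return gx, gy


# A plane that drops toward true east: dz/dx = -1, dz/dy = 0.
true_aspect = aspect_deg(-1.0, 0.0)
# The same surface sampled on a grid whose "up" points 30 degrees west of north.
grid_aspect = aspect_deg(*gradient_in_rotated_grid(-1.0, 0.0, 30.0))
print(round(true_aspect, 6), round(grid_aspect, 6))
```

The two values differ by the grid rotation, so aspect computed after reprojection is relative to grid "up" rather than true north. Computing it on the north-up mosaic first and reprojecting afterwards avoids that offset.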

In topo mode, build_stack: [1] first mosaics all intersecting DEM tiles for a given boreal tile using mosaic_reader (sure, we could use gdal.BuildVRT here, but that doesn't get at the heart of the issue); [2] writes that mosaic (native north-up prj) to a COG using write_cog from CovariateUtils.py, which will be a tmp file; [3] uses CovariateUtils_topo.make_topo_stack_cog to run the gdal.DEMProcessing steps, where recent updates have this working correctly based on latitude; [4] writes out a final topo stack COG that must be clipped to the clip_geom.

trick:
correctly handling the sequence of mosaicking, tmp cog writing (cuz, gdal...), topo calcs, then final cog writing into the correct prj AND with the correct clip geometry.

The issue is documented here: /projects/shared-buckets/montesano/test_build_stack_topo.ipynb, where we start off by reading in the tmp topo mosaic (step [2] above).

  1. The bottom chunks show what needs to be done within write_cog to make this work - but - how to implement this properly?
  2. Why is the in-memory clipping that is done with rio_tiler.utils.create_cutline in write_cog doing nothing? https://github.com/lauraduncanson/icesat2_boreal/blob/b6e19887587c014afaf66f60f3ddfd453eb9d189/lib/CovariateUtils.py#L237
    • this issue is just being discovered now b/c previously data coming into make_topo_stack_cog was already in the out_crs

pahbs commented 4 months ago

A fix has been implemented that resulted in updates to build_stack.py, CovariateUtils.write_cog(), and CovariateUtils_topo.make_topo_stack_cog().

The fixes solve:

  1. the large nodata extent around tiles (now everything is clipped to tile extent)
  2. the jagged validdata/nodata pattern at tile borders (this was introduced by create_cutline in write_cog, which has now been put behind an `if False:` for potential further development).

Main features of the fix include:

  1. for topo runs:
    • (build_stack) changed the input to mosaic_reader so no reprojection is performed. This preserves north-up data for call to make_topo_stack_cog()
    • (build_stack) changed how in_bbox is determined - now uses the topo data's native projection - and uses total_bounds if the tile is in a dateline_tile_list
    • (write_cog) will no longer attempt to clip during the MemoryFile stage - this wasn't working as expected. Clipping is now solely handled with align=True
    • (write_cog) now has a resampling arg that defaults to nearest
    • (make_topo_stack_cog) now performs 2 writes: [1] with write_cog to reduce the extent a bit (but not entirely) and apply cubic resampling, then [2] reading back in with COGReader, which can do the final clip needed, then writing to a COG using cog_profiles and rasterio directly.
  2. For other runs (topo_off) not much has changed with how things work.
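
For readers without access to the repo, here is a rough sketch of what a write step with a `resampling` argument that defaults to nearest could look like. This is not the actual `write_cog`: the function name `reproject_to_cog`, its parameters, and the choice of the "deflate" profile are assumptions for illustration, and the clipping and align handling described above are left out. It relies on rasterio's `WarpedVRT` and rio-cogeo's `cog_translate`.

```python
def reproject_to_cog(src_path, out_path, out_crs, resampling="nearest"):
    """Hypothetical stand-in for a write_cog-style step (not the repo's code).

    Reprojects src_path to out_crs on the fly and writes the result as a COG.
    """
    # Imports are local so this module can be imported without the geospatial stack.
    import rasterio
    from rasterio.enums import Resampling
    from rasterio.vrt import WarpedVRT
    from rio_cogeo.cogeo import cog_translate
    from rio_cogeo.profiles import cog_profiles

    with rasterio.open(src_path) as src:
        # Resampling["cubic"], Resampling["bilinear"], ... look up the enum by name.
        with WarpedVRT(src, crs=out_crs, resampling=Resampling[resampling]) as vrt:
            cog_translate(
                vrt,
                out_path,
                cog_profiles.get("deflate"),  # assumed profile; the repo may use another
                overview_resampling=resampling,
                quiet=True,
            )
    return out_path
```

With this shape, categorical inputs such as land cover keep the nearest default, while a topo run would pass `resampling="cubic"` explicitly.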

The fix is being tested locally on mini and large dateline tiles, and regular tiles for land cover, tree cover and topo input.

wildintellect commented 4 months ago

@pahbs would be great if you could link to a commit or PR that shows the changes. We should also verify your outputs with rio cogeo validate
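
Besides running `rio cogeo validate` on each file from the command line, the same check can be scripted. The sketch below assumes a recent rio-cogeo release where `rio_cogeo.cogeo.cog_validate` returns `(is_valid, errors, warnings)`; `find_invalid_cogs` is a hypothetical helper, not something in this repo.

```python
from typing import Callable, Dict, Iterable, List, Tuple

ValidateFn = Callable[[str], Tuple[bool, List[str], List[str]]]


def _rio_cogeo_validate(path: str) -> Tuple[bool, List[str], List[str]]:
    # Local import so the helper can be imported without rio-cogeo installed.
    from rio_cogeo.cogeo import cog_validate

    return cog_validate(path)


def find_invalid_cogs(
    paths: Iterable[str], validate: ValidateFn = _rio_cogeo_validate
) -> Dict[str, List[str]]:
    """Return {path: errors} for every file that fails COG validation."""
    failures: Dict[str, List[str]] = {}
    for path in paths:
        is_valid, errors, _warnings = validate(path)
        if not is_valid:
            failures[path] = list(errors)
    return failures
```

An empty dict from `find_invalid_cogs(list_of_output_tifs)` would mean every output passed validation.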

pahbs commented 4 months ago

> @pahbs would be great if you could link to a commit or PR that shows the changes. We should also verify your outputs with rio cogeo validate

Here is the tag associated with the most recent bug fixes to build_stack for topo runs: build_stack_v2024_2.

This tag includes the bug fix to write correctly to a cog using cog_translate at the final stage of make_topo_stack_cog.
