Open scottyhq opened 10 months ago
Hey @scottyhq, thanks for the proposal! I've brought it up with the rest of the team and we're definitely interested, but we might have trouble prioritizing it. However, our backlog priority is affected by the number of users asking for a feature, so if you know of anyone else interested in these features, ask them to either "thumbs up" your post or add a comment to the issue and we might be able to get to this sooner.
Background
The outputs from hyp3-isce2 are publicly accessible, which is great for collaboration among research groups. Currently the outputs are in S3, but it's unclear how to access them directly. Instead we're using CloudFront links to zipped collections of all the files.
In the interest of a fully-cloud workflow it would be great to have COG images rather than just tiled GeoTIFFs. It would also be helpful to have a STAC Item JSON per job. And if zip is necessary to save space, use SOZip.
This combination would allow very efficient construction of data cubes reading hyp3 outputs directly from S3, especially when using a system like OpenScienceLab also in AWS us-west-2.
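For reference, here is a rough sketch of how a file inside one of the zipped outputs can be addressed today by chaining GDAL's /vsizip/ and /vsicurl/ virtual file systems, and how that path can be passed to a TiTiler map endpoint. The function names `vsi_zip_path` and `titiler_map_url` are made up for illustration; the job ID, product name, and query parameters are the ones from the example link in this issue.

```python
from urllib.parse import urlencode


def vsi_zip_path(zip_url: str, member: str) -> str:
    """Return a GDAL virtual path to `member` inside a remote zip archive."""
    # /vsicurl/ reads the archive over HTTP; /vsizip/ exposes one member of it.
    return f"/vsizip//vsicurl/{zip_url}/{member}"


def titiler_map_url(vsi_path: str, rescale: str = "-3.14,3.14",
                    colormap_name: str = "hsv") -> str:
    """Build a titiler.xyz map URL for a GDAL-readable path."""
    # Keep '/', ':' and ',' unescaped so the URL stays readable.
    query = urlencode(
        {"url": vsi_path, "rescale": rescale, "colormap_name": colormap_name},
        safe="/:,",
    )
    return f"https://titiler.xyz/cog/map?{query}"


JOB = "cfd3566f-dafe-4859-ab05-39a83f86c98d"
PRODUCT = "S1_245655_IW3_20220727_20230722_VV_INT80_67C8"
zip_url = f"https://d3gm2hf49xd6jj.cloudfront.net/{JOB}/{PRODUCT}.zip"

path = vsi_zip_path(zip_url, f"{PRODUCT}/{PRODUCT}_wrapped_phase.tif")
print(path)
print(titiler_map_url(path))
```

With SOZip the same /vsizip/ path would keep working, and GDAL could seek within the compressed file instead of decompressing it from the start.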
Describe the solution you'd like
Chatted a bit with @forrestfwilliams about this at AGU and it seemed like a possibility to add at least the STAC generation here (though ultimately it would be useful for any hyp3 processor).
We currently have some messy code to generate the STAC Item here: https://github.com/relativeorbit/agu2023/blob/main/utils.py - let me know if you'd be open to a pull request?
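To make the request concrete, here is a minimal sketch of what a per-job STAC Item could look like, built with only the standard library. This is not the code from the linked utils.py. The function name `make_stac_item`, the base URL, and the bounding box are placeholders, and it assumes the two dates in the product name are the reference and secondary acquisition dates.

```python
import json
from datetime import datetime, timezone

# Registered media type for Cloud Optimized GeoTIFFs.
COG_MEDIA_TYPE = "image/tiff; application=geotiff; profile=cloud-optimized"


def make_stac_item(product, base_url, bbox, reference, secondary, layers):
    """Build a STAC 1.0.0 Item (as a plain dict) for one job's outputs."""
    west, south, east, north = bbox
    # Rectangular footprint from the bounding box; the ring must be closed.
    footprint = {
        "type": "Polygon",
        "coordinates": [[[west, south], [east, south], [east, north],
                         [west, north], [west, south]]],
    }
    return {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": product,
        "geometry": footprint,
        "bbox": list(bbox),
        "properties": {
            "datetime": reference.isoformat(),
            "start_datetime": reference.isoformat(),
            "end_datetime": secondary.isoformat(),
        },
        "links": [],
        # One asset per output raster, keyed by layer name.
        "assets": {
            layer: {
                "href": f"{base_url}/{product}_{layer}.tif",
                "type": COG_MEDIA_TYPE,
                "roles": ["data"],
            }
            for layer in layers
        },
    }


item = make_stac_item(
    product="S1_245655_IW3_20220727_20230722_VV_INT80_67C8",
    base_url="https://example.com/job-id",   # placeholder, not a real bucket
    bbox=(-122.0, 46.0, -121.0, 47.0),       # placeholder extent
    reference=datetime(2022, 7, 27, tzinfo=timezone.utc),
    secondary=datetime(2023, 7, 22, tzinfo=timezone.utc),
    layers=["wrapped_phase"],
)
print(json.dumps(item, indent=2))
```

A real implementation would presumably read the footprint, CRS, and acquisition times from the product itself and add the projection metadata that downstream loaders rely on.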
Alternatives
Have another step in the canned HYP3 workflow to generate a COG + STAC Item for each job, replacing:
- translate_outputs: Convert the outputs of hyp3-isce2 to hyp3-gamma formatted geotiff files
Additional context
COGs+STAC would allow taking advantage of some nice existing tools:
Dynamic tiling already "works" with current GeoTiffs with 256x256 tiling, but would be better if pyramid overviews were included: https://titiler.xyz/cog/map?url=/vsizip//vsicurl/https://d3gm2hf49xd6jj.cloudfront.net/cfd3566f-dafe-4859-ab05-39a83f86c98d/S1_245655_IW3_20220727_20230722_VV_INT80_67C8.zip/S1_245655_IW3_20220727_20230722_VV_INT80_67C8/S1_245655_IW3_20220727_20230722_VV_INT80_67C8_wrapped_phase.tif&rescale=-3.14,3.14&colormap_name=hsv
For example, a simple static catalog allows efficient browsing of the outputs (this already "works", which is pretty neat, but it doesn't do efficient COG tiling): https://radiantearth.github.io/stac-browser/#/external/raw.githubusercontent.com/relativeorbit/agu2023/main/catalog.json
Having STAC Items allows easy construction of datacubes for postprocessing with Xarray via libraries like odc-stac (for example, https://github.com/relativeorbit/agu2023/blob/main/A64-postprocess.ipynb).
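As a rough illustration of the datacube step, here is a sketch that assumes each job already has a STAC Item JSON with an asset named "wrapped_phase" (a hypothetical asset key) and that pystac and odc-stac are installed. The function name `load_cube` is made up for this example and is not taken from the linked notebook.

```python
def load_cube(item_hrefs, bands=("wrapped_phase",), chunks=None):
    """Lazily load the given STAC Items into one xarray.Dataset."""
    # Imported inside the function so this snippet can be pasted into a
    # module even where the optional dependencies are not installed.
    import odc.stac
    import pystac

    # Read each Item JSON from a local path or URL.
    items = [pystac.Item.from_file(href) for href in item_hrefs]

    # Passing `chunks` makes odc-stac return Dask-backed arrays, so pixels
    # are only fetched when a computation actually needs them.
    if chunks is None:
        chunks = {"x": 2048, "y": 2048}
    return odc.stac.load(items, bands=list(bands), chunks=chunks)
```

If the Items do not carry projection metadata, `odc.stac.load` also needs explicit `crs` and `resolution` arguments, which is another reason to write that metadata when the Item is generated.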