tsutterley / pyTMD

Python-based tidal prediction software
https://pytmd.readthedocs.io
MIT License

refactor: change `'geotiff'` to `'GTiff'` and `'cog'` for #320 #321

Closed tsutterley closed 3 months ago

codecov[bot] commented 3 months ago

Codecov Report

Attention: Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.

Project coverage is 69.35%. Comparing base (3d1d88a) to head (3c8c3e4). Report is 3 commits behind head on main.

| Files | Patch % | Lines |
|---|---|---|
| pyTMD/spatial.py | 0.00% | 1 Missing and 1 partial :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #321      +/-   ##
==========================================
+ Coverage   69.17%   69.35%   +0.18%
==========================================
  Files          39       39
  Lines        9365    10566    +1201
  Branches     1322     1473     +151
==========================================
+ Hits         6478     7328     +850
- Misses       2467     2762     +295
- Partials      420      476      +56
```

| [Flag](https://app.codecov.io/gh/tsutterley/pyTMD/pull/321/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tyler+Sutterley) | Coverage Δ | |
|---|---|---|
| [unittests](https://app.codecov.io/gh/tsutterley/pyTMD/pull/321/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tyler+Sutterley) | `69.35% <0.00%> (+0.18%)` | :arrow_up: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=Tyler+Sutterley#carryforward-flags-in-the-pull-request-comment) to find out more.


robbibt commented 3 months ago

Hey @tsutterley - I saw COG and was immediately intrigued... we do most of our satellite EO analyses using cloud-friendly COG raster data, and the fact that most of the tide models we use come in non-cloud friendly NetCDFs is one of our biggest blockers.

What's your use case for COG support work in pyTMD? Is this for specific kinds of external inputs, or theoretically could it support us converting our existing model constituent files into COG format in the future and accessing them remotely via cloud storage? (I imagine that would be tricky, as we'd lose all of the extra attributes/coordinates that come with the NetCDF files...)

tsutterley commented 3 months ago

Hey @robbibt, currently the idea is to allow the correction of external inputs (possibly on S3 or other cloud stores, using e.g. GDAL's `/vsis3/` virtual file system).
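
As a rough illustration of the `/vsis3/` idea: GDAL-based readers accept virtual file system paths, so an `s3://` URL can be mapped to `/vsis3/bucket/key` and an HTTP(S) URL to `/vsicurl/<url>`. The sketch below only builds the path string. The helper name `to_vsi_path` and the example bucket are hypothetical and not part of pyTMD.

```python
from urllib.parse import urlparse


def to_vsi_path(url: str) -> str:
    """Map a URL to a GDAL virtual file system path (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.scheme == "s3":
        # s3://bucket/key -> /vsis3/bucket/key
        return f"/vsis3/{parsed.netloc}{parsed.path}"
    if parsed.scheme in ("http", "https"):
        # /vsicurl/ takes the full URL, scheme included
        return f"/vsicurl/{url}"
    # anything else is assumed to be a local path and passed through
    return url


# the bucket and key are made up for illustration
print(to_vsi_path("s3://example-bucket/dem/tile.tif"))
```

The resulting string could then be handed to whatever GDAL-backed reader opens the raster, assuming credentials are configured in the environment.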

We could look into converting the model constituent files into COGs and using something like a STAC catalog to hold the constituent metadata. That wasn't my initial idea, but it could be looked into.

robbibt commented 2 months ago

No worries - this was mainly just a curiosity question really! Adding support for COG constituent files might be something I could potentially look at as a contribution back to pyTMD... the more we use this for continental/global scale analyses, the more valuable it would be to have the data in cloud-friendly formats (especially for fast windowed reads into specific areas without having to load in the entire files!).
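
To make the windowed-read point concrete: for a north-up raster, the block of pixels covering a bounding box follows from the grid origin and pixel size, and only that block needs to be fetched. A minimal sketch in plain Python follows. The function name, grid dimensions, and bounding box are illustrative assumptions, and a real reader such as GDAL or rasterio would handle this step internally.

```python
import math


def window_from_bounds(left, bottom, right, top,
                       origin_x, origin_y, pixel_size, width, height):
    """Return (col_off, row_off, ncols, nrows) covering a bounding box.

    Assumes a north-up grid with square pixels whose upper-left corner
    is at (origin_x, origin_y).
    """
    # convert map coordinates to pixel indices, rounding outwards
    col_min = math.floor((left - origin_x) / pixel_size)
    col_max = math.ceil((right - origin_x) / pixel_size)
    row_min = math.floor((origin_y - top) / pixel_size)
    row_max = math.ceil((origin_y - bottom) / pixel_size)
    # clip the window to the raster extent
    col_min, col_max = max(col_min, 0), min(col_max, width)
    row_min, row_max = max(row_min, 0), min(row_max, height)
    return col_min, row_min, max(col_max - col_min, 0), max(row_max - row_min, 0)


# hypothetical global 0.25-degree grid (1440 x 720 pixels)
window = window_from_bounds(110.0, -45.0, 155.0, -10.0,
                            origin_x=-180.0, origin_y=90.0,
                            pixel_size=0.25, width=1440, height=720)
print(window)
```

For this made-up grid the window covers 180 x 140 pixels, a small fraction of the full 1440 x 720 array, which is the saving a tiled, cloud-friendly format makes possible.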

(ping @alexgleith - I think we've talked about this at some point)

alexgleith commented 2 months ago

Hey @robbibt, I actually think this is a great example for a Zarr.

I think a single Zarr could hold all the variables and could then be accessed over the network, and work for anywhere in the world.

robbibt commented 2 months ago

> Hey @robbibt, I actually think this is a great example for a Zarr.
>
> I think a single Zarr could hold all the variables and could then be accessed over the network, and work for anywhere in the world.

Yep, good point! That would be super neat, and probably a lot more compatible with the current NetCDF approach (e.g. we'd still have all the same coordinates etc available).

alexgleith commented 2 months ago

I'd love to get funding to do a proof of concept.

Another issue that blocks this being used for FES, at least, is the data license...

robbibt commented 2 months ago

Yeah, we likely wouldn't be able to provide open access to a Zarr publicly to the world, but I think that would be OK - it would still be up to downstream users to manage access to the files according to the licences of each product (e.g. setting specific permissions etc). But having the tooling available to do it would be a great step forward!

tsutterley commented 2 months ago

Agreed on the licensing issue, which would limit the public release of any cloud-optimized/cloud-native format. But adding "lazy loading" of the tidal constituents could definitely optimize the bottlenecks. Would be an interesting test! :)
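
One way to picture "lazy loading" of constituents: defer reading each constituent until it is first requested, then cache the result. The sketch below uses only the standard library. `LazyConstituents`, the `loader` callable, and the constituent names are hypothetical stand-ins, not pyTMD's API.

```python
from typing import Callable, Dict, List


class LazyConstituents:
    """Load each tidal constituent on first access, then cache it."""

    def __init__(self, names: List[str], loader: Callable[[str], object]):
        self.names = list(names)
        self._loader = loader  # called at most once per constituent
        self._cache: Dict[str, object] = {}

    def __getitem__(self, name: str):
        if name not in self.names:
            raise KeyError(name)
        if name not in self._cache:
            # the expensive read happens only here
            self._cache[name] = self._loader(name)
        return self._cache[name]

    @property
    def loaded(self) -> List[str]:
        """Names of the constituents that have actually been read."""
        return [n for n in self.names if n in self._cache]


calls = []


def fake_loader(name):
    # stand-in for reading one constituent from disk or cloud storage
    calls.append(name)
    return {"constituent": name}


model = LazyConstituents(["m2", "s2", "k1", "o1"], fake_loader)
m2 = model["m2"]
m2_again = model["m2"]  # served from the cache, no second read
print(model.loaded, calls)
```

Only the requested constituent triggers a read. In practice the loader would be replaced by a chunked or windowed read from the underlying file.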