opendatacube / datacube-core

Open Data Cube analyses continental scale Earth Observation data through time
http://www.opendatacube.org
Apache License 2.0
505 stars 176 forks source link

CRS: CF grid_mapping attribute to encoding #1084

Closed snowman2 closed 1 year ago

snowman2 commented 3 years ago

Related to:

Apparently it is not CF-compliant to store the CRS grid_mapping as a coordinate. However, it is desirable to keep it in the xarray coords as it persists better. The workaround is to store the grid_mapping in the encoding instead of attrs and to_netcdf will take care of the rest when writing to a netCDF file.

snowman2 commented 3 years ago

Note: This requires Python 3.7+ and xarray 0.17+

Kirill888 commented 3 years ago

@snowman2 these are all pretty long threads. Can I ask you to summarize what would you like to propose?

I'm open to a suggestion to

I'm not too keen to drop coords-based .geobox though, as this is the most robust way we found so far for preserving geo-registration information through data transformations.

snowman2 commented 3 years ago

The only change is to store grid_mapping in encoding instead of attrs.

rds.encoding["grid_mapping"] = "spatial_ref"
snowman2 commented 3 years ago

This PR is how rioxarray handled it: https://github.com/corteva/rioxarray/pull/284

snowman2 commented 3 years ago

I'm not too keen to drop coords-based .geobox though, as this is the most robust way we found so far for preserving geo-registration information through data transformations.

I agree with you 💯

Kirill888 commented 3 years ago

Is this about data var attributes only or x,y coords too?

Kirill888 commented 3 years ago

https://github.com/opendatacube/datacube-core/blob/80d466a2635ab37fbe33bb283b77df305e3c9236/datacube/api/core.py#L465-L473

https://github.com/opendatacube/datacube-core/blob/80d466a2635ab37fbe33bb283b77df305e3c9236/datacube/utils/geometry/_base.py#L1458-L1462

snowman2 commented 3 years ago

Should just be for the data vars.

snowman2 commented 3 years ago

Remembered this today with something else I was working on. Is this a change you would like to see?

Kirill888 commented 3 years ago

@snowman2 I forgot about this one.

Ideally I would like to see this code taken out into a separate library under opendatacube org, as you were suggesting previously, it would make it easier to iterate on.

Looks like this work might be done under https://github.com/opendatacube/odc-tools/projects/2, as there is interest in decoupling data loading code from collection management. Moving geometry handling code out into a standalone library would be a first step toward that goal.

I'm yet to write an "Enhancement Proposal" for that task though. I'm thinking of doing some serious git history surgery on datacube repo to get geometry code and tests into a separate lib while keeping original authorship and history. Followed up by changing datacube to depend on that lib instead of internal implementation.

I believe this a more promising approach, that still delivers tangible benefits, than currently stalled datacube 2.0 effort.

cc: @gadomski

snowman2 commented 3 years ago

Sounds great 👍.

gadomski commented 3 years ago

I'm thinking of doing some serious git history surgery on datacube repo to get geometry code and tests into a separate lib while keeping original authorship and history.

For background @Kirill888 we used git-filter-repo when we broke out the stactools subpackages (https://github.com/stac-utils/stactools/pull/111), and it worked well for us. Relevant line here: https://gist.github.com/gadomski/a0ab8d18f56ff34454d2ed890d234534#file-extract_stactools_subpackage-sh-L21.

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

snowman2 commented 2 years ago

Ping to leave open.

Kirill888 commented 2 years ago

@snowman2 I pinned this so it won't expire. I'm starting the work of extracting odc.geo library from datacube-core. The plan is to

  1. Move geometry and related tests to a separate package and repository under opendatacube org (with history)
  2. Setup all the necessary infra: testing, packaging, docs in the new repository
  3. Do cleanup: formatting, docs, types, test coverage
  4. Replace datacube dependency in odc-stac with the odc-geo, this is likely to highlight various missing bits, so will require some iteration. It's likely that some parts from odc.algo will migrate to odc.geo as part of this work too (reprojection logic in particular).
  5. Finally refactor datacube to depend on odc.geo rather than having duplicate code

So it will take a while before datacube itself is updated, but work is starting now.

snowman2 commented 1 year ago

I believe that this is addressed by #1424