zarr-developers / geozarr-spec

This document aims to provide a geospatial extension to the Zarr specification. Zarr specifies a protocol and format used for storing Zarr arrays, while the present extension defines conventions and recommendations for storing multidimensional georeferenced grids of geospatial observations (including rasters).

Align with XCube #21

Closed rabernat closed 10 months ago

rabernat commented 1 year ago

At our last call I mentioned the amazing xcube project, which shares very similar goals to @christophenoel's GeoZarr work at SpaceBel.

xcube is an open-source Python package and toolkit that has been developed to provide Earth observation (EO) data in an analysis-ready form to users. xcube achieves this by carefully converting EO data sources into self-contained data cubes that can be published in the cloud.

Tagging @forman for potential coordination, particularly around #18. It would be great to have interoperability between different implementations of serverless Zarr datacubes.

christophenoel commented 1 year ago

While xcube does share some similarities with GeoZarr, such as relying on CF conventions, I believe that their objectives are not entirely aligned. xcube has many constraints: it is quite opinionated, and it doesn't target a serverless approach (as it is essentially a server implementation).

One of many examples is that it focuses on multispectral rasters (e.g. "Dimensions SHALL be at least time, bnds"). Also, it is not meant to hold and describe multiple chunkings, projections, scales, etc.

xcube might be an interesting option for creating a server on top of Zarr (it aims to provide a server with APIs), rather than focusing on direct data access and visualization within Zarr itself. Based on tests conducted two years ago, it supports only a limited number of source data products.

However, exploring how CRS and stuff are encoded is still a good idea.

forman commented 1 year ago

@rabernat thanks for referring to xcube here. I confirm, we are happy to provide feedback on the GeoZarr spec.

@christophenoel

it doesn't target a serverless approach

But it also doesn't target a server-ful approach. We just try to harmonize the schemas of geo-spatial data cubes so they can be easily understood and used by our users (mostly Earth scientists and Earth data providers familiar with the Pangeo technology stack).

Also, it is not meant to hold and describe multiple chunkings, projections, scales, etc.

The xcube dataset spec is quite outdated (~5 years old, omg). I will update it with a more recent version that is actually in use by one of the many projects where we use xcube. It covers many of the topics you are currently missing, including multi-resolution datasets. I'll do that asap and link it here.

However, exploring how CRS and stuff are encoded is still a good idea.

For this, we rely on and comply with CF conventions.
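For readers less familiar with the CF approach: a CRS is described by a separate "grid mapping" variable, and each data variable points to it by name through a `grid_mapping` attribute. The snippet below is only a minimal sketch of that linkage, using plain dictionaries in place of the attribute sets a Zarr store would hold. The variable names (`crs`, `temperature`), the `resolve_grid_mapping` helper, and the dimension names are hypothetical and chosen for illustration; this is not output from xcube or text from the GeoZarr spec.

```python
import json

# Attributes of a CF grid mapping variable describing WGS 84 lat/lon.
# The attribute names are CF-defined; the variable name "crs" is an assumption.
crs_attrs = {
    "grid_mapping_name": "latitude_longitude",
    "semi_major_axis": 6378137.0,
    "inverse_flattening": 298.257223563,
    "longitude_of_prime_meridian": 0.0,
}

# Attributes of a data variable. "grid_mapping" names the CRS variable above.
# "_ARRAY_DIMENSIONS" is the key xarray uses to record dimension names in Zarr v2.
data_attrs = {
    "_ARRAY_DIMENSIONS": ["time", "lat", "lon"],
    "grid_mapping": "crs",
    "standard_name": "air_temperature",
    "units": "K",
}


def resolve_grid_mapping(variables, name):
    """Return the attributes of the grid mapping referenced by `name`, or None."""
    target = variables[name].get("grid_mapping")
    return variables.get(target) if target else None


# Hypothetical in-memory stand-in for the per-variable attributes of a store.
variables = {"crs": crs_attrs, "temperature": data_attrs}
resolved = resolve_grid_mapping(variables, "temperature")
print(json.dumps(resolved, indent=2))
```

A reader that follows this convention can recover the CRS of any data variable by following the `grid_mapping` reference, regardless of which tool wrote the store.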

christophenoel commented 1 year ago

I want to clarify that the intent of my message is not at all to criticize xcube, which is a tool that has already caught my attention.

The point I am trying to make is that, like xarray, xcube is primarily a software library that aims to leverage the Zarr format and has introduced necessary conventions since there are no official ones. On the other hand, GeoZarr is an attempt to create a standard data model that we hope will be adopted by all tools.

Without starting from scratch, it is good to draw inspiration from xcube, as well as from xarray and GDAL, without necessarily considering alignment as a binding objective, so that we can pursue our own goals.

Of course, I imagine that the input from @forman would be very interesting and relevant in developing the GeoZarr conventions.

forman commented 1 year ago

@christophenoel As promised, we've updated our convention to reflect the current state as used in our projects:

christophenoel commented 1 year ago

Hi Forman,

Thank you for letting me know that you have updated your convention to reflect the current state used in your projects. I appreciate the effort and time you have put into this.

I have quickly reviewed the updated convention and it looks very exhaustive, and of very high quality. Your work will be a valuable base for all the aspects that you have covered.

christophenoel commented 10 months ago

@forman Alignment has been reflected in the OGC SWG charter. Thanks again for the information.