zarr-developers / geozarr-spec

This document aims to provide a geospatial extension to the Zarr specification. Zarr specifies a protocol and format for storing Zarr arrays, while the present extension defines conventions and recommendations for storing multidimensional georeferenced grids of geospatial observations (including rasters).

Roundtrip of geotiff/nc to zarr to geotiff/nc as test ground to find what info needs to be saved #50

Open felixcremer opened 2 weeks ago

felixcremer commented 2 weeks ago

In one of the last meetings we talked about a roundtrip between Python and Julia to test the implementations. Here I propose that we also look at a roundtrip between data formats, to see what GeoZarr needs in order to make that roundtrip lossless, and therefore what is needed to save GeoTIFF or NetCDF-like datasets in a Zarr store. We could start with a GeoTIFF file, open it in the software of choice, save it as GeoZarr, open it again and save it back to GeoTIFF. After this roundtrip we should get the same GeoTIFF file that we started with. The same should be done starting from a NetCDF file. This way we would learn what has to be stored in GeoZarr to represent the different data models without loss of information.
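To make "the same file" checkable, the roundtrip could end with a comparison of the metadata read from the original file and from the roundtripped file. Below is a minimal sketch, assuming the metadata of each file has already been read into a plain dictionary by whatever software is used. The function name `diff_metadata` and the example keys and values (`crs`, `transform`, `nodata`, `band_descriptions`) are hypothetical and only for illustration; they are not taken from any spec or library.

```python
from typing import Any, Dict, List


def diff_metadata(before: Dict[str, Any], after: Dict[str, Any]) -> Dict[str, List[str]]:
    """Report which metadata keys were lost, added, or changed by a roundtrip."""
    lost = sorted(k for k in before if k not in after)
    added = sorted(k for k in after if k not in before)
    changed = sorted(k for k in before if k in after and before[k] != after[k])
    return {"lost": lost, "added": added, "changed": changed}


# Hypothetical metadata read from the original GeoTIFF (illustrative values only).
original = {
    "crs": "EPSG:32632",
    "transform": (10.0, 0.0, 600000.0, 0.0, -10.0, 5000000.0),
    "nodata": -9999,
    "band_descriptions": ("red", "green"),
}

# Hypothetical metadata read after GeoTIFF -> Zarr -> GeoTIFF.
roundtripped = {
    "crs": "EPSG:32632",
    "transform": (10.0, 0.0, 600000.0, 0.0, -10.0, 5000000.0),
    "nodata": None,
}

report = diff_metadata(original, roundtripped)
print(report)
```

Every key that shows up under `lost` or `changed` would be a candidate for something GeoZarr needs a convention for. The pixel values themselves would need a separate array comparison.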

What are good example files for this kind of test? Are there relevant collections of small test files for NetCDF and GeoTIFF available? For NetCDF I found https://www.unidata.ucar.edu/software/netcdf/examples/files.html

For GeoTIFF I found https://github.com/GeoTIFF/test-data. Can we select a subset of these files that covers most of the common use cases?