xcube-dev / xcube

xcube is a Python package for generating and exploiting data cubes powered by xarray, dask, and zarr.

xcube server STAC implementation, further information needed to access data #1020

Open konstntokas opened 1 week ago

konstntokas commented 1 week ago

Is your feature request related to a problem? Please describe.

So far, the xcube server STAC implementation does not expose the data store parameters needed to access the data. A client needs them to open the data linked by an item's asset in the STAC catalog. See xcube viewer's data access snippet below:

from xcube.core.store import new_data_store

store = new_data_store(
    "s3",
    root="datasets",  # can also use "pyramids" here
    storage_options={
        "anon": True,
        "client_kwargs": {
            "endpoint_url": "http://localhost:8080/s3"
        },
    },
)
# store.list_data_ids()
dataset = store.open_data(data_id="zarr_file.zarr")

Describe the solution you'd like

Add a new field to the asset, called something like "xcube:open_kwargs". We can take inspiration from the item https://planetarycomputer.microsoft.com/api/stac/v1/collections/era5-pds/items/era5-pds-1980-01-fc, which stores extra information for opening the data in "xarray:open_kwargs". Similarly, we could add

xcube:open_kwargs = dict(

forman commented 3 days ago

> Similarly, we could add

We should provide the parameters that allow users to open the data with the xcube data store framework. Therefore, we need the following information:

If we stick to the datasets published by the same xcube Server instance that also provides the STAC API, we may boil it down to just the S3 API parameters.
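To make the idea concrete, here is a minimal sketch of how an asset could carry the S3 parameters and how a client could turn them back into data store arguments. The layout of "xcube:open_kwargs" (the keys `store_id`, `store_params`, and `data_id`) and the helper `store_args_from_asset` are assumptions for illustration only, not an agreed design; the values are taken from the viewer snippet above.

```python
# Hypothetical STAC asset; the "xcube:open_kwargs" field and its keys
# are assumptions for illustration, not an existing xcube server output.
asset = {
    "href": "http://localhost:8080/s3/datasets/zarr_file.zarr",
    "xcube:open_kwargs": {
        "store_id": "s3",
        "store_params": {
            "root": "datasets",
            "storage_options": {
                "anon": True,
                "client_kwargs": {"endpoint_url": "http://localhost:8080/s3"},
            },
        },
        "data_id": "zarr_file.zarr",
    },
}


def store_args_from_asset(asset):
    """Return (store_id, store_params, data_id) from the hypothetical field."""
    open_kwargs = asset.get("xcube:open_kwargs")
    if open_kwargs is None:
        raise KeyError("asset has no 'xcube:open_kwargs' field")
    return (
        open_kwargs["store_id"],
        dict(open_kwargs["store_params"]),
        open_kwargs["data_id"],
    )


store_id, store_params, data_id = store_args_from_asset(asset)

# With xcube installed and the server running, a client could then do:
# from xcube.core.store import new_data_store
# store = new_data_store(store_id, **store_params)
# dataset = store.open_data(data_id)
```

Keeping the store identifier, the store parameters, and the data identifier as separate keys mirrors the two calls a client has to make: `new_data_store(...)` and `store.open_data(...)`.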

konstntokas commented 9 hours ago

With PR #1029, I can now read Zarr, levels, GeoTIFF and COG GeoTIFF files from the STAC catalog published by xcube server. Two questions remain:

  1. When starting the server, all files are published as Zarr. See xcube/webapi/ows/stac/controllers.py#L800. Why is that? This also means that the levels file is presented as a dataset instead of an mldataset.
  2. When adding a NetCDF file to the server configuration, something goes wrong, and the viewer does not show the datasets. I see in examples/serve/demo/config.yml that cube.nc is not assigned. Does this mean that the server cannot publish NetCDF files?
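For question 1, the distinction I have in mind could be sketched like this. It is only an illustration of deriving the data type from the data identifier; the helper `guess_data_type` is hypothetical and is not what controllers.py currently does.

```python
def guess_data_type(data_id: str) -> str:
    """Guess the xcube data type alias from a data identifier.

    Assumption for illustration: identifiers ending in ".levels" are
    multi-resolution datasets, everything else is a plain dataset.
    """
    if data_id.lower().endswith(".levels"):
        return "mldataset"
    return "dataset"


# Hypothetical data identifiers, used only to show the intended behavior.
for data_id in ("zarr_file.zarr", "pyramid.levels"):
    print(data_id, "->", guess_data_type(data_id))
```

A client could use such a hint to decide whether to open an asset as a dataset or as an mldataset.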