xcube server STAC implementation, further information needed to access data

konstntokas commented 4 months ago

Is your feature request related to a problem? Please describe.

So far, the xcube server STAC implementation does not expose the data store parameters required to access the data. A client needs them to open the data linked by an item's asset in the STAC catalog. For comparison, see xcube viewer's data access snippet below:

from xcube.core.store import new_data_store

store = new_data_store(
    "s3",
    root="datasets",  # can also use "pyramids" here
    storage_options={
        "anon": True,
        "client_kwargs": {
            "endpoint_url": "http://localhost:8080/s3"
        }
    }
)
# store.list_data_ids()
dataset = store.open_data(data_id="zarr_file.zarr")

Describe the solution you'd like

Add a new field to the asset, called something like "xcube:open_kwargs". We can take inspiration from the item https://planetarycomputer.microsoft.com/api/stac/v1/collections/era5-pds/items/era5-pds-1980-01-fc, which stores extra information for opening the data in "xarray:open_kwargs". Similarly, we could add:

"xcube:open_kwargs": {
    "root": "datasets",
    "endpoint_url": "http://localhost:8080/s3"
}
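A client could expand these two values into the full store parameters used in the snippet above. This is a minimal sketch, assuming the field ends up being called "xcube:open_kwargs" and that anonymous access is acceptable; the helper name `store_params_from_open_kwargs` is hypothetical and not part of xcube:

```python
def store_params_from_open_kwargs(open_kwargs: dict) -> dict:
    """Expand the proposed compact field into keyword arguments for
    new_data_store("s3", ...). Hypothetical helper, not part of xcube."""
    return {
        "root": open_kwargs["root"],
        "storage_options": {
            # Assumption: the server's S3 API requires no credentials.
            "anon": True,
            "client_kwargs": {"endpoint_url": open_kwargs["endpoint_url"]},
        },
    }


open_kwargs = {"root": "datasets", "endpoint_url": "http://localhost:8080/s3"}
store_params = store_params_from_open_kwargs(open_kwargs)

# With xcube installed, the result could then be passed on:
#   from xcube.core.store import new_data_store
#   store = new_data_store("s3", **store_params)
```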

forman commented 4 months ago

> Similarly, we could add

We should provide the parameters that allow users to open the data with the xcube data store framework. Therefore, we need the following information:

data_store_id
data_store_params
data_id
open_data_params

If we stick to the datasets published by the same xcube Server instance that also provides the STAC API, we may boil it down to just the S3 API parameters.
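To illustrate how a client might consume these four pieces of information, here is a rough sketch. The "xcube:*" field names and the helper `open_from_asset` are assumptions made for this example only. The store factory is passed in as an argument so that the sketch does not depend on an xcube installation:

```python
from typing import Any, Callable


def open_from_asset(asset: dict, new_data_store: Callable[..., Any]) -> Any:
    """Open the data an asset points to via the data store framework.

    The "xcube:*" field names are assumptions for illustration only.
    """
    # Create the store from its identifier and parameters.
    store = new_data_store(
        asset["xcube:data_store_id"],
        **asset.get("xcube:data_store_params", {}),
    )
    # Open the dataset by its identifier, with optional open parameters.
    return store.open_data(
        asset["xcube:data_id"],
        **asset.get("xcube:open_data_params", {}),
    )


# Example asset carrying the four fields (all values are illustrative).
example_asset = {
    "href": "http://localhost:8080/s3/datasets/zarr_file.zarr",
    "xcube:data_store_id": "s3",
    "xcube:data_store_params": {
        "root": "datasets",
        "storage_options": {
            "anon": True,
            "client_kwargs": {"endpoint_url": "http://localhost:8080/s3"},
        },
    },
    "xcube:data_id": "zarr_file.zarr",
    "xcube:open_data_params": {},
}
```

With xcube installed, a client would import `new_data_store` from `xcube.core.store` and call `open_from_asset(example_asset, new_data_store)`.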

konstntokas commented 4 months ago

With PR #1029, I can now read zarr, levels, geotiff and cog-geotiff files from the STAC catalog published by xcube server. Two questions remain:

1. When starting the server, all files are published as zarrs, see xcube/webapi/ows/stac/controllers.py#L800. Why is that? This also means that a levels file is presented as a dataset instead of an mldataset.
2. When adding a netcdf file to the server configuration, something goes wrong, and the viewer does not show the datasets. I see that cube.nc is not assigned in examples/serve/demo/config.yml. Does this mean that the server cannot publish netcdf files?

konstntokas commented 4 months ago

PR #1029 has been converted to a draft. First, the MIME type and format extension of the dataset will be added to the asset. So far, all files are published as zarrs.

konstntokas commented 4 months ago

The xcube server publishes each dataset as .zarr and .levels on the s3/datasets and s3/pyramids endpoints, respectively. Two assets will be published, namely analytic (the asset as before) and analytic_multires, linking to the dataset and the multi-level dataset, respectively.
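As a rough illustration of what the two assets of an item could look like, consider the sketch below. The helper `build_assets`, the exact href layout, the media type and the dataset identifier "demo" are assumptions for this example, not the actual server code:

```python
def build_assets(server_url: str, dataset_id: str) -> dict:
    """Return the two assets described above for one dataset.

    Hypothetical helper; href layout and media type are assumptions.
    """
    s3_url = f"{server_url}/s3"
    return {
        # Single-resolution dataset served from the "datasets" bucket.
        "analytic": {
            "href": f"{s3_url}/datasets/{dataset_id}.zarr",
            "type": "application/zarr",  # assumed media type
            "roles": ["data"],
        },
        # Multi-level dataset served from the "pyramids" bucket.
        "analytic_multires": {
            "href": f"{s3_url}/pyramids/{dataset_id}.levels",
            "type": "application/zarr",  # assumed media type
            "roles": ["data"],
        },
    }


assets = build_assets("http://localhost:8080", "demo")
```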

xcube-dev / xcube

xcube server STAC implementation, further information needed to access data #1020