bopen / xarray-sentinel

Xarray backend to Copernicus Sentinel-1 satellite data products
Apache License 2.0
xr.open_dataset user experience on S1-SLC #4

Closed corrado9999 closed 2 years ago

corrado9999 commented 3 years ago

This issue analyses the possible behaviours of xr.open_dataset when called on Sentinel-1 SLC data.

corrado9999 commented 3 years ago

For acquisition modes with multiple bursts (all but stripmap), bursts "live" in separate spaces, because they differ in at least the range or the azimuth dimension. Thus, we cannot put them in a single dataset. Once Xarray devises a data structure that can hold multiple datasets we will be able to exploit it but, for the moment, we will expose each burst as one group.

We are left with the problem of how to tell the user which groups are available.

<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes: (12/13)
    ...                         ...
    groups:                     ['orbit', 'IW1', 'IW2', 'IW3']
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-2-bfe1a817db51> in <module>
----> 1 xr.open_dataset("tests/data/S1B_IW_SLC__1SDV_20210401T052622_20210401T052650_026269_032297_EFA4.SAFE")

~\miniconda3\envs\xr-sentinel\lib\site-packages\xarray\backends\api.py in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, backend_kwargs, *args, **kwargs)
    507
    508     overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 509     backend_ds = backend.open_dataset(
    510         filename_or_obj,
    511         drop_variables=drop_variables,

~\xarray-sentinel\xarray_sentinel\sentinel1.py in open_dataset(self, filename_or_obj, drop_variables, group)
    155     ) -> xr.Dataset:
    156         if group is None:
--> 157             raise NotImplementedError("Cannot access to root dataset, please select one of the groups: orbit, IW1/1, IW1/2, ..., IW2/1, IW2/2, ..., IW3/1, IW3/2, ...")
    158         elif group == "gcp":
    159             ds = open_gcp_dataset(filename_or_obj)

NotImplementedError: Cannot access to root dataset, please select one of the groups: orbit, IW1/1, IW1/2, ..., IW2/1, IW2/2, ..., IW3/1, IW3/2, ...
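
As a rough illustration of the error-based approach shown in the traceback, the group names can be generated from the swaths and their burst counts and reported when no group is selected. This is only a sketch: `list_groups`, `select_group` and the burst counts are hypothetical and are not taken from xarray-sentinel.

```python
def list_groups(burst_counts):
    """Build group names such as 'orbit', 'IW1/1', 'IW1/2', ... (hypothetical helper)."""
    groups = ["orbit"]
    for swath, count in burst_counts.items():
        # Bursts are numbered from 1 within each swath, as in the error message above.
        groups.extend(f"{swath}/{index}" for index in range(1, count + 1))
    return groups


def select_group(burst_counts, group=None):
    """Validate the requested group, listing the alternatives on failure."""
    groups = list_groups(burst_counts)
    if group is None:
        raise NotImplementedError(
            "Cannot access the root dataset, please select one of the groups: "
            + ", ".join(groups)
        )
    if group not in groups:
        raise ValueError(
            f"Unknown group {group!r}, available groups: {', '.join(groups)}"
        )
    return group


# Assumed burst counts, for illustration only.
example_counts = {"IW1": 2, "IW2": 2, "IW3": 2}
print(list_groups(example_counts))
```

Listing every group explicitly, rather than eliding with "...", lets the user copy a valid name straight from the message.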
alexamici commented 3 years ago

I just realised there is a possible option 4, based on a dataset accessor:

Example usage:

>>> slc = xr.open_dataset(".../manifest.safe", engine="sentinel1")
>>> list(slc.sentinel.swaths)
["IW1", "IW2", "IW3"]
>>> iw1 = slc.sentinel.swaths["IW1"]
>>> iw1
<xarray.Dataset>
...
>>> list(iw1.sentinel.bursts)
["N430_W0120_VV", ...]
>>> iw1.sentinel.bursts["N430_W0120_VV"]
<xarray.Dataset>
...
>>>

Or for a flatter experience:

>>> slc = xr.open_dataset(".../manifest.safe", engine="sentinel1")
>>> list(slc.sentinel.dataset)
["IW1/orbit", "IW1/gcp", "IW1/N430_W0120_VV", ...]
>>> slc.sentinel.dataset["IW1/N430_W0120_VV"]
<xarray.Dataset>
...
>>>
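
The nested and the flat layouts carry the same information, so one can be derived from the other. A minimal sketch, assuming a plain dict of swath name to dataset names; `flatten` and `lookup` are hypothetical helpers, not part of any existing API:

```python
def flatten(nested):
    """Turn {'IW1': ['orbit', ...]} into flat keys such as 'IW1/orbit'."""
    keys = []
    for swath, names in nested.items():
        keys.extend(f"{swath}/{name}" for name in names)
    return keys


def lookup(nested, key):
    """Split a flat key back into (swath, name), checking that it exists."""
    swath, _, name = key.partition("/")
    if name not in nested.get(swath, []):
        raise KeyError(key)
    return swath, name


# Names taken from the examples above; the structure itself is an assumption.
nested = {"IW1": ["orbit", "gcp", "N430_W0120_VV"]}
print(flatten(nested))
print(lookup(nested, "IW1/N430_W0120_VV"))
```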
corrado9999 commented 3 years ago

Option 4 looks nice, but AFAIK it has one main drawback: the accessor will be present on every dataset, independently of whether it has been opened with xarray-sentinel or not, which looks pretty odd.
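
The drawback can be reproduced with a toy accessor. This is a sketch under assumptions: `SentinelAccessor` and its `swaths` property are made up for illustration, and only `xr.register_dataset_accessor` is the real xarray API.

```python
import xarray as xr


@xr.register_dataset_accessor("sentinel")
class SentinelAccessor:
    """Toy accessor (hypothetical); it reads swath names from a 'groups' attribute."""

    def __init__(self, dataset):
        self._dataset = dataset

    @property
    def swaths(self):
        groups = self._dataset.attrs.get("groups", [])
        return [name for name in groups if name != "orbit"]


# A dataset that mimics the root dataset of a Sentinel-1 product.
opened = xr.Dataset(attrs={"groups": ["orbit", "IW1", "IW2", "IW3"]})
print(opened.sentinel.swaths)

# A dataset that has nothing to do with Sentinel-1 still exposes the accessor.
unrelated = xr.Dataset({"temperature": ("x", [1.0, 2.0, 3.0])})
print(hasattr(unrelated, "sentinel"))
print(unrelated.sentinel.swaths)
```

Registration is global to the xarray session, so the accessor cannot be limited to datasets opened by a given backend; it can only check the dataset's contents when it is accessed.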

alexamici commented 3 years ago

W00t! You are right. That makes option 4 quite ugly, indeed.

alexamici commented 2 years ago

The current agreement with @aurghs is to map the data onto the group option of open_dataset and to keep the layout as close as possible to the original product structure. The structure is described in https://github.com/bopen/xarray-sentinel/blob/main/docs/DATATREE.md

Once xarray solves https://github.com/pydata/xarray/issues/4118 we may reassess (tracked by #60).

I propose closing this issue.