bopen / xarray-sentinel

Xarray backend to Copernicus Sentinel-1 satellite data products
Apache License 2.0

Add an API to access as much original metadata as possible #7

Open alexamici opened 3 years ago

alexamici commented 3 years ago

Within xarray there is no easy way to expose the full XML metadata in the files inside the annotation folder.

Options:

Note that I didn't find any XSD for manifest.safe, so we may need to keep the option to return an ElementTree representation of the XML.
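A minimal sketch of that ElementTree fallback, using only the standard library. The XML string is a made-up, heavily simplified stand-in for a manifest.safe file, and the namespace URI and element names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for manifest.safe; a real manifest is much larger.
MANIFEST_XML = """\
<xfdu:XFDU xmlns:xfdu="urn:example:xfdu">
  <dataObjectSection>
    <dataObject ID="product_annotation">
      <byteStream>
        <fileLocation href="./annotation/example.xml"/>
      </byteStream>
    </dataObject>
  </dataObjectSection>
</xfdu:XFDU>
"""

# Without an XSD there is nothing to validate against, so the caller
# simply receives the parsed tree and navigates it directly.
root = ET.fromstring(MANIFEST_XML)

# Collect the href of every fileLocation element, at any depth.
hrefs = [loc.attrib["href"] for loc in root.iter("fileLocation")]
print(hrefs)
```

For a file on disk, `ET.parse(path).getroot()` would return the same kind of object.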

aurghs commented 3 years ago
  • add an entry to .attrs that contains the return value of xmlschema.XMLSchema(schema_path).to_dict(annotation_path)

If we store a dictionary in .attrs, the dataset will no longer be serializable to netCDF.
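One possible workaround, sketched here as an assumption rather than anything the project adopted, is to store the nested metadata as a JSON string, since netCDF attributes can hold strings and numbers but not nested dictionaries. The plain `attrs` dict below stands in for `Dataset.attrs`; the key name `annotation_json` and the metadata values are made up.

```python
import json

# Made-up nested metadata, shaped like the output of an XML-to-dict conversion.
annotation = {"product": {"adsHeader": {"missionId": "S1A", "polarisation": "VV"}}}

# Stand-in for ds.attrs; a real Dataset.attrs is also a plain mapping.
attrs = {}

# Encode the nested dict as a single string, which is a valid attribute value.
attrs["annotation_json"] = json.dumps(annotation)

# Decode on demand when the structured form is needed again.
restored = json.loads(attrs["annotation_json"])
print(restored["product"]["adsHeader"]["missionId"])
```

The cost is that users see an opaque string in `.attrs` and must decode it themselves.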

corrado9999 commented 3 years ago

As already noted in https://github.com/bopen/xarray-sentinel/issues/4#issuecomment-818500175, the xarray accessor would be present in every dataset, not only those opened through xarray-sentinel.

Nonetheless, I believe an accessor-like interface gives the best user experience. Could we just attach a (Python) attribute to the dataset? You would lose it if you dump the dataset to netCDF (or any other format), but I think all the other options would have the same problem.

alexamici commented 3 years ago

I agree, the accessor is the only way to go, especially if we want to support specialised exploration APIs that perform possibly slow operations (for example when data is accessed over the network).

corrado9999 commented 3 years ago

Mmm... we were probably not talking about the same thing. By "accessor-like interface" I meant something that acts like an accessor (you just call e.g. ds.sentinel1) but is not one, just an attribute attached to the object. I admit it is not very nice, but having the sentinel accessor on every dataset is really a no-go for me.

alexamici commented 3 years ago

AttributeError: cannot set attribute 'sentinel1' on a 'Dataset' object.

I just tried: you cannot add a new attribute to a Dataset. I think that is because data variables and coordinates are already accessed as attributes (e.g. ds.latitude).

Looks like the accessor API is the only way to extend xarray objects.
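For reference, a minimal sketch of both behaviours discussed so far. `xr.register_dataset_accessor` is xarray's documented extension hook. The accessor class, the `annotation_path` key in `encoding`, and the `my_metadata` attribute name are hypothetical and are not the xarray-sentinel implementation.

```python
import xarray as xr


@xr.register_dataset_accessor("sentinel1")
class Sentinel1Accessor:
    """Hypothetical accessor, for illustration only."""

    def __init__(self, dataset):
        self._ds = dataset

    @property
    def annotation_path(self):
        # Assumption: the backend recorded the path in ``encoding`` on open.
        try:
            return self._ds.encoding["annotation_path"]
        except KeyError:
            raise ValueError("dataset was not opened from a Sentinel-1 product") from None


# Plain attribute assignment on a Dataset is rejected.
plain = xr.Dataset()
assignment_error = None
try:
    plain.my_metadata = {"key": "value"}
except AttributeError as exc:
    assignment_error = exc

# The accessor is available on every Dataset, so it has to detect
# datasets that did not come from the Sentinel-1 backend.
opened = xr.Dataset()
opened.encoding["annotation_path"] = "annotation/example.xml"
print(opened.sentinel1.annotation_path)
```

Here `plain.sentinel1` still exists, which is the concern raised above; the sketch only turns it into an explicit `ValueError` when the property is read.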

alexamici commented 3 years ago

Thinking more about the accessor-interface I'm now of the opinion that there's no clean way to add the exploration API to the Dataset object because it is not the right place for it.

For what it is worth, my suggestion is to go with option 3 above:

provide metadata exploration API functions in xarray_sentinel
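A rough sketch of what such a function-based API could look like, using only the standard library. The names `read_annotation` and `element_to_dict` are hypothetical and not part of xarray_sentinel, the sample XML is made up, and the conversion ignores XML attributes for brevity.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Convert an element into nested dicts; leaf elements become their text."""
    children = list(element)
    if not children:
        return element.text
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Repeated tags are collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def read_annotation(path):
    """Hypothetical module-level function: parse one annotation XML file."""
    root = ET.parse(path).getroot()
    return {root.tag: element_to_dict(root)}


# Usage with a tiny made-up annotation file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "annotation.xml")
    with open(path, "w") as f:
        f.write("<product><adsHeader><missionId>S1A</missionId></adsHeader></product>")
    metadata = read_annotation(path)

print(metadata)
```

Because the function takes a path rather than a Dataset, nothing has to be attached to xarray objects, and the result does not need to survive serialization.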