casangi / xradio

Xarray Radio Astronomy Data IO
https://xradio.readthedocs.io/en/latest/

The processing set partition cannot be written back to disk using `xarray.Dataset.to_zarr` #216

Open maneesh29s opened 1 month ago

maneesh29s commented 1 month ago

When a processing set is read using read_processing_set, we get one xarray Dataset per partition.

Each dataset contains the data from the MAIN zarr group. The dataset may also contain data from ANTENNA and POINTING groups, stored as attributes. In all our test datasets, both antenna_xds and pointing_xds were present as attributes.

Xarray's Dataset has a to_zarr() method, which stores the dataset in the zarr format. Calling to_zarr() on any of the datasets returned by read_processing_set raises this error:

TypeError: Invalid attribute in Dataset.attrs.

maneesh29s commented 1 month ago

To explore this issue further, we used xarray.open_zarr() to read only the MAIN zarr group inside one partition of the processing set. This also returned an xarray Dataset object, but without the nested antenna_xds and pointing_xds attributes. Calling to_zarr() on this MAIN dataset wrote it to a zarr store successfully.
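
The comparison above suggests that the nested antenna_xds and pointing_xds attributes are what to_zarr() rejects. The following is a minimal sketch for finding such attributes, assuming that zarr attributes must be JSON-serializable. It uses the standard-library json module as a rough stand-in for the real encoder, so it is stricter than xarray and zarr for some values (for example NumPy scalars). The helper name split_attrs, the NestedDataset stand-in class, and the "note" key are hypothetical and are not part of xradio or xarray.

```python
import json


def split_attrs(attrs):
    """Split an attrs mapping into JSON-serializable entries and the rest.

    Hypothetical helper: json.dumps is used only as an approximation of
    what a zarr store can hold as an attribute.
    """
    writable, rejected = {}, {}
    for name, value in attrs.items():
        try:
            json.dumps(value)
        except (TypeError, ValueError):
            # Not JSON-serializable, e.g. a nested Dataset object.
            rejected[name] = value
        else:
            writable[name] = value
    return writable, rejected


class NestedDataset:
    """Hypothetical stand-in for an xarray Dataset stored as an attribute."""


# Illustrative attrs only; a real partition's attrs will contain other keys.
example_attrs = {
    "note": "hypothetical metadata",
    "antenna_xds": NestedDataset(),
    "pointing_xds": NestedDataset(),
}

writable, rejected = split_attrs(example_attrs)
print(sorted(writable))   # names of attributes that JSON can encode
print(sorted(rejected))   # names of attributes that JSON cannot encode
```

With a real partition dataset, the same helper could be applied to its attrs. A copy of the dataset whose attrs hold only the writable entries may then be accepted by to_zarr(), and the rejected nested datasets could be written separately, for example with the group argument of to_zarr(). This has not been verified against xradio.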