xarray supports coordinate arrays for indexing and selecting data.
The MS index columns (TIME, ANTENNA1, ANTENNA2, DATA_DESC_ID, etc.) are appropriate to use as coordinate arrays.
It's trivial to convert them from standard xarray arrays to xarray coordinates via Dataset.set_coords.
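For illustration, here is a minimal sketch of that conversion. It builds a tiny in-memory dataset by hand rather than reading a real Measurement Set. The `row` and `chan` dimension names, the `DATA` variable and all of the values are assumptions made for the example, not something xarray-ms is confirmed to produce.

```python
import numpy as np
import xarray as xr

nrow = 6

# Hypothetical stand-in for a dataset produced from an MS:
# every column starts out as an ordinary data variable.
ds = xr.Dataset(
    {
        "TIME": (("row",), np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])),
        "ANTENNA1": (("row",), np.array([0, 0, 1, 0, 0, 1])),
        "ANTENNA2": (("row",), np.array([1, 2, 2, 1, 2, 2])),
        "DATA_DESC_ID": (("row",), np.zeros(nrow, dtype=np.int32)),
        "DATA": (("row", "chan"), np.ones((nrow, 4), dtype=np.complex64)),
    }
)

# Promote the index columns to (non-index) coordinates.
ds = ds.set_coords(["TIME", "ANTENNA1", "ANTENNA2", "DATA_DESC_ID"])

# The coordinates can then drive a selection, here baseline (0, 2).
mask = (ds.ANTENNA1 == 0) & (ds.ANTENNA2 == 2)
selected = ds.isel(row=np.flatnonzero(mask.values))
```

Here the coordinates are numpy-backed, so building the mask is immediate. With dask-backed coordinates, `mask.values` would trigger a compute.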
It might be useful to support this as standard practice in xarray-ms, but at some point a decision may need to be made as to whether to support this via numpy or dask arrays.
numpy arrays avoid the above issues but could be very large given the data volumes we are now encountering.
@bennahugo and @mulan-94, before deciding that 4. is a good idea, I'd be interested in hearing your experiences with 1, 2 and 3, if you have needed to use the xarray coordinate functionality for performing selections.
I have a suggestion also. It would be cool if the baselines could have an ID as well. So the index columns could be like TIME, ANTENNA1, ANTENNA2, DATA_DESC_ID and BASELINE_ID :)
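As a rough sketch of what such an ID could look like, one option is to number the unique (ANTENNA1, ANTENNA2) pairs. This numbering is only an assumption for illustration, not an agreed definition of BASELINE_ID, and the antenna values are made up.

```python
import numpy as np

# Hypothetical per-row antenna indices.
antenna1 = np.array([0, 0, 1, 0, 0, 1])
antenna2 = np.array([1, 2, 2, 1, 2, 2])

# Stack into (row, 2) pairs and number each unique pair.
pairs = np.stack([antenna1, antenna2], axis=1)
unique_baselines, baseline_id = np.unique(pairs, axis=0, return_inverse=True)

# Flatten, since the shape of the inverse differs between numpy versions.
baseline_id = baseline_id.reshape(-1)
```

The resulting `baseline_id` has one entry per row and could be attached as a further coordinate alongside the other index columns.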