casangi / xradio

Xarray Radio Astronomy Data IO
https://xradio.readthedocs.io/en/latest/
Other
12 stars 7 forks source link

MeasurementSetXds has a .sel() method that shadows Dataset.sel() with a different interface #293

Open v-morello opened 4 weeks ago

v-morello commented 4 weeks ago

Helllo there,

I upgraded to v0.0.42 today, and I was surprised that the following code snippet does not work anymore on a MeasurementSetXds:

xds.sel({"polarization": ["XX", "YY"]})

yields:

     67 if data_group_name is not None:
     68     sel_data_group_set = set(
---> 69         self.attrs["data_groups"][data_group_name].values()
     70     )
     72     data_variables_to_drop = []
     73     for dg in self.attrs["data_groups"].values():

TypeError: unhashable type: 'dict'

The reason is that I think I am calling Dataset.sel, but xradio forces me to call MeasurementSetXds.sel() which has a different interface and use case:

def sel(self, data_group_name=None, **kwargs):

I assume this is accidental, but anyway this method needs to have a different name, overriding any of the fundamental Dataset methods is not something we can afford in general.

v-morello commented 4 weeks ago

I had a look at the code and the core problem is that MeasurementSetXds.sel() does two things instead of one:

  1. Optionally, keep only the data variables from the data group you want to select, data group being a concept specific to xradio.
  2. Call Dataset.sel()

Suggested solution: get rid of that specialized sel(), instead there should be a select_data_group() method that does only that, and that can be chained independently with Dataset.sel(). Both of these calls should then work:

xds.select_data_group("corrected").sel({"polarization": ["XX", "YY"]})
xds.sel({"polarization": ["XX", "YY"]}).select_data_group("corrected")