pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0
3.63k stars 1.09k forks source link

Taking a list of `ds.dims` does not raise warning #9799

Closed hoxbro closed 2 days ago

hoxbro commented 2 days ago

What is your issue?

Take the following example:

import xarray as xr
ds = xr.tutorial.open_dataset('air_temperature')

ds.dims["time"]  # Emits: The return type of `Dataset.dims` will be changed to return a set of ...
list(ds.dims)  # Does not emit a warning

I think __iter__ should also emit a warning, but I'm unsure if the decision not to do it is deliberate.

keewis commented 2 days ago

I think that's deliberate: we're trying to transition ds.dims from a dict to a set. The warning should thus only be emitted when trying to interact with ds.dims as a dict (such as when accessing the data through __getitem__), while __iter__ is also defined on a set and thus will continue to work after completing the deprecation cycle – although the question of reproducing the ordering remains for sets.

hoxbro commented 2 days ago

The keys in a dict are ordered, so if I do something like list(ds.dims)[0] or a variation of this, it will break (silently) when the switch happens.

But if this has already been discussed and a decision has been made, you can close the issue :+1:

shoyer commented 2 days ago

In principle, we could return an ordered set object from ds.dims in the future. We already use own wrapper objects for other dict-like properties in Xarray, mostly to add immutability.

TomNicholas commented 2 days ago

The relevant PR was https://github.com/pydata/xarray/pull/8500. Which was almost a year ago, so maybe we should think about completing this deprecation cycle soon? I believe @keewis is correct, but agree with @shoyer that we might actually prefer to return an OrderedSet.

keewis commented 2 days ago

I also think we should use a OrderedSet, as otherwise maintaining doctests for Dataset will become very tricky (plus I recently learned that we use ds.dims as the default dimension order in Dataset.to_dataframe, so this definitely shouldn't be effectively random)

TomNicholas commented 2 days ago

If we're in agreement then I think the original complaint won't apply and we can close this?