Open momchil-flex opened 2 years ago
Thanks for the report @momchil-flex. That's definitely a regression.
However, I wonder what we should do: deprecate interpreting tuples as sequences and always treat them as "scalar" values, or continue interpreting them differently depending on the case?
For example, tuple indexer values were (and still are) assumed to be single-element values when selecting on a dimension coordinate with a multi-index (although the multi-index dimension coordinate might eventually be deprecated in xarray):
import xarray as xr

da = xr.DataArray(
    data=range(3),
    dims="x",
    coords={"a": ("x", ["a", "a", "c"]), "b": ("x", [0, 1, 2])},
).set_index(x=["a", "b"])
da
# <xarray.DataArray (x: 3)>
# array([0, 1, 2])
# Coordinates:
# * x (x) object MultiIndex
# * a (x) <U1 'a' 'a' 'c'
# * b (x) int64 0 1 2
da.sel(x=("a", 1))
# <xarray.DataArray ()>
# array(1)
# Coordinates:
# x object ('a', 1)
# a <U1 'a'
# b int64 1
Pros of always treating a tuple as a 1-element indexer value:
Cons:
PandasIndex and PandasMultiIndex built-in Xarray, we have no control on 3rd party indexes. Unless we somehow formalize the semantics of the indexer values passed in .sel(), but this could be challenging as there could be many kinds of indexers (scalar types, tuples, lists, slices, numpy arrays, xarray Variable or DataArray objects, etc.).
I like the idea of just passing tuples through and letting the index deal with it. Just like a MultiIndex, there may be other cases where this makes sense.
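To make the "let the index deal with it" idea concrete, here is a toy sketch in plain Python. The classes `ToyFlatIndex` and `ToyMultiIndex` and the function `select` are hypothetical names invented for illustration; they do not mirror xarray's actual index API, and the choice that a flat index reads a tuple as a sequence is only one possible policy.

```python
# Toy sketch only: these classes are hypothetical and are not xarray's index API.

class ToyFlatIndex:
    """One label per position; here a tuple is read as a sequence of labels."""

    def __init__(self, labels):
        self._pos = {label: i for i, label in enumerate(labels)}

    def sel(self, label):
        if isinstance(label, (tuple, list)):
            return [self._pos[item] for item in label]
        return self._pos[label]


class ToyMultiIndex:
    """Each label is itself a tuple, so a tuple is read as one key."""

    def __init__(self, keys):
        self._pos = {tuple(key): i for i, key in enumerate(keys)}

    def sel(self, label):
        if isinstance(label, list):
            return [self._pos[tuple(item)] for item in label]
        return self._pos[label]


def select(index, label):
    # The caller does not inspect the label; the index decides what a tuple means.
    return index.sel(label)


flat = ToyFlatIndex([10, 20, 30])
multi = ToyMultiIndex([("a", 0), ("a", 1), ("c", 2)])
print(select(flat, (10, 30)))   # [0, 2]
print(select(multi, ("a", 1)))  # 1
```

The point of the sketch is only that the same tuple can legitimately mean different things to different indexes, so the dispatching code can stay agnostic.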
For the current PandasIndex, maybe we can raise a nicer error in .sel?
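As a rough idea of what such an error could look like, here is a minimal sketch in plain Python. The helper `check_label` and its `is_multi_index` flag are hypothetical and not part of xarray; the message wording is only a suggestion.

```python
def check_label(label, is_multi_index):
    """Hypothetical pre-check for an indexer value; not part of xarray."""
    if isinstance(label, tuple) and not is_multi_index:
        raise TypeError(
            f"got tuple {label!r} as an indexer for a non-multi-index coordinate; "
            f"to select several labels, pass a list instead: {list(label)!r}"
        )
    return label


# Lists and scalars pass through unchanged; a tuple triggers the hint.
check_label([1, 2], is_multi_index=False)
try:
    check_label((1, 2), is_multi_index=False)
except TypeError as err:
    print(err)
```

A tuple passed for a multi-index coordinate is returned untouched, which keeps the existing single-key behaviour shown earlier in the thread.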
What happened?
Version 2022.6.0 produces an error when I try something like data_array.sel(coordinate=(val1, val2)). Now this only works if the sequence values are provided as a list instead.
What did you expect to happen?
In previous versions, tuples could also be supplied. However, I've been digging into this a bit, and I understand that there are generally some limitations on using tuples (or rather, they are sometimes overloaded). For example, it seems that in any version, I can't use a tuple as an input coordinate to initialize a DataArray, as I get the error "Could not convert tuple of form (dims, data[, attrs, encoding])" (this is known). I wanted to report the current bug, however, since the behavior is different in 2022.6.0 compared to previous versions, and to clarify whether not supporting tuples as sel coordinates is expected or not. It is not very clear from the error message and from the docs. The example below works on < 2022.6.0 but raises an error on 2022.6.0.
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
No response
Anything else we need to know?
No response
Environment