Closed mgunyho closed 1 month ago
This looks like a documentation bug: we can't really sort non-string names alphabetically, so we should remove that claim instead. PRs welcome!
Makes sense, I was also a bit surprised to find this inconsistent behavior discussed in that issue comment.
I suppose the correct wording would be something like "the dimensions are in the order in which they appear in the DataArrays in the dataset"? This seems to be the behavior, based on trying different orders of the dictionary elements in this example:
import xarray as xr

ds = xr.Dataset({
    "foo": xr.DataArray(coords=[("x", [1, 2, 3]), ("y", [1, 2, 3])]),
    "bar": xr.DataArray(coords=[("y", [1, 2, 3]), ("x", [1, 2, 3])]),
    "baz": xr.DataArray(coords=[("x", [1, 2, 3])]),
    "qux": xr.DataArray(coords=[("y", [1, 2, 3])]),
})
print(ds.to_dataframe())
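Rather than reading the level order off the printed frame, it can be checked directly from the index names. The following is only a sketch: `index_level_order` is a hypothetical helper name, and the small dataset with explicit NumPy data is made up for illustration rather than taken from the example above.

```python
import numpy as np
import xarray as xr


def index_level_order(ds: xr.Dataset) -> list:
    """Return the index level names produced by Dataset.to_dataframe()."""
    # Hypothetical helper, not part of xarray.
    return list(ds.to_dataframe().index.names)


# Illustrative data: two variables that list the same dimensions in opposite order.
coords = {"x": [1, 2, 3], "y": [10, 20]}
ds = xr.Dataset(
    {
        "foo": xr.DataArray(np.zeros((3, 2)), dims=("x", "y"), coords=coords),
        "bar": xr.DataArray(np.zeros((2, 3)), dims=("y", "x"), coords=coords),
    }
)

levels = index_level_order(ds)
print("index levels: ", levels)
print("Dataset.sizes:", list(ds.sizes))
```

Swapping the order of the dictionary entries and re-running makes it easy to compare the two printed lists.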
We used to sort dimension names in Dataset.dims, which in turn were used by DataFrame levels. This is no longer the case: https://github.com/pydata/xarray/pull/4753
So yes, this is definitely worthy of updating/fixing the documentation!
> I suppose the correct wording would be something like "the dimensions are in the order in which they appear in the DataArrays in the dataset"? This seems to be the behavior, based on trying different orders of the dictionary elements in this example:
I would say: dimensions appear in the same order as Dataset.sizes (which is also the order of appearance on variables).
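If the default follows Dataset.sizes, anyone who relied on the alphabetical order described in the docs can still ask for it explicitly through the `dim_order` argument of `Dataset.to_dataframe()`. A minimal sketch, assuming dimension names that can be compared after conversion with `str` (the `key=str` choice and the toy dataset are illustrative assumptions, not something proposed in this thread):

```python
import numpy as np
import xarray as xr

# Toy dataset whose only variable lists "y" before "x".
ds = xr.Dataset(
    {"bar": (("y", "x"), np.arange(6).reshape(2, 3))},
    coords={"x": [1, 2, 3], "y": [10, 20]},
)

# Sort the dimension names explicitly; key=str is an assumption that
# sidesteps comparing non-string names with each other.
alphabetical = sorted(ds.sizes, key=str)

df = ds.to_dataframe(dim_order=alphabetical)
print(list(df.index.names))
```

Passing the order explicitly keeps the result independent of how the variables happen to be arranged in the dataset.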
What happened?
Hi, I noticed that the documentation for Dataset.to_dataframe() says that "by default, dimensions are sorted alphabetically". This is in contrast with DataArray.to_dataframe(), where the order is given by the order of the dimensions in the DataArray, which was discussed in this comment.
However, it appears that Dataset.to_dataframe() doesn't in fact sort the dimensions alphabetically with this example on current main 8f6e45ba:
I get
What did you expect to happen?
The dimensions in the output should be sorted alphabetically, like this:
Minimal Complete Verifiable Example