pydata / xarray

N-D labeled arrays and datasets in Python
https://xarray.dev
Apache License 2.0

Ordered Groupby Keys #757

Open jhamman opened 8 years ago

jhamman commented 8 years ago

Currently, xarray's Groupby.groups property returns a standard (unordered) dictionary. This is fine for most cases but leads to odd orderings in use cases like this one, where I am using xarray's FacetGrid plotting:

plot_kwargs = dict(col='season', vmin=15, vmax=35, levels=12, extend='both')

da_obs = ds_obs.SALT.isel(depth=0).groupby('time.season').mean('time')
da_obs.plot(**plot_kwargs)

Note that MAM and JJA are out of order.

I think this could be easily fixed by using an OrderedDict in xarray.core.Groupby.groups.

shoyer commented 8 years ago

I agree this is annoying, but I don't think your diagnosis is correct here. The groups property isn't used by any internal routines AFAICT. The issue is that groups are sorted, but as text rather than ordered categorical -- notice that the labels are ordered alphabetically.

jhamman commented 8 years ago

Hmmm, a mystery. I'll look into this a bit more.

shoyer commented 8 years ago

For what it's worth, I don't think we have any good solutions short of adding our own array type to do Categorical in xarray. We could set sort=False in some cases when we call pd.factorize, but that's not a great alternative.
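
To illustrate the trade-off, here is a minimal sketch using pd.factorize directly on made-up season labels (the arrays below are illustrative, not taken from xarray's internals). With sort=True the labels come back alphabetically; with sort=False they come back in order of first appearance, which only matches the calendar order if the data happens to start in the right season.

```python
import numpy as np
import pandas as pd

# Hypothetical season labels for a series that starts in January.
seasons = np.array(
    ["DJF", "DJF", "MAM", "MAM", "MAM", "JJA", "SON"], dtype=object
)

# sort=True: unique labels are sorted as text (alphabetically).
_, sorted_labels = pd.factorize(seasons, sort=True)

# sort=False: unique labels are kept in order of first appearance.
_, appearance_labels = pd.factorize(seasons, sort=False)

# The same kind of data starting in March: order of appearance changes.
shifted = np.array(["MAM", "JJA", "SON", "DJF"], dtype=object)
_, shifted_labels = pd.factorize(shifted, sort=False)

print(list(sorted_labels))
print(list(appearance_labels))
print(list(shifted_labels))
```

So sort=False gives a data-dependent order rather than a guaranteed calendar order, which is presumably part of why it is not a great alternative to a real ordered categorical.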

tdihp commented 6 years ago

Ahh, so it's sorted, instead of keeping the original order.

I was expecting DataArray.groupby().reduce would work like np.apply_along_axis, and used the data of the result directly.

jbusecke commented 4 years ago

Just stumbled across this issue. Is there a recommended workaround?

I am usually doing this (specific to seasons):

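import xarray as xr

ds = xr.tutorial.open_dataset('air_temperature')
# groupby returns the seasons sorted alphabetically; sortby puts them back in calendar order
season_order = xr.DataArray(['DJF', 'MAM', 'JJA', 'SON'], dims=['season'])
airtemp_seasonal = ds.groupby('time.season').mean('time').sortby(season_order)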

Thought this might help some folks who need to solve this problem.

dcherian commented 4 years ago

I use reindex instead of sortby
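
A minimal sketch of the reindex approach, assuming a small synthetic monthly series instead of the tutorial dataset (the variable names here are illustrative):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly data for one year: the values 0..11 for Jan..Dec.
time = pd.date_range("2000-01-01", periods=12, freq="MS")
da = xr.DataArray(np.arange(12.0), coords={"time": time}, dims="time")

# Seasonal means; the resulting "season" coordinate may not be in calendar order.
seasonal = da.groupby("time.season").mean("time")

# reindex selects the seasons by label, in exactly the order requested.
ordered = seasonal.reindex(season=["DJF", "MAM", "JJA", "SON"])

print(ordered.season.values)
```

Because reindex selects by label, the result does not depend on the order in which groupby returned the groups.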

max-sixty commented 1 month ago

Closing as stale; please reopen with an updated summary if relevant.

dcherian commented 1 month ago

Hehe, I'm so close to actually fixing this! :)