ENH: allow dict to name columns for multiple-aggregations in groupby

jreback commented 10 years ago

from SO:

http://stackoverflow.com/questions/22115671/apply-different-resampling-method-to-the-same-column-pandas

related #7700 (make sure to have name support)

It might be nice to support a dict for the how argument of resample/groupby.agg, so that the resulting columns of an aggregation are named after the dict keys:

df.resample('A',how={ 'mymax' : 'max', 'mymean' : 'mean'})

In [23]: df = DataFrame(np.random.randn(100,1),columns=['weight'],index=date_range('20000101',periods=100,freq='MS'))

In [24]: df.resample('A',how=['max','mean'])
Out[24]: 
              weight          
                 max      mean
2000-12-31  1.958570 -0.312230
2001-12-31  1.739518  0.035701
2002-12-31  2.503437  0.169365
2003-12-31  1.115315  0.149279
2004-12-31  2.190617 -0.087536
2005-12-31  1.286224  0.037669
2006-12-31  1.674017  0.147676
2007-12-31  2.107169 -0.064962
2008-12-31 -0.163863 -0.572363

[9 rows x 2 columns]
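For readers on a recent pandas, a minimal sketch of how the requested output naming can be obtained today. It assumes pandas 0.25 or later, where groupby aggregations accept keyword arguments as output names ("named aggregation"). Grouping by `df.index.year` is used here as a stand-in for the annual resample, and the seeded random generator is an assumption added only to make the data reproducible; neither appears in the original request.

```python
import numpy as np
import pandas as pd

# Same shape of data as in the example above: 100 month-start observations.
rng = np.random.default_rng(0)  # seed is an assumption, for reproducibility only
df = pd.DataFrame(
    rng.standard_normal((100, 1)),
    columns=["weight"],
    index=pd.date_range("20000101", periods=100, freq="MS"),
)

# Group by calendar year (a stand-in for the annual resample) and name each
# output column through the keyword used in agg().
yearly = df["weight"].groupby(df.index.year).agg(mymax="max", mymean="mean")
print(yearly)
```

The result has one row per year and columns named mymax and mymean instead of max and mean.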

jorisvandenbossche commented 10 years ago

This would be useful, but two things (and this also applies to the general groupby case):

First, this already works for a Series. In this case:

df['weight'].resample('A', how={'mymax': 'max', 'mymean': 'mean'})

Second, for the DataFrame case, I think this conflicts with the existing ability to apply different functions to different columns, where the dict keys are read as column names:

In [52]: df['col2'] = np.arange(len(df))
In [53]: df.resample('A', how={'weight': 'max', 'col2':'mean'})
Out[53]:
            col2    weight
2000-12-31   5.5  1.118113
2001-12-31  17.5  1.842229
2002-12-31  29.5  2.345190
2003-12-31  41.5  1.914983
2004-12-31  53.5  2.338382
2005-12-31  65.5  2.324127
2006-12-31  77.5  2.142181
2007-12-31  89.5  0.986439
2008-12-31  97.5  1.576487
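To make the conflict concrete, here is a hedged sketch contrasting the two readings of the keys. It assumes pandas 0.25 or later and uses a year-based groupby in place of the annual resample. The output name col2_mean and the random seed are hypothetical choices for illustration. In the dict form the keys select input columns; in the keyword form the keywords name output columns and each value is an (input column, function) pair.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed is an assumption, for reproducibility only
df = pd.DataFrame(
    {"weight": rng.standard_normal(100), "col2": np.arange(100)},
    index=pd.date_range("20000101", periods=100, freq="MS"),
)
by_year = df.groupby(df.index.year)

# Dict keys are *input* column names: one function per column.
per_column = by_year.agg({"weight": "max", "col2": "mean"})

# Keywords are *output* column names; each value is (input column, function).
named = by_year.agg(
    mymax=("weight", "max"),
    mymean=("weight", "mean"),
    col2_mean=("col2", "mean"),  # hypothetical output name
)

print(per_column.head(3))
print(named.head(3))
```

Because the keyword form spells out both the source column and the output name, it does not collide with the column-wise dict.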

jorisvandenbossche commented 7 years ago

@jreback I think this can be closed? resample now follows the groupby pattern, and the use of dictionaries there is (clearly or not clearly :-)) defined.

jreback commented 7 years ago

sure, I think this is covered.

pandas-dev / pandas #6515