Closed waylonflinn closed 9 years ago
Oooh nice! The integration fails because of this, btw: https://github.com/Blosc/bcolz/issues/251. If the tests run on your computer, it's fine ;)
One other small remark: the way it's currently set up is not exactly how I meant it to be. The idea was this (as in the documentation on the main page):
# groupby column f0, perform a sum on column f2 and keep the output column with the same name
ct.groupby(['f0'], ['f2'])
# groupby column f0, perform a sum on column f2 and rename the output column to f2_sum
ct.groupby(['f0'], [['f2', 'f2_sum']])
# groupby column f0, with a sum on f2 ('f2_sum') and a sum_na on f2 ('f2_sum_na')
ct.groupby(['f0'], [['f2', 'f2_sum', 'sum'], ['f2', 'f2_sum_na', 'sum_na']])
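So the idea is that each entry in the aggregation list takes one of the three forms shown above: just the input column, an [input column, output column] pair, or an [input column, output column, operation] triple.
With the third form you can do multiple aggregations on one column (sum, mean, max, etc.) all in one go. But the implementation is not nice like this yet :/ Also: if you don't find this a very nice method, I'm completely open to other suggestions. The idea is based a bit on classical SQL GROUP BY options.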
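To make the three forms concrete, here is a minimal sketch of how they could be normalized into one (input column, output column, operation) shape before aggregating. It assumes 'sum' is the default operation, as in the examples above. `normalize_agg_list` is a hypothetical helper written for illustration only; it is not the library's actual implementation or API.

```python
def normalize_agg_list(agg_list, default_op="sum"):
    """Turn each aggregation entry into an (input_col, output_col, op) tuple.

    Hypothetical helper: the name, signature and default are assumptions
    made for this sketch.
    """
    normalized = []
    for entry in agg_list:
        if isinstance(entry, str):
            # Form 1: 'f2' -> default operation, output keeps the input name
            normalized.append((entry, entry, default_op))
        elif len(entry) == 2:
            # Form 2: ['f2', 'f2_sum'] -> default operation, renamed output
            input_col, output_col = entry
            normalized.append((input_col, output_col, default_op))
        elif len(entry) == 3:
            # Form 3: ['f2', 'f2_sum_na', 'sum_na'] -> explicit operation
            input_col, output_col, op = entry
            normalized.append((input_col, output_col, op))
        else:
            raise ValueError("unsupported aggregation entry: %r" % (entry,))
    return normalized


# The three calls from the examples above, reduced to their aggregation lists
same_name = normalize_agg_list(['f2'])
renamed = normalize_agg_list([['f2', 'f2_sum']])
multiple = normalize_agg_list(
    [['f2', 'f2_sum', 'sum'], ['f2', 'f2_sum_na', 'sum_na']]
)
```

Normalizing up front like this would let the aggregation loop handle a single shape, so several operations on the same input column just become several tuples.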
Thanks for the heads up regarding the install bug!
I'm making a small change to make sure it works as it should with aggregation and output columns, plus adding something to your documentation (filling in the blanks that, I can understand, mystified you hehe ;)
I'm creating a PR for adding calculation of mean and standard deviation. This PR contains documentation updates and variable renames to make that process easier.