Open · zbarry opened 5 years ago
@Zsailer - what do you think about such a capability in PF?
@zbarry @ericmjl @pyjanitor-devs/core-devs how can we make this possible? is this even possible?
One way to go about this is with a `summarise` function that has a `by` parameter; all the magic can happen inside that function. This is inspired by the update to the `summarise` feature coming in dplyr 1.1, and by rdatatable's and pydatatable's use of `by`.
df.summarise(col_name = func or arg name, by = func or kwargs)
We can even make it such that you can filter within a groupby effectively (maybe?)
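To make the idea concrete, here is a minimal sketch of what such a function could look like on top of plain pandas. The name `summarise`, its signature, and the `output=(column, func)` keyword convention are assumptions for illustration only; nothing here is an existing pyjanitor or pandas_flavor API. It simply forwards to pandas named aggregation.

```python
import pandas as pd


def summarise(df, by=None, **named_aggs):
    """Hypothetical helper: aggregate `df`, optionally per group.

    Each keyword has the form output_column=(input_column, func),
    mirroring pandas named aggregation.
    """
    if by is None:
        # No grouping: reduce each requested column over the whole frame.
        return pd.DataFrame(
            {out: [df[col].agg(func)] for out, (col, func) in named_aggs.items()}
        )
    # Grouped case: delegate to groupby + named aggregation.
    return df.groupby(by).agg(**named_aggs).reset_index()


df = pd.DataFrame({"g": ["a", "a", "b"], "x": [1, 2, 3]})
out = summarise(df, by="g", total=("x", "sum"))
```

If this shape turned out to be useful, it could presumably be registered as a DataFrame method so that it reads as `df.summarise(..., by=...)`.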
It would be nice to be able to add functionality to the Pandas `GroupBy` objects: `GroupBy`, `DataFrameGroupBy`, `SeriesGroupBy`. There's no convenient accessor interface to do this, but maybe there's a way to reliably monkeypatch them. This would allow us to create nifty aggregation / apply functions and avoid the `.groupby(...).apply()` route for tasks we may encounter routinely. It could also potentially open up opportunities to speed up such operations. `...groupby().apply()` can often be slow for large numbers of groups.
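As a rough illustration of the monkeypatching route, the sketch below attaches a method directly to `DataFrameGroupBy`. The decorator `register_groupby_method` and the method `value_range` are hypothetical names made up for this example, and the import path `pandas.core.groupby` is an internal pandas location that may change between versions, so treat this as an assumption rather than a supported interface.

```python
import pandas as pd
from pandas.core.groupby import DataFrameGroupBy  # internal path, may change


def register_groupby_method(func):
    """Hypothetical decorator: attach `func` to DataFrameGroupBy under its own name."""
    setattr(DataFrameGroupBy, func.__name__, func)
    return func


@register_groupby_method
def value_range(self, column):
    """Max minus min of `column` within each group.

    Uses the built-in grouped max/min reductions instead of
    .groupby(...).apply().
    """
    grouped = self[column]
    return grouped.max() - grouped.min()


df = pd.DataFrame({"g": ["a", "a", "b"], "x": [1, 5, 3]})
result = df.groupby("g").value_range("x")
```

Because the patched method composes built-in reductions, it sidesteps the per-group Python call that `.apply()` makes, which is where the speed-up mentioned above might come from. Whether patching the class like this is reliable across pandas releases is exactly the open question.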