kieferk / dfply

dplyr-style piping operations for pandas dataframes
GNU General Public License v3.0
889 stars 103 forks

Metadata saved after group_by operation #49

Closed CedricFR closed 6 years ago

CedricFR commented 6 years ago

The goal of this pull request is to allow metadata parameters to be kept across group_by operations.

For example, in a plotting pipeline it can carry the plot configuration along until the plot is shown (in the spirit of ggplot/plotly pipelines in R).

The _group_by property could then also be registered as dataframe metadata, which would simplify how dfply copies dataframes.

More about _metadata in pandas: https://pandas.pydata.org/pandas-docs/stable/internals.html#define-original-properties
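
For readers unfamiliar with the mechanism, here is a minimal sketch of how pandas' `_metadata` works on a DataFrame subclass. It is not the code from this pull request: the class name `PlotFrame` and the attribute `plot_config` are hypothetical, chosen only to mirror the plot-pipeline use case described above.

```python
import pandas as pd


class PlotFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass that carries a plot configuration."""

    # Attribute names listed in _metadata are passed on to the result
    # by many pandas operations (those that call __finalize__).
    _metadata = ["plot_config"]

    @property
    def _constructor(self):
        # Make pandas operations return PlotFrame instead of a plain DataFrame.
        return PlotFrame


df = PlotFrame({"x": [1, 2, 3], "y": [4, 5, 6], "g": ["a", "a", "b"]})
df.plot_config = {"kind": "scatter", "title": "demo"}  # illustrative values

subset = df[["x", "y"]]  # column selection returns a new PlotFrame
copied = df.copy()       # so does an explicit copy

print(type(subset).__name__, subset.plot_config)
print(copied.plot_config)
```

Not every pandas operation propagates `_metadata`, so a pipeline that relies on it should check the specific operations it uses.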

kieferk commented 6 years ago

I like it. Thanks for this! Merging now. I will think about the _group_by metadata suggestion and see if I can come up with something clever.