vaexio / vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
https://vaex.io
MIT License

[FEATURE-REQUEST] Aggregate dict on a groupby #2272

Open vikalfc opened 1 year ago

vikalfc commented 1 year ago

Once #2032 is done, it would be useful to be able to do something like this:

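import vaex

df = vaex.datasets.titanic()


def custom_f(x, y):
    # Pack the two values of one row into a dict
    return {"name": x, "cabin": y}


# Build a column holding one dict per row
df["content"] = df.apply(custom_f, arguments=[df["name"], df["cabin"]])
df_pandas = df.to_pandas_df()

# Collect the dicts of each (pclass, survived) group into a list
df_pandas.groupby(by=["pclass", "survived"]).agg({"content": list}).reset_index()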
[Image: Screenshot 2022-11-18 at 12 28 13]
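
For reference, the aggregation being asked for can be sketched in plain Python, independent of vaex or pandas. This is only an illustration of the intended semantics, not a proposed vaex implementation: the helper name `aggregate_dicts` is hypothetical, and the sample rows are made up rather than taken from the Titanic dataset.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for the DataFrame (not real Titanic records).
rows = [
    {"pclass": 1, "survived": 1, "name": "A", "cabin": "B5"},
    {"pclass": 1, "survived": 1, "name": "B", "cabin": None},
    {"pclass": 3, "survived": 0, "name": "C", "cabin": None},
]


def aggregate_dicts(rows, by, fields):
    """Group rows by the `by` columns and collect one dict per row in each group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in by)
        # Same role as custom_f above: pack the selected fields into a dict.
        groups[key].append({f: row[f] for f in fields})
    # Emit one record per group, with the group keys as ordinary columns.
    return [
        dict(zip(by, key), content=content)
        for key, content in sorted(groups.items())
    ]


result = aggregate_dicts(rows, by=["pclass", "survived"], fields=["name", "cabin"])
```

Each record in `result` carries the group keys plus a `content` list of dicts, which is the same shape the pandas `agg({'content': list})` call produces after `reset_index()`.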