Broken pipe error when grouping by timestamp on data with ~200 million rows

The following statement works fine on a sample of rows (say 100,000), but when I run it on the whole dataset (~200 million rows) I get a broken pipe error, which I believe is due to excessive CPU and memory usage.

df2 = df.groupby(vaex.BinnerTime.per_week(df.TIMESTAMP)).agg({'index': 'count'})

The exact error is "Errno 32: Broken pipe", raised by multiple pool worker processes, for example Process ForkPoolWorker-23.

Additionally, I am seeing the error KeyError: "Unknown variables or column: 'lambda_function(__TIMESTAMP)'". This also does not occur with the sample data. Could the TIMESTAMP column itself be causing the problem?

I can work around this by splitting the data, but is there another fix that would let me process the whole dataset at once?
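
For reference, the splitting workaround can be sketched roughly as follows. This is a minimal illustration using plain NumPy rather than vaex, so it is not the statement above and not vaex's own binning: the function name weekly_counts_chunked and the chunk_size parameter are hypothetical, the timestamps are assumed to be available as a NumPy datetime64 array, and NumPy's week unit is aligned to the Unix epoch (a Thursday), which may differ from the week boundaries vaex uses.

```python
from collections import Counter

import numpy as np


def weekly_counts_chunked(timestamps, chunk_size=1_000_000):
    """Count rows per week, processing `timestamps` one chunk at a time.

    `timestamps` is assumed to be a NumPy array of dtype datetime64.
    Only one chunk is binned at a time, so peak memory depends on
    `chunk_size` rather than on the total number of rows.
    """
    totals = Counter()
    for start in range(0, len(timestamps), chunk_size):
        chunk = timestamps[start:start + chunk_size]
        # Truncate each timestamp to week resolution. NumPy weeks are
        # counted from the Unix epoch, so each bin starts on a Thursday.
        weeks = chunk.astype("datetime64[W]")
        unique_weeks, counts = np.unique(weeks, return_counts=True)
        # Merge this chunk's per-week counts into the running totals.
        for week, count in zip(unique_weeks, counts):
            totals[week] += int(count)
    # Return the totals ordered by week.
    return dict(sorted(totals.items()))
```

Because the per-week counts are simply added together, the result does not depend on where the chunk boundaries fall, which is what makes splitting the data a safe workaround for a count aggregation.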

vaexio / vaex, issue #2352