nalepae / pandarallel

A simple and efficient tool to parallelize Pandas operations on all available CPUs
https://nalepae.github.io/pandarallel
BSD 3-Clause "New" or "Revised" License

parallel_apply not working with Pandas >= 2.1 #254

Open masc-it opened 8 months ago

masc-it commented 8 months ago

General

Bug description

parallel_apply is not working when using pandas >= 2.1. In my case, I am using it after a groupby.

Observed behavior

The progress bar doesn't show up, and the processing seems to run sequentially (according to Activity Monitor).
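One way to check the "runs sequentially" observation without relying on Activity Monitor is to record which process handled each group. The following is only a rough sketch, not something from the original report: the helper `pids_per_group` and the sample DataFrame are made up for illustration, and it assumes pandarallel is installed for the parallel part.

```python
import os

import pandas as pd


def pids_per_group(grouped, method="apply"):
    """Return the set of process IDs that handled the groups.

    `method` is the name of the groupby method to call, e.g. "apply"
    or "parallel_apply" (the latter exists only after pandarallel has
    been initialized). This helper is hypothetical, for illustration.
    """
    result = getattr(grouped, method)(lambda group: os.getpid())
    return set(result.tolist())


# Illustrative data with two groups.
df = pd.DataFrame({"foo": range(200), "bar": range(200, 400)})
df["even"] = df["foo"] % 2 == 0
grouped = df.groupby("even")

# Baseline: plain pandas apply runs in the current process only.
print("apply ran in parent only:", pids_per_group(grouped) == {os.getpid()})

if __name__ == "__main__":
    try:
        from pandarallel import pandarallel
    except ImportError:
        print("pandarallel is not installed; skipping the parallel check")
    else:
        pandarallel.initialize(nb_workers=2, progress_bar=False)
        worker_pids = pids_per_group(grouped, "parallel_apply")
        # If the only PID seen is the parent's, the work was not
        # handed off to worker processes.
        print("parent pid:", os.getpid())
        print("pids seen by parallel_apply:", worker_pids)
```

If the PIDs reported for `parallel_apply` differ from the parent PID, the groups were processed in worker processes. If only the parent PID shows up, that would be consistent with the sequential behavior described above.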

perveen-shaheen commented 7 months ago

I am experiencing the exact same issue.

nalepae commented 5 months ago

Pandaral·lel is looking for a maintainer! If you are interested, please open a GitHub issue.

shermansiu commented 2 months ago

It works just fine for me on Pandas 2.1. Do you have a minimal code example to reproduce your bug?

Python: 3.10.13, Pandarallel: 1.6.5, Pandas: 2.1.0, OS: Ubuntu 22.04

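import pandas as pd
from pandarallel import pandarallel

# Start two worker processes and enable the progress bar.
pandarallel.initialize(nb_workers=2, progress_bar=True)

df = pd.DataFrame({"foo": range(200), "bar": range(200, 400)})
df["even"] = df["foo"] % 2 == 0

# The sequential and parallel group-wise results should be identical.
expected = df.groupby("even").apply(lambda x: x + 1)
actual = df.groupby("even").parallel_apply(lambda x: x + 1)
assert expected.equals(actual)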
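Since the behavior seems to depend on the exact versions involved, it may help to attach environment details like the ones listed above to any reproduction. Here is a small sketch for collecting them using only the standard library; the function name `environment_report` is made up for this example.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def environment_report(packages=("pandas", "pandarallel")):
    """Collect Python, OS, and installed package versions as a dict."""
    report = {
        "python": platform.python_version(),
        "os": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # Record missing packages instead of failing.
            report[name] = "not installed"
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Pasting this output next to a reproduction script makes it easier to compare setups where `parallel_apply` works with setups where it does not.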