AttributeError: 'DataFrameGroupBy' object has no attribute 'parallel_apply'

beyondguo commented 7 months ago

General

Operating System: ubuntu
Python version: 3.8
Pandas version: 2.0.3
Pandarallel version: 1.6.5

Acknowledgement

[ *] My issue is NOT present when using pandas alone (without pandarallel)
[ *] If I am on Windows, I read the Troubleshooting page before writing a new bug report

Bug description

sentiment_df.groupby('scode').parallel_apply(lambda x: x['f'].rolling(window=window_size, min_periods=1).apply(func, raw=True))

Observed behavior

AttributeError: 'DataFrameGroupBy' object has no attribute 'parallel_apply'

Expected behavior

Write here the expected behavior

Minimal but working code sample to ease bug fix for `pandarallel` team

nalepae commented 5 months ago

Pandaral·lel is looking for a maintainer! If you are interested, please open a GitHub issue.

shermansiu commented 2 months ago

I typed up the above code example.

import pandas as pd
import time
from pandarallel import pandarallel
import math
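import numpy as np

# parallel_apply is only attached to pandas objects once initialize() has run
pandarallel.initialize()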

df_size = int(3e7)
df = pd.DataFrame(dict(a=np.random.randint(1, 1000, df_size),
                       b=np.random.rand(df_size)))

def func(df):
    dum = 0
    for item in df.b:
        dum += math.log10(math.sqrt(math.exp(item**2)))
    return dum / len(df.b)

res_parallel = df.groupby("a").parallel_apply(func)

It works just fine for me.

Python: 3.10.13 Pandarallel: 1.6.5 Pandas: 2.1.0
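
For anyone hitting the same AttributeError: pandarallel attaches `parallel_apply` to the pandas classes only when `pandarallel.initialize()` is called, so one possible cause (an assumption, since the original report does not show its setup code) is that the call was missing or ran in a different process. Below is a minimal sketch of a check for whether the method is present, with a fallback to the serial pandas call. The helper name `has_parallel_apply` and the toy DataFrame are made up for illustration and are not part of pandarallel.

```python
import pandas as pd
from pandas.core.groupby import DataFrameGroupBy


def has_parallel_apply() -> bool:
    """Return True if DataFrameGroupBy currently has a parallel_apply attribute.

    Hypothetical helper: pandarallel adds this attribute during
    pandarallel.initialize(), so False suggests initialize() has not run
    in this interpreter.
    """
    return hasattr(DataFrameGroupBy, "parallel_apply")


# Toy data, invented for this example
df = pd.DataFrame({"a": [1, 1, 2], "b": [0.1, 0.2, 0.3]})
grouped = df.groupby("a")


def mean_b(group):
    # Per-group mean of column b
    return group["b"].mean()


if has_parallel_apply():
    # Only reached after `from pandarallel import pandarallel`
    # and `pandarallel.initialize()` have been executed
    res = grouped.parallel_apply(mean_b)
else:
    # Serial fallback using plain pandas
    res = grouped.apply(mean_b)

print(res)
```

If the check returns False right before the failing line, adding `pandarallel.initialize()` after the imports is the first thing to try.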
