unionai-oss / pandera

A light-weight, flexible, and expressive statistical data testing library
https://www.union.ai/pandera
MIT License

@pa.check gives pickle error when used along with pandarallel #1217

Open ragrawal opened 1 year ago

ragrawal commented 1 year ago

Describe the bug

Adding the @pa.check decorator raises a 'cannot pickle classmethod object' error. The code below works fine without the @pa.check decorator but fails when it is enabled.

Code Sample, a copy-pastable example

import pandera as pa
import pandera.typing as pat
import pandas as pd
import dill
from random import randint
from pandarallel import pandarallel
from tqdm import tqdm

tqdm.pandas()

pandarallel.initialize(progress_bar=True, use_memory_fs=False)

class InputSchema(pa.SeriesSchema):
    a: list[str]
    b: list[str]
    c: list[str]

@pa.check
def xyz(row: pat.DataFrame[InputSchema]) -> int:
    return randint(0, 5)

df = pd.DataFrame({
    'a': [['a', 'a'], ['b', 'b']],
    'b': [['a', 'a'], ['b', 'b']],
    'c': [['a', 'a'], ['b', 'b']],
})
out_df = df.parallel_apply(lambda x: xyz(x), axis=1)
print(out_df)

cosmicBboy commented 1 year ago

What's the @pa.check decorator doing there? Did you mean to use @pa.check_types? @pa.check is a method decorator for methods of DataFrameModel subclasses.
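
The error text itself can be reproduced without pandera or pandarallel. The sketch below assumes, going only by the message in the report and by @pa.check being a method decorator, that a classmethod object ends up among the values that must be serialized between worker processes. It does not inspect pandera internals; it only shows that Python's pickle module rejects a bare classmethod object. The names `can_pickle` and `row_score` are made up for illustration.

```python
import pickle


def can_pickle(obj) -> bool:
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError, AttributeError):
        return False


def row_score(row):
    # Hypothetical stand-in for the function in the report.
    return len(row)


# Wrapping a function in classmethod outside a class body yields a bare
# classmethod object rather than an ordinary function.
wrapped = classmethod(row_score)

print(can_pickle([1, 2, 3]))  # ordinary data such as a list of ints
print(can_pickle(wrapped))    # the bare classmethod object
```

If the goal is to validate a plain function's DataFrame arguments, @pa.check_types, as suggested in the comment above, is the decorator intended for that case.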