unionai-oss / pandera

A light-weight, flexible, and expressive statistical data testing library
https://www.union.ai/pandera
MIT License

Piping pandas with pandera schema doesn't raise SchemaError (Python 3.11.9) #1582

Closed by RevengeComing 1 month ago

RevengeComing commented 1 month ago

Describe the bug

On the newly released Python version (3.11.9), pandas.DataFrame.pipe with pandera schema validation doesn't work as expected: no SchemaError is raised for an invalid DataFrame.

Honestly, I don't know if it is Pandas' error or Pandera's.


Code Sample, a copy-pastable example

from datetime import date, timedelta
import pandas as pd
import pandera as pa

from pandera.typing import DateTime, Series

class TestSchema(pa.SchemaModel):
    date_field: Series[DateTime] = pa.Field(coerce=True)
    str_field: Series[str] = pa.Field()
    int_field: Series[int] = pa.Field(ge=0, coerce=True)

df = pd.DataFrame.from_dict(
    {
        "date_field": [
            date.today(),
            date.today() - timedelta(days=1),
            date.today() - timedelta(days=2),
        ],
        "str_field": ["1", "2", "3"],
    }
)
# "int_field" is missing from df, so validation is expected to raise pa.errors.SchemaError
df = df.pipe(pa.typing.DataFrame[TestSchema])

Expected behavior

I am expecting df.pipe(pa.typing.DataFrame[TestSchema]) to raise pa.errors.SchemaError.


cosmicBboy commented 1 month ago

This was fixed in https://github.com/unionai-oss/pandera/pull/1561, see https://github.com/unionai-oss/pandera/issues/1559.

The next release is coming out in a few weeks; the fix should be available in the next beta release in a few days.