unionai-oss / pandera

A light-weight, flexible, and expressive statistical data testing library
https://www.union.ai/pandera
MIT License

`pandera.check_types` causes `ValueError: no results` for functions called by `pandas.DataFrame.apply` #1021

Closed: bryngarrod-habitatenergy closed this issue 1 year ago

bryngarrod-habitatenergy commented 1 year ago

If the `@pandera.check_types` decorator is applied to a function that is then passed to `pandas.DataFrame.apply`, the call raises `ValueError: no results`.

Code Sample

import pandas as pd
import pandera as pa

class SimpleSchema(pa.SchemaModel):
    x: pa.typing.Series[str] = pa.Field(description="String field")
    y: pa.typing.Series[int] = pa.Field(description="Integer field")

@pa.check_types
def series_to_frame(df_row: pd.Series) -> pa.typing.DataFrame[SimpleSchema]:
    output_row = pd.DataFrame({"x": [df_row.x], "y": [df_row.y]})
    return output_row

if __name__ == "__main__":
    simple_df = pd.DataFrame({"x": ["a", "b", "c"], "y": [1, 2, 3]})

    version1 = pd.concat([series_to_frame(row) for _, row in simple_df.iterrows()])
    print("Version 1:")
    print(version1)

    version2 = pd.concat(simple_df.apply(series_to_frame, axis=1).values.tolist())
    print("Version 2:")
    print(version2)

    print("Versions 1 and 2 equal:")
    print(version1.equals(version2))

Expected behaviour

It should run without errors and the final check should return True.

Currently, only Version 1 (the `iterrows` variant) runs without errors; Version 2 (the `apply` variant) raises `ValueError: no results`. If `@pa.check_types` is removed, the whole script runs without errors and the final check returns True.

cosmicBboy commented 1 year ago

hey @bryngarrod-habitatenergy fun bug! Basically `@check_types` uses the `wrapt` library, which returns a `FunctionWrapper` that somehow defines an `__iter__` attribute. This confuses pandas into thinking it has been given a list of aggregation functions rather than a single function.
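To illustrate the mechanism, here is a minimal standard-library sketch of how a callable wrapper that merely defines `__iter__` can pass a generic "is this iterable?" check even though iterating it fails. `ProxyLike` and `double` are hypothetical stand-ins invented for this example; this is not wrapt's actual code, and the check pandas performs internally may differ in its details.

```python
from collections.abc import Iterable


class ProxyLike:
    """Hypothetical stand-in for a function wrapper (not wrapt's real code)."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __call__(self, *args, **kwargs):
        # Forward the call to the wrapped function.
        return self._wrapped(*args, **kwargs)

    def __iter__(self):
        # Delegate iteration to the wrapped object. For a plain function
        # this raises TypeError, but the method still exists on the class.
        return iter(self._wrapped)


def double(value):
    return value * 2


wrapped_double = ProxyLike(double)

# A plain function has no __iter__, so it is not considered iterable.
plain_is_iterable = isinstance(double, Iterable)

# The wrapper's class defines __iter__, so the same check succeeds.
proxy_is_iterable = isinstance(wrapped_double, Iterable)

# Actually iterating the wrapper still fails.
try:
    iter(wrapped_double)
    proxy_iterates = True
except TypeError:
    proxy_iterates = False

print(plain_is_iterable, proxy_is_iterable, proxy_iterates)
```

A caller that decides between "single function" and "list of functions" by looking for `__iter__` would therefore take the list branch for the wrapper, which is consistent with the behaviour described above.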

#1022 should address this issue... would you mind adding a regression test to this PR? Basically the code snippet above should do.

Test case should probably live in this file: https://github.com/unionai-oss/pandera/blob/main/tests/core/test_decorators.py