unionai-oss / pandera

A light-weight, flexible, and expressive statistical data testing library
https://www.union.ai/pandera
MIT License

SchemaModel ignores coerce=True in initialiser constructor #769

Closed by m-richards 2 years ago

m-richards commented 2 years ago

Describe the bug

In a SchemaModel without a Config class (or with coerce=False), a dtype mismatch raises a validation error when the DataFrame is constructed. With coerce=True, no validation error is raised, but the dtypes are not coerced either:

Code Sample, a copy-pastable example

This is adapted from the SchemaModel Validate on Initialization example.

import pandera as pa
from pandera.typing import DataFrame, Series

class Schema(pa.SchemaModel):
    state: Series[str]
    city: Series[str]
    price: Series[float] = pa.Field(in_range={"min_value": 5, "max_value": 20})

    class Config:
        # With coerce = False, this example raises pandera.errors.SchemaError:
        #     expected series 'price' to have type float64, got int64
        coerce = True

df = DataFrame[Schema](
    {
        'state': ['NY','FL','GA','CA'],
        'city': ['New York', 'Miami', 'Atlanta', 'San Francisco'],
        'price': [8, 12, 10, 16],
    }
)
print(pa.__version__)
print(df.dtypes, "\n")

Schema.validate(df, inplace=True)
print("This is what I would expect to see")
print(df.dtypes)
###################
0.9.0
state    object
city     object
price     int64
dtype: object 

This is what I would expect to see
state     object
city      object
price    float64
dtype: object

Expected behavior

Dtype coercion should happen when the DataFrame is constructed, without an additional call to validate.

Additional context

I think this could be fixed by reading the SchemaModel config and passing coerce through here: https://github.com/pandera-dev/pandera/blob/f36cc9befc1f83a22cf08a725285cf388e3a04fb/pandera/typing/common.py#L167-L169. I haven't yet checked whether that has flow-on effects.
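
To illustrate the idea only, here is a toy sketch in plain Python. It is not pandera's code: ToyModel, construct, and the dict-of-lists "frame" are all hypothetical stand-ins, and it assumes the fix amounts to the constructor reading the model's coerce flag and keeping the coerced result.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Columns = Dict[str, List[Any]]


@dataclass
class ToyModel:
    """Hypothetical stand-in for a schema model; not a pandera class."""

    dtypes: Dict[str, type]  # column name -> expected Python type
    coerce: bool = False     # plays the role of Config.coerce

    def validate(self, data: Columns, coerce: bool) -> Columns:
        """Return checked columns, casting values first when coerce is True."""
        out: Columns = {}
        for name, target in self.dtypes.items():
            values = [target(v) for v in data[name]] if coerce else list(data[name])
            bad = [v for v in values if type(v) is not target]
            if bad:
                raise TypeError(
                    f"column {name!r}: expected {target.__name__}, "
                    f"got {type(bad[0]).__name__}"
                )
            out[name] = values
        return out


def construct(model: ToyModel, data: Columns) -> Columns:
    """Hypothetical constructor hook: pass the model's coerce flag through
    and return the coerced result rather than the raw input."""
    return model.validate(data, coerce=model.coerce)


model = ToyModel(dtypes={"city": str, "price": float}, coerce=True)
frame = construct(model, {"city": ["Miami", "Atlanta"], "price": [8, 12]})
print(frame["price"])  # [8.0, 12.0]
```

With coerce=False the same input raises a TypeError for the price column, which mirrors the SchemaError described in the report.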

cosmicBboy commented 2 years ago

Thanks for filing this bug, @m-richards!

https://github.com/pandera-dev/pandera/pull/772 should address this. Mind giving it a quick glance? I basically used your code snippet in the test case.

cosmicBboy commented 2 years ago

fixed by #772