Describe the bug
In a SchemaModel with no Config class, or with coerce=False, a dtype mismatch raises a validation error.
With coerce=True, no validation error is raised, but the dtypes are not coerced either:
Code Sample, a copy-pastable example
This is adapted from the SchemaModel Validate on Initialization example.
import pandera as pa
from pandera.typing import DataFrame, Series


class Schema(pa.SchemaModel):
    state: Series[str]
    city: Series[str]
    price: Series[float] = pa.Field(in_range={"min_value": 5, "max_value": 20})

    class Config:
        coerce = True  # with this False, get pandera.errors.SchemaError:
        # expected series 'price' to have type float64, got int64


df = DataFrame[Schema](
    {
        'state': ['NY', 'FL', 'GA', 'CA'],
        'city': ['New York', 'Miami', 'Atlanta', 'San Francisco'],
        'price': [8, 12, 10, 16],
    }
)

print(pa.__version__)
print(df.dtypes, "\n")

Schema.validate(df, inplace=True)
print("This is what I would expect to see")
print(df.dtypes)
###################
0.9.0
state     object
city      object
price      int64
dtype: object

This is what I would expect to see
state      object
city       object
price     float64
dtype: object
Expected behavior
Dtype coercion should actually happen when the DataFrame is initialized, without an additional call to validate.
Additional context
I think this could be fixed by reading the SchemaModel config and passing coerce through here: https://github.com/pandera-dev/pandera/blob/f36cc9befc1f83a22cf08a725285cf388e3a04fb/pandera/typing/common.py#L167-L169. I haven't yet checked whether that has flow-on effects.
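To make the idea concrete, here is a rough, library-free sketch of what "passing coerce through" might look like. Nothing in it is pandera code: ToySchema, validate, and init_with_schema are hypothetical stand-ins, and plain dicts of lists stand in for DataFrames. It only illustrates the initialization path reading the schema's Config and forwarding the flag instead of dropping it.

```python
from typing import Any, Dict, List


class ToySchema:
    """Hypothetical stand-in for a schema model: column name -> target type."""

    columns = {"price": float}

    class Config:
        coerce = True


def validate(schema, data: Dict[str, List[Any]], coerce: bool = False):
    """Check each schema column; convert values first when coerce is True."""
    result = dict(data)  # shallow copy so the input mapping is left untouched
    for name, target in schema.columns.items():
        values = data[name]
        if coerce:
            values = [target(v) for v in values]
        elif not all(isinstance(v, target) for v in values):
            raise TypeError(f"column {name!r}: expected {target.__name__}")
        result[name] = values
    return result


def init_with_schema(schema, data: Dict[str, List[Any]]):
    """Validate on initialization, forwarding the schema's own coerce setting."""
    config = getattr(schema, "Config", None)
    coerce = getattr(config, "coerce", False)  # default mirrors "no Config class"
    return validate(schema, data, coerce=coerce)


frame = init_with_schema(ToySchema, {"price": [8, 12, 10, 16]})
print(frame["price"])  # [8.0, 12.0, 10.0, 16.0]
```

In this toy version the integers come back as floats because the flag reaches validate. Whether the real change is this simple depends on the flow-on effects mentioned above.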