unionai-oss / pandera

A light-weight, flexible, and expressive statistical data testing library
https://www.union.ai/pandera
MIT License

pass date format for coerce date #771

Closed telferm57 closed 2 years ago

telferm57 commented 2 years ago

Hi,

Is it possible to pass a date format to the date coercion function? For example, I would like to specify format='%d/%m/%Y' or dayfirst=True in the DataFrameSchema.

jeffzi commented 2 years ago

Hi @telferm57

There is an undocumented pandas-specific DateTime type that accepts coercion parameters via its to_datetime_kwargs argument; these are then forwarded to pandas.to_datetime.

@cosmicBboy I can submit a PR to add it to the documentation under Pandas-specific Dtypes.

import pandas as pd
import pandera as pa
from pandera.engines import pandas_engine
from pandera.typing import Series

schema = pa.DataFrameSchema(
    {
        "dt": pa.Column(
            pandas_engine.DateTime(to_datetime_kwargs={"format": "%d/%m/%Y"})
        )
    },
    coerce=True,
)

df = pd.DataFrame({"dt": ["28/02/2022"]})
schema.validate(df)
#>           dt
#> 0 2022-02-28

# Model API
class Schema(pa.SchemaModel):
    dt: Series[pandas_engine.DateTime] = pa.Field(
        dtype_kwargs={"to_datetime_kwargs": {"format": "%d/%m/%Y"}}
    )

    class Config:
        coerce = True

Schema.validate(df)
#>           dt
#> 0 2022-02-28
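
Since to_datetime_kwargs is forwarded to pandas.to_datetime, it may help to see what the two options from the question do when that function is called directly. This is a minimal sketch using only pandas; the sample strings and variable names are illustrative and not taken from the thread.

```python
import pandas as pd

# Illustrative day-first date strings (hypothetical sample data).
raw = pd.Series(["28/02/2022", "03/04/2022"])

# Explicit format: %d = day, %m = month, %Y = four-digit year.
# Note that upper-case %M means minutes, not month.
with_format = pd.to_datetime(raw, format="%d/%m/%Y")

# dayfirst=True asks the parser to read ambiguous dates as day/month.
with_dayfirst = pd.to_datetime(raw, dayfirst=True)

print(with_format.dt.strftime("%Y-%m-%d").tolist())
print(with_dayfirst.dt.strftime("%Y-%m-%d").tolist())
```

Both calls parse "03/04/2022" as 3 April rather than 4 March. Given the forwarding described above, passing {"dayfirst": True} as to_datetime_kwargs should behave the same way during coercion, though that variant is not shown in this thread.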

cosmicBboy commented 2 years ago

> @cosmicBboy I can submit a PR to add it to the documentation under Pandas-specific Dtypes.

This would be awesome! Feel free to repurpose this issue and link your PR to it.

telferm57 commented 2 years ago

Wow, that's what I call a quick response! Thank you!

cosmicBboy commented 2 years ago

Fixed by #780.