Description
Pandera can be used not only to validate a DataFrame but also to convert its dtypes according to the schema.
When coercion is enabled in the schema, the schema.validate function returns the validated DataFrame with the converted dtypes. We can replace the input DataFrame with the validated one, so that the nodes receive a DataFrame that is both validated and converted according to the schema.
Context
Possible Implementation
Add a configuration parameter that lets each dataset define whether it should only be validated or also converted. If a dataset is configured for conversion, the hook can forward the converted dataset to the node.
A global parameter can set the default behaviour for all datasets that use a Pandera schema.
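The proposal above could be sketched roughly as follows. Everything here is hypothetical: the `convert` option name, the `DEFAULT_CONVERT` constant, and both helper functions are illustrative, not part of any existing API. The sketch also assumes the hook is a Kedro `before_node_run` hook, which may return a dictionary mapping dataset names to new values that replace the node inputs.

```python
from typing import Any, Dict, Optional

# Hypothetical global default: validate only, do not replace node inputs.
DEFAULT_CONVERT = False


def should_convert(
    dataset_options: Dict[str, Any], global_default: bool = DEFAULT_CONVERT
) -> bool:
    """A per-dataset `convert` option wins; otherwise use the global default."""
    return bool(dataset_options.get("convert", global_default))


def validate_inputs(
    inputs: Dict[str, Any],
    schemas: Dict[str, Any],
    options: Dict[str, Dict[str, Any]],
    global_default: bool = DEFAULT_CONVERT,
) -> Optional[Dict[str, Any]]:
    """Validate every input that has a schema.

    Returns the converted inputs that should be forwarded to the node,
    or None when nothing needs to be replaced.
    """
    converted: Dict[str, Any] = {}
    for name, data in inputs.items():
        schema = schemas.get(name)
        if schema is None:
            continue  # no schema attached to this dataset
        validated = schema.validate(data)  # raises if validation fails
        if should_convert(options.get(name, {}), global_default):
            converted[name] = validated
    return converted or None
```

A hook implementation could call `validate_inputs` and return its result directly. Datasets configured for validation only are still checked, but the node keeps receiving its original input.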