m3dev / gokart

Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for machine learning pipelines.
https://gokart.readthedocs.io/en/latest/
MIT License

Draft: Dump with Type Check #360

Open dn070017 opened 6 months ago

dn070017 commented 6 months ago

Add a new property, `expected_output_dataframe_type`, to `gokart.TaskOnKart` that lets users provide a Pandera schema. If `expected_output_dataframe_type` is set, the cached dataframe is automatically validated against that schema both when it is stored (`dump`) and when it is loaded (`load`). See `test/test_task_on_kart.py` for example usage.
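For readers who want the idea without opening the diff, here is a minimal sketch of the validate-on-`dump`/`load` behaviour described above. It is not the code in this PR and does not use gokart or Pandera: `CheckedTask`, `ColumnsSchema`, and `UserTask` are hypothetical stand-ins, and rows are modelled as a list of dicts rather than a pandas dataframe. The only assumption carried over is that the schema object exposes a `validate(data)` method that returns the data or raises, which is the shape of Pandera's `DataFrameSchema.validate`.

```python
from typing import Any, Dict, List, Optional, Protocol

Rows = List[Dict[str, Any]]  # stand-in for a pandas DataFrame


class Schema(Protocol):
    """Anything with a Pandera-style validate(): returns the data or raises."""

    def validate(self, data: Any) -> Any: ...


class ColumnsSchema:
    """Hypothetical stand-in for a Pandera schema: checks column names and types."""

    def __init__(self, columns: Dict[str, type]) -> None:
        self.columns = columns

    def validate(self, data: Rows) -> Rows:
        for row in data:
            for name, expected in self.columns.items():
                if name not in row:
                    raise ValueError(f"missing column {name!r}")
                if not isinstance(row[name], expected):
                    raise TypeError(f"column {name!r} is not {expected.__name__}")
        return data


class CheckedTask:
    """Hypothetical task base class (NOT gokart.TaskOnKart)."""

    # Subclasses set this to opt in; None disables the check.
    expected_output_dataframe_type: Optional[Schema] = None

    def __init__(self) -> None:
        self._cache: Any = None  # in-memory stand-in for the task's cached output

    def _check(self, data: Any) -> Any:
        schema = self.expected_output_dataframe_type
        return data if schema is None else schema.validate(data)

    def dump(self, data: Any) -> None:
        # Validate before storing, so invalid output never reaches the cache.
        self._cache = self._check(data)

    def load(self) -> Any:
        # Validate again on read, in case the cache predates the schema.
        return self._check(self._cache)


class UserTask(CheckedTask):
    expected_output_dataframe_type = ColumnsSchema({"user_id": int, "name": str})
```

In this sketch, `UserTask().dump([{"user_id": "x", "name": "a"}])` raises `TypeError` instead of caching the bad rows, and a task that leaves the property as `None` behaves exactly as before.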

[must] fix error from mypy

hirosassa commented 6 months ago

@dn070017 Please check mypy failure.