lvgig / tubular

Python package implementing transformers for pre-processing steps for machine learning.
https://tubular.readthedocs.io/en/latest/index.html
BSD 3-Clause "New" or "Revised" License

[Bug]: DatetimeInfoExtractor prevents pipeline from being saved using joblib or pickle #257

Closed: Decima2014 closed this issue 3 months ago

Decima2014 commented 3 months ago

What happened?

The DatetimeInfoExtractor stores the mappings_provided attribute as a dict_keys view (the object returned by calling .keys() on a dict). If you try to save a pipeline object that contains this transformer as a step, using joblib or pickle, you get the following error:

"TypeError: cannot pickle 'dict_keys' object".

This attribute should be stored as a list instead. We should probably also add a test that checks our transformers can be saved as a pickle file.
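A minimal sketch of the failure and the proposed fix, for illustration only. It does not use tubular: `SimpleNamespace` is a hypothetical stand-in for a transformer's attributes, and the `mappings` contents are made up. It only shows that an object holding a `dict_keys` view cannot be pickled, while the same object holding a list can.

```python
import pickle
from types import SimpleNamespace

# Made-up mappings; the real DatetimeInfoExtractor mappings may look different.
mappings = {
    "timeofday": {"night": [0, 1, 2, 3, 4, 5]},
    "dayofweek": {"weekend": [5, 6]},
}

# Stand-in objects (assumption: not tubular code).
broken = SimpleNamespace(mappings_provided=mappings.keys())       # dict_keys view
fixed = SimpleNamespace(mappings_provided=list(mappings.keys()))  # plain list

try:
    pickle.dumps(broken)
    error_message = None
except TypeError as err:
    # pickle refuses to serialise the dict_keys view
    error_message = str(err)

# The list-based version survives a dump/load round trip.
restored = pickle.loads(pickle.dumps(fixed))

print(error_message)
print(restored.mappings_provided)
```

Converting with `list(...)` also takes a snapshot of the keys, so later changes to the original dict no longer show up in the stored attribute.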

Environment

scikit-learn == 1.1.3
Python == 3.9.13
joblib == 1.3.2
tubular == 1.1.0

Minimum reproducible code

No response

Relevant error output

TypeError: cannot pickle 'dict_keys' object

davidhopkinson26 commented 3 months ago

Thanks for raising this issue. Switching this to a list should be a simple fix, and I agree that a test checking that the objects can be pickled should be added to the generic tests.

The test class base_tests.OtherBaseBehaviourTests was created for this purpose and should be inherited by all transformer test modules.
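As a rough idea of what such a generic check could look like, here is a small sketch. The helper name `assert_pickle_round_trip` is hypothetical and this is not the actual content of `base_tests.OtherBaseBehaviourTests`; it only assumes that a transformer instance can be passed in as a plain Python object.

```python
import pickle


def assert_pickle_round_trip(obj):
    """Hypothetical helper: raise if obj cannot be pickled and restored.

    pickle.dumps raises (e.g. TypeError) when an attribute such as a
    dict_keys view is not picklable, so the test fails at that point.
    """
    restored = pickle.loads(pickle.dumps(obj))
    # Basic sanity check that we got the same kind of object back.
    assert type(restored) is type(obj)
    return restored
```

A transformer test could then build an instance and call `assert_pickle_round_trip(transformer)`. The same pattern should work with joblib in place of pickle, since the issue reports the same error from both.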