Open AnotherSamWilson opened 2 years ago
After some experimenting, I see that casting the column's dtype to `object` fixes this. Still, it's probably a good idea to let whichever categorical imputer the user selects pick up categorical dtypes as well.
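For anyone hitting the same thing, here is a minimal sketch of that workaround using only pandas. The frame and the column names (`feature`, `target`) are made up for illustration and are not the data from this issue. The imputer itself is left out, since the only point is the dtype cast.

```python
import pandas as pd

# Hypothetical data: a numeric column and a categorical column with one missing value.
df = pd.DataFrame({
    "feature": [1.0, 2.0, 3.0, 4.0],
    "target": pd.Categorical(["a", "b", None, "a"]),
})

# Workaround: cast the categorical column to object before handing the frame
# to the imputer. The missing value stays missing after the cast.
df_obj = df.copy()
df_obj["target"] = df_obj["target"].astype(object)

print(df["target"].dtype)      # dtype before the cast
print(df_obj["target"].dtype)  # dtype after the cast
```

The cast drops the category metadata (levels and ordering), so if that matters downstream it would need to be restored after imputation.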
Agreed that `pd.Categorical` should be handled. I'd assume in this case we'd treat them the same as `objects`, for which we implicitly assume categories (generally the `objects` are `str`). Let me know if you see any issue with that or if you'd expect the categorical type to be handled differently.
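To make the proposal concrete, here is a rough sketch of what "treat categoricals the same as objects" could look like at the column-detection step. The helper `categorical_columns` is hypothetical and is not the library's actual code. It only shows one way to select both dtypes with pandas.

```python
import pandas as pd
from pandas.api.types import is_object_dtype


def categorical_columns(df):
    """Return names of columns to treat as categorical (hypothetical helper).

    A column qualifies if its dtype is object or a pandas categorical dtype.
    """
    return [
        name
        for name in df.columns
        if is_object_dtype(df[name]) or isinstance(df[name].dtype, pd.CategoricalDtype)
    ]


# Illustrative frame: one numeric, one object, and one categorical column.
example = pd.DataFrame({
    "num": [1.0, 2.0, 3.0],
    "label_obj": pd.Series(["x", "y", None], dtype=object),
    "label_cat": pd.Categorical(["u", None, "v"]),
})

print(categorical_columns(example))
```

With a check like this, a `category` column would be routed to the same imputer as an `object` column, which is the behaviour suggested above.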
See the following example:
The `target` column is not imputed in the transform. I do see that the default value for strategy is to use the "predictive default" imputer, which ends up being PMM for numeric columns and multinomial logistic for categorical columns. I would think that categories would be imputed by default. Is there a bug, or some setting I am not aware of?