Closed MO105 closed 4 years ago
@MO105 A couple of notes:

- `MultipleImputer` returns a generator. You can get a list instead by setting the `return_list` argument: `return_list=True`. There's an example in the README.
- `median`: median imputation is idempotent (every run fills in the same value), so there is no need to use `MultipleImputer`, which will do more work than `SingleImputer`.
- `sklearn` pipelines can be tricky. It works if you are imputing with univariate methods, but it may not work if you're doing something like `MICE`, because we'd use the target (`y`) in the imputation process. The target is separated out in `sklearn` pipelines.

Closing this for now, as I believe this answers your questions. Let me know if you have any others!
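The first two notes can be illustrated with a small standard-library sketch. It does not call Autoimpute itself: `impute_median` and `multiple_impute` are hypothetical stand-ins, and the `(index, imputed_data)` shape of the yielded items is an assumption made for illustration. The point is only that a lazily produced sequence has to be materialized before it behaves like a list, and that repeating a median imputation adds nothing new.

```python
from statistics import median


def impute_median(column):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def multiple_impute(column, n=3):
    """Hypothetical stand-in for a multiple imputer.

    Lazily yields (index, imputed_column) pairs; this shape is an
    assumption for illustration, not the library's documented output.
    """
    for i in range(1, n + 1):
        yield i, impute_median(column)


column = [1.0, None, 3.0, 10.0, None]

# A generator produces nothing until it is consumed; list() materializes it,
# which is roughly what asking for a list up front would give you.
imputations = list(multiple_impute(column, n=3))

# Median imputation is deterministic, so all n imputations are identical.
first = imputations[0][1]
all_identical = all(imputed == first for _, imputed in imputations)

# A single pass gives the same answer with a third of the work,
# and re-imputing an already complete column changes nothing.
single = impute_median(column)
unchanged = impute_median(single) == single
```

With a deterministic strategy such as the median, the single pass and each of the multiple passes agree, which is why a single imputer is enough here.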
I wanted to integrate Autoimpute with an sklearn pipeline, as seen here:
https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer_mixed_types.html

However, I get the error `AttributeError: 'generator' object has no attribute 'size'` when I try to substitute `SimpleImputer(strategy='median')` with `MultipleImputer()`.

Edit: from what I can understand, sklearn's `imputer.fit_transform(X)` returns an array, whereas Autoimpute returns a generator object, which can then be fed into a DataFrame.

Thanks
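The error message itself can be reproduced without either library, which may help show where it comes from. In this minimal sketch, `lazy_results` is a hypothetical stand-in for any transformer output that is a generator rather than an array; code downstream that reads an array attribute such as `.size` then fails in the same way.

```python
def lazy_results():
    """Hypothetical stand-in for a transform step that yields its results lazily."""
    yield [1.0, 2.0, 3.0]


output = lazy_results()

# Array-like consumers often inspect attributes such as .size;
# a generator object has no such attribute.
try:
    output.size
except AttributeError as exc:
    message = str(exc)

# Materializing the generator gives a concrete object to work with.
materialized = list(output)
```

Here `message` holds the text of the raised `AttributeError`, and `materialized` is an ordinary list that can be converted to whatever container the next step expects.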