iskandr / fancyimpute

Multivariate imputation and matrix completion algorithms implemented in Python
Apache License 2.0

[MICE] ValueError: Must have equal len keys and value when setting with an iterable #39

Closed: joshjacobson closed this issue 7 years ago

joshjacobson commented 7 years ago
from fancyimpute import MICE
X_filled_mice = MICE().complete(X_incomplete)
[MICE] Completing matrix with shape (902, 368)
[MICE] Starting imputation round 1/110, elapsed time 0.009
....
[MICE] Starting imputation round 110/110, elapsed time 1079.970
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-77-b8ec27551960> in <module>()
      3 
      4 
----> 5 X_filled_mice = MICE().complete(X_incomplete)

C:\Users\__\Anaconda3\lib\site-packages\fancyimpute\mice.py in complete(self, X)
    335         # average the imputed values for each feature
    336         average_imputated_values = imputed_arrays.mean(axis=0)
--> 337         X_completed[missing_mask] = average_imputated_values
    338         return X_completed

C:\Users\__\Anaconda3\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
   2324 
   2325         if isinstance(key, (Series, np.ndarray, list, Index)):
-> 2326             self._setitem_array(key, value)
   2327         elif isinstance(key, DataFrame):
   2328             self._setitem_frame(key, value)

C:\Users\__\Anaconda3\lib\site-packages\pandas\core\frame.py in _setitem_array(self, key, value)
   2344             indexer = key.nonzero()[0]
   2345             self._check_setitem_copy()
-> 2346             self.loc._setitem_with_indexer(indexer, value)
   2347         else:
   2348             if isinstance(value, DataFrame):

C:\Users\__\Anaconda3\lib\site-packages\pandas\core\indexing.py in _setitem_with_indexer(self, indexer, value)
    577 
    578                     if len(labels) != len(value):
--> 579                         raise ValueError('Must have equal len keys and value '
    580                                          'when setting with an iterable')
    581 

ValueError: Must have equal len keys and value when setting with an iterable

sergeyf commented 7 years ago

Any chance you could upload some example data so I could reproduce this on my side?

joshjacobson commented 7 years ago

Yep, unzip the attached file Vote_Selling_Survey_Data.dta.zip or download it from here: https://www.aeaweb.org/aer/data/10505/P2015_1033_data.zip

And then this code will raise the error:

import pandas as pd
from fancyimpute import MICE
X_incomplete = pd.read_stata('Vote_Selling_Survey_Data.dta')
X_incomplete = pd.get_dummies(X_incomplete)
X_filled_mice = MICE(n_imputations=1).complete(X_incomplete)

sergeyf commented 7 years ago

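Oh, it's that thing where we expect a numpy array, not a pandas DataFrame (the traceback shows the boolean-mask assignment going through pandas' __setitem__). Just do this instead: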

X_filled_mice = MICE(n_imputations=1).complete(X_incomplete.values)

I'll change it in the code so it converts all of the inputs into numpy arrays.
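
As a rough sketch of what such a conversion could look like (this is not the actual fancyimpute change: `as_float_array` is a hypothetical helper, the toy DataFrame is made up, and the column-mean fill is only a stand-in for the values MICE would impute):

```python
import numpy as np
import pandas as pd


def as_float_array(X):
    """Hypothetical helper: copy a DataFrame or array-like into a float ndarray."""
    return np.array(X, dtype=float)


# Toy input with missing entries (assumed data, not the survey file above).
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

X = as_float_array(df)
missing_mask = np.isnan(X)

# Stand-in for imputed values: one value per missing cell, in row-major order.
col_means = np.nanmean(X, axis=0)
missing_cols = np.nonzero(missing_mask)[1]
fill_values = col_means[missing_cols]

# On a plain ndarray, boolean-mask assignment accepts a 1-D array with one
# entry per True cell, which is the assignment that failed on the DataFrame.
X_completed = X.copy()
X_completed[missing_mask] = fill_values
```

Converting once at the entry point keeps every later mask assignment on NumPy semantics, so callers can pass either a DataFrame or an array.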