HealthCatalyst / healthcareai-py

Python tools for healthcare machine learning
http://healthcare.ai
MIT License

Models trained with no grain column fail on making predictions #460

Closed: Aylr closed this issue 6 years ago

Aylr commented 6 years ago

Implementation

Error

ValueError                                Traceback (most recent call last)
<ipython-input-27-d944250ad825> in <module>()
----> 1 rf.make_predictions(df)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\healthcareai\trained_models\trained_supervised_model.py in make_predictions(self, dataframe)
    183 
    184         result = pd.DataFrame({
--> 185             self.grain_column: dataframe[self.grain_column].values,
    186             'Prediction': None,
    187             'Probability': None,

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   1962             return self._getitem_multilevel(key)
   1963         else:
-> 1964             return self._getitem_column(key)
   1965 
   1966     def _getitem_column(self, key):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
   1969         # get column
   1970         if self.columns.is_unique:
-> 1971             return self._get_item_cache(key)
   1972 
   1973         # duplicate columns & possible reduce dimensionality

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
   1643         res = cache.get(item)
   1644         if res is None:
-> 1645             values = self._data.get(item)
   1646             res = self._box_item_values(item, values)
   1647             cache[item] = res

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\internals.py in get(self, item, fastpath)
   3597                         loc = indexer.item()
   3598                     else:
-> 3599                         raise ValueError("cannot label index with a null key")
   3600 
   3601             return self.iget(loc, fastpath=fastpath)

ValueError: cannot label index with a null key

danwellisch1 commented 6 years ago

Taylor:

See Pull Request: Grain column vector must exist in order to add it to results. #453

This may fix your issue. I ran into the same error while working on the issue to make the grain column optional.

Dan
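
For readers following along: the traceback above suggests that `self.grain_column` was `None` when `make_predictions` tried to index the dataframe with it, which pandas rejects as a null key. Below is a minimal sketch of the general idea of treating the grain column as optional. It is not the code from the pull request or the linked commit; `build_results` and its parameters are hypothetical names used only for illustration.

```python
import pandas as pd


def build_results(dataframe, predictions, grain_column=None):
    """Hypothetical helper: assemble a results frame, adding the
    grain column only when one was actually configured."""
    # Start with the predictions alone, aligned to the input rows.
    results = pd.DataFrame({'Prediction': list(predictions)},
                           index=dataframe.index)

    # Only touch the grain column when it is set; indexing a
    # dataframe with None is what triggered the error above.
    if grain_column is not None:
        if grain_column not in dataframe.columns:
            raise KeyError(
                f"grain column {grain_column!r} not found in dataframe")
        results.insert(0, grain_column, dataframe[grain_column].values)

    return results


# Example data (made up for illustration).
df = pd.DataFrame({'PatientID': [1, 2], 'feature': [0.1, 0.2]})
without_grain = build_results(df, [0.3, 0.7])
with_grain = build_results(df, [0.3, 0.7], grain_column='PatientID')
```

With no grain column the result simply omits the identifier column, so callers that trained without one can still get predictions back.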

Aylr commented 6 years ago

Fixed in branch 372 due to urgent need. https://github.com/HealthCatalyst/healthcareai-py/commit/197b4499b6d6d9bc5de38c284c423a8bdf8428c5