MLBazaar / AutoBazaar

AutoBazaar: An AutoML System from the Machine Learning Bazaar
https://mlbazaar.github.io/AutoBazaar/
MIT License

Uncaught ValueError using imputer on grub_damage #6

Closed by micahjsmith 5 years ago

micahjsmith commented 5 years ago

Uncaught ValueError when running search on sample dataset.

$ abz search -b 3 LL0_1026_grub_damage               
Using TensorFlow backend.                                                                                
Processing Datasets: ['LL0_1026_grub_damage']       
###################                                                                                      
#### Searching ####                                 
###################                                                                                      
WARNING: Logging before flag parsing goes to stderr. 
E0826 17:46:01.013370 4661482944 mlpipeline.py:223] Exception caught fitting MLBlock sklearn.preprocessing.Imputer#1
Traceback (most recent call last):                  
  File "/usr/local/miniconda3/envs/autobazaar/lib/python3.6/site-packages/mlblocks/mlpipeline.py", line 221, in fit
    block.fit(**fit_args)                                                                                
  File "/usr/local/miniconda3/envs/autobazaar/lib/python3.6/site-packages/mlblocks/mlblock.py", line 246, in fit                
    getattr(self.instance, self.fit_method)(**fit_args)
  File "/usr/local/miniconda3/envs/autobazaar/lib/python3.6/site-packages/sklearn/preprocessing/imputation.py", line 158, in fit
    force_all_finite=False)                                                                              
  File "/usr/local/miniconda3/envs/autobazaar/lib/python3.6/site-packages/sklearn/utils/validation.py", line 527, in check_array
    array = np.asarray(array, dtype=dtype, order=order)
  File "/usr/local/miniconda3/envs/autobazaar/lib/python3.6/site-packages/numpy/core/numeric.py", line 501, in asarray
    return array(a, dtype, copy=False, order=order) 
ValueError: could not convert string to float: '6f'

csala commented 5 years ago

Thanks for reporting this @micahjsmith

However, this error message is expected and has no consequences beyond the report itself. It happens because the encode argument from Featuretools is a tunable hyperparameter: when it is False and the original dataset contains strings, the next primitive (here sklearn.preprocessing.Imputer) fails because it cannot convert those strings to floats.
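
To illustrate the failure mode described above, here is a minimal sketch in plain Python. It is not AutoBazaar, Featuretools, or scikit-learn code: `to_float_column` and `ordinal_encode` are hypothetical helpers, the column values are made up, and ordinal encoding is only a stand-in for whatever encoding the real primitive applies. It shows that unencoded strings fail at float conversion with the same kind of message as in the traceback, while encoded values pass.

```python
def to_float_column(values):
    """Convert every value to float, as a numeric-only step would require."""
    return [float(v) for v in values]


def ordinal_encode(values):
    """Hypothetical encoding step: map each distinct string to an integer code."""
    codes = {}
    for v in values:
        codes.setdefault(v, len(codes))
    return [codes[v] for v in values]


raw_column = ["6f", "3a", "6f", "9c"]  # made-up categorical strings

# Encoding disabled: strings reach the numeric step unchanged, so conversion fails.
try:
    to_float_column(raw_column)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Encoding enabled: strings become numbers first, so conversion succeeds.
encoded_column = to_float_column(ordinal_encode(raw_column))
```

In this sketch, `error_message` holds the text of the ValueError raised for the first unencoded string, and `encoded_column` is a list of floats that a numeric step could consume.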

The traceback itself comes from the default MLBlocks logging, which reports any exception raised by a primitive.
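
As a rough sketch of how a logged traceback can coexist with a search that keeps going, consider the pattern below. This is an assumption about the general shape, not the actual MLBlocks or AutoBazaar implementation: `fit_block`, `search`, the logger name, and the toy candidates are all hypothetical. The inner function logs the full traceback and re-raises; the outer loop records the failed candidate and moves on.

```python
import logging

LOGGER = logging.getLogger("pipeline_sketch")  # hypothetical logger name


def fit_block(name, fit_fn, data):
    """Fit one step; log the full traceback if it fails, then re-raise."""
    try:
        return fit_fn(data)
    except Exception:
        # Logger.exception logs at ERROR level and appends the traceback.
        LOGGER.exception("Exception caught fitting block %s", name)
        raise


def search(candidates, data):
    """Try each candidate; a failing one is recorded as None and skipped."""
    results = {}
    for name, fit_fn in candidates.items():
        try:
            results[name] = fit_block(name, fit_fn, data)
        except Exception:
            results[name] = None  # failed candidate; the search continues
    return results


candidates = {
    # Fails on string input, like a numeric-only step without encoding.
    "no_encoding": lambda col: [float(v) for v in col],
    # Toy stand-in for an encoding step: replace each string by its length.
    "with_encoding": lambda col: [float(len(v)) for v in col],
}
results = search(candidates, ["6f", "3a"])
```

Running this prints a traceback for the failing candidate to stderr, yet `search` still returns a result for the candidate that succeeded, which matches the behaviour described in this thread: the error is reported but does not stop the run.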

micahjsmith commented 5 years ago

I see, thanks for the explanation!