danielhomola / mifs

Parallelized Mutual Information based Feature Selection module.
BSD 3-Clause "New" or "Revised" License

ValueError: numpy.nanargmax raises on a.size==0 and axis=None; So Bottleneck too. #25

Open samwisehawkins opened 5 years ago

samwisehawkins commented 5 years ago

Any attempt to call fit() raises the following error: ValueError: numpy.nanargmax raises on a.size==0 and axis=None; So Bottleneck too.

import numpy as np
import mifs

X = np.random.random(size=100).reshape((25,4))*100
y = np.random.random(size=25)*100

print(X.shape, y.shape)
sel = mifs.MutualInformationFeatureSelector(method='JMI', categorical=False)
sel.fit(X, y)

(25, 4) (25,)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-87-47bba24c3c23> in <module>()
  7 print(X.shape, y.shape)
  8 sel = mifs.MutualInformationFeatureSelector(method='JMI', categorical=False)
----> 9 sel.fit(X, y)

c:\users\a7slha\code\mifs\mifs\mifs.py in fit(self, X, y)
147             self.n_jobs = NUM_CORES - self.n_jobs
148 
--> 149         return self._fit(X, y)
150 
151 

c:\users\a7slha\code\mifs\mifs\mifs.py in _fit(self, X, y)
242             fmm = feature_mi_matrix[:len(S), F]
243             if self.method == 'JMI':
--> 244                 selected = F[bn.nanargmax(bn.nansum(fmm, axis=0))]
245             elif self.method == 'JMIM':
246                 if bn.allnan(bn.nanmin(fmm, axis = 0)):

ValueError: numpy.nanargmax raises on a.size==0 and axis=None; So Bottleneck too.

samwisehawkins commented 5 years ago

OK, I realise this is because I did not specify n_features. But perhaps this should be raised earlier and less cryptically?
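
One way an earlier, clearer error could look is sketched below. This is only an illustration: check_n_features is a hypothetical helper, not part of mifs, and it assumes n_features is either the string 'auto' or a positive integer no larger than the number of columns in X.

```python
import numpy as np


def check_n_features(n_features, X):
    """Hypothetical pre-flight check (not part of mifs).

    Returns n_features unchanged if it is 'auto', or as an int if it is a
    valid feature count for X; otherwise raises a descriptive ValueError.
    """
    p = X.shape[1]  # number of columns (candidate features) in X
    if isinstance(n_features, str):
        if n_features == 'auto':
            return n_features
        raise ValueError(f"n_features must be 'auto' or a positive integer, got {n_features!r}.")
    is_int = isinstance(n_features, (int, np.integer)) and not isinstance(n_features, bool)
    if not is_int or n_features < 1:
        raise ValueError(f"n_features must be 'auto' or a positive integer, got {n_features!r}.")
    if n_features > p:
        raise ValueError(f"n_features={n_features} exceeds the {p} columns available in X.")
    return int(n_features)


X = np.zeros((25, 4))
ok = check_n_features(3, X)          # valid request
auto = check_n_features('auto', X)   # passed through untouched

try:
    check_n_features(10, X)          # more features than columns
    message = None
except ValueError as exc:
    message = str(exc)
```

A check of this kind would run before the selection loop starts, so a request for more features than X has columns fails with a message that names the parameter instead of an error from nanargmax.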

qnoirhomme commented 5 years ago

I had a similar error. When the number of features to keep is not specified, n_features is set to np.inf at line 233 in mifs.py, and n_features is what stops the main loop. The loop therefore keeps running until no candidate features remain, and it crashes right after selecting the last available feature. This could perhaps be corrected by changing line 233 to: if self.n_features == 'auto': n_features = p
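
The mechanism described above can be reproduced with NumPy alone. The sketch below is a toy greedy loop, not the mifs code: greedy_select and the scores list are made up for illustration, and np.nanargmax stands in for the bn.nanargmax call in the traceback. It shows that an infinite stopping bound exhausts the candidates and raises ValueError, while capping the bound at the number of features p lets the loop finish.

```python
import numpy as np


def greedy_select(scores, n_features):
    """Toy greedy selector (hypothetical, not the mifs implementation).

    Repeatedly picks the highest-scoring remaining feature until
    n_features have been selected.
    """
    remaining = list(range(len(scores)))
    selected = []
    while len(selected) < n_features:
        candidate_scores = np.asarray([scores[i] for i in remaining], dtype=float)
        # Raises ValueError once `remaining` is empty.
        best = remaining[int(np.nanargmax(candidate_scores))]
        selected.append(best)
        remaining.remove(best)
    return selected


scores = [0.2, 0.9, 0.5, 0.1]
p = len(scores)

# With an infinite bound the loop runs past the last feature and fails.
try:
    greedy_select(scores, np.inf)
    failed = False
except ValueError:
    failed = True

# Capping the bound at p stops the loop after the last feature.
capped = greedy_select(scores, min(np.inf, p))
```

Note that capping at p in this toy version always selects every feature; whether that is the desired behaviour for 'auto' in mifs is a separate design question that the thread leaves open.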