danielhomola / mifs

Parallelized Mutual Information based Feature Selection module.
BSD 3-Clause "New" or "Revised" License

HELP!! All-NaN slice encountered #15

Open kroscek opened 7 years ago

kroscek commented 7 years ago

Using continuous variables, I encountered an error that reports an all-NaN slice.

File "", line 1, in feat_selector.fit(X_train, y_train)

File "/home/lemma/miniconda2/lib/python2.7/site-packages/mifs-0.0.1.dev0-py2.7.egg/mifs/mifs.py", line 149, in fit return self._fit(X, y) File "/home/lemma/miniconda2/lib/python2.7/site-packages/mifs-0.0.1.dev0-py2.7.egg/mifs/mifs.py", line 223, in _fit S, F = self._add_remove(S, F, bn.nanargmax(xy_MI))

ValueError: All-NaN slice encountered. My data looks like: image

Not sure what's going on, although the provided code examples ran perfectly.
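
For reference, a minimal sketch of the kind of setup described above (the class name comes from the mifs README; the `method` and `categorical` parameters are assumed names, and the random data is purely illustrative):

```python
import numpy as np
import mifs

# Hypothetical data in the same spirit as the report: continuous features
# and a continuous target.
rng = np.random.RandomState(0)
X_train = rng.rand(200, 20)
y_train = rng.rand(200)

# Class name taken from the mifs README; `method` and `categorical` are
# assumed parameter names (categorical=False for a continuous target).
feat_selector = mifs.MutualInformationFeatureSelector(method='JMI', categorical=False)
feat_selector.fit(X_train, y_train)  # fit() is where the traceback above points
```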

bnder321 commented 7 years ago

I also encountered the same problem when choosing the JMI option. Can anyone help?

danielhomola commented 6 years ago

Can you please try the latest version of the code and report back if you still encounter the bug? Thanks!

vafisher commented 6 years ago

Hi, I am encountering this same bug in a newly installed version, under both Python 3.6.2 and 2.7.13. Any suggestions for how I might troubleshoot? Thanks!

doomers commented 6 years ago

I am getting the same error under Python 3.6.3 and the latest version of mifs. What could I possibly be doing wrong? Thanks.

bioinfoMMS commented 6 years ago

I am also receiving the same error with the latest version of mifs, and it appears that the xy_MI matrix is being filled in with NaNs, because the MI values from the _mi_cc function are all negative. Any help to fix this would be much appreciated, thank you!
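
To illustrate the failure mode, here is a simplified sketch (not the exact mifs internals; the masking of non-positive MI estimates to NaN is assumed from the behaviour described above):

```python
import numpy as np

# Hypothetical MI estimates for three candidate features, all negative.
xy_MI = np.array([-0.01, -0.003, -0.2])

# Assumed masking step: non-positive estimates end up as NaN in xy_MI.
xy_MI[xy_MI <= 0] = np.nan

# With every entry NaN, nanargmax has nothing to pick and raises
# "ValueError: All-NaN slice encountered" (numpy's nanargmax shown here;
# the traceback above shows bottleneck's bn.nanargmax, same message).
np.nanargmax(xy_MI)
```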

imranahmed96 commented 6 years ago

^ Likewise, also working with a continuous target variable.

Most likely it's an error in the following source gist: https://gist.github.com/GaelVaroquaux/ead9898bd3c973c40429

danielhomola commented 6 years ago

Does anyone have some time to actually submit a PR for this? I don't have time to look into it.

imranahmed96 commented 6 years ago

My guess is as follows:

```python
return (d*np.mean(np.log(r + np.finfo(X.dtype).eps))
```

If the values in the `r` array are < 1 (which is likely if X is high-dimensional and the inputs are sparse), then the expression above could return a negative number. This may be completely wrong, though...
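
A quick numeric illustration of that guess (hypothetical distances, not values taken from the gist):

```python
import numpy as np

d = 5                                    # assumed dimensionality of X
r = np.array([0.01, 0.02, 0.005, 0.03])  # hypothetical nearest-neighbour distances, all < 1
eps = np.finfo(np.float64).eps

# log of values below 1 is negative, so this term drags the estimate negative.
print(d * np.mean(np.log(r + eps)))      # roughly -21.7
```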

bhpiamnothing commented 4 years ago

I have the same problem. Can someone tell me what to do?

Icebreaker001 commented 4 years ago

I also have the same problem. Can someone tell me what to do?

Have you solved this problem? Thanks.

Jiang2019Code commented 3 years ago

I used a replacement approach to work around this problem:

```python
import numpy as np

vectorArray = np.concatenate(vectorList, axis=0)                # your array
whereNan = np.where(np.isnan(np.sum(vectorArray, axis=0)))      # find where the NaNs are
vectorArray[np.isnan(vectorArray)] = 0                          # replace the NaNs
vectorResult = np.nanargmax(vectorArray, axis=0).astype(float)  # do something
vectorResult[whereNan] = np.nan                                 # put the NaNs back
```

ddofer commented 3 years ago

Same issue with the latest version of mifs and sklearn, discrete (0/1) target. Input features are a mix of continuous and discrete, after scaling and NaN imputation. It happens when using any of the criteria ("JMI", "JMIM", "MRMR").

callmedrcom commented 2 years ago

I think this problem comes from the calculation of mutual information, which may originate from this code: https://github.com/mutualinfo/mutual_info/blob/main/mutual_info/mutual_info.py. Replacing the relevant function will solve this problem.
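
A rough sketch of that "swap out the estimator" idea. The module path `mifs.mi` and the estimator name `_mi_cc` are assumptions based on this thread, not verified against the installed package, and clamping negatives to zero is only a crude stand-in for plugging in a different MI implementation:

```python
import mifs.mi as mi_module  # assumed module path; adjust to the actual layout

_original_mi_cc = mi_module._mi_cc

def _patched_mi_cc(*args, **kwargs):
    # Either delegate to a different k-NN MI estimator here (e.g. the one in
    # the repo linked above), or, as a crude workaround, clamp negative
    # estimates to zero so xy_MI is never entirely NaN.
    return max(_original_mi_cc(*args, **kwargs), 0.0)

mi_module._mi_cc = _patched_mi_cc  # patch before constructing and fitting the selector
```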