In the current version, eli5 does not allow NaNs in X. However, some models handle NaNs natively, e.g. XGBClassifier or HistGradientBoostingClassifier from sklearn.
The following code reproduces the issue:
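import numpy as np
from eli5.sklearn import PermutationImportance
from sklearn.datasets import load_iris
# Needed only on scikit-learn versions where HistGradientBoostingClassifier is experimental
from sklearn.experimental import enable_hist_gradient_boosting  # noqa: F401
from sklearn.ensemble import HistGradientBoostingClassifier

X, y = load_iris(return_X_y=True)
X[0, 0] = np.nan  # introduce a single missing value
perm = PermutationImportance(HistGradientBoostingClassifier(), cv=5)
perm.fit(X, y)  # raises because X contains NaN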
I think PermutationImportance should allow NaNs: if the wrapped classifier does not support them, it will raise the same error anyway.
I would change the following line in eli5.sklearn.permutation_importance:
197: X = check_array(X)
into:
197: X = check_array(X, force_all_finite='allow-nan')
This would require raising the scikit-learn requirement to 0.20.0.
An alternative that does not modify the requirements (note that it also accepts infinite values, not only NaNs):
197: X = check_array(X, force_all_finite=False)
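To make the difference between the two options concrete, here is a minimal pure-Python sketch of the three finiteness modes. The helpers `check_finite` and `accepts` are hypothetical stand-ins written for illustration only. They are not scikit-learn's implementation, and they only mimic the documented meaning of `force_all_finite` (True, 'allow-nan', False).

```python
import math


def check_finite(rows, force_all_finite=True):
    """Hypothetical stand-in for the finiteness check; NOT scikit-learn code."""
    for row in rows:
        for value in row:
            # Infinity is rejected unless the check is disabled entirely.
            if math.isinf(value) and force_all_finite is not False:
                raise ValueError("Input contains infinity.")
            # NaN is rejected only in the strict (True) mode.
            if math.isnan(value) and force_all_finite is True:
                raise ValueError("Input contains NaN.")
    return rows


def accepts(rows, mode):
    """Return True if `rows` passes the check in the given mode."""
    try:
        check_finite(rows, force_all_finite=mode)
        return True
    except ValueError:
        return False


with_nan = [[float("nan"), 1.0]]
with_inf = [[float("inf"), 1.0]]

for mode in (True, "allow-nan", False):
    print(mode, accepts(with_nan, mode), accepts(with_inf, mode))
```

Under these assumptions, 'allow-nan' is the narrower change: it lets NaNs through to the estimator while still rejecting infinite values, whereas False skips both checks.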
As part of the issue, I would also like to implement a quick unit test.