Closed by thatlittleboy 1 year ago
The reason is that __call__() converts the DataFrame to a NumPy array too eagerly (see L213), before passing it to the shap_values method in L218.
When lightgbm receives a numpy array, it just assumes it is an array of floats.
lightgbm actually has a dedicated function, _data_from_pandas, that prepares an input pandas DataFrame for the LightGBM model, but it is only called when lightgbm receives a pandas DataFrame.
In particular, categoricals are carefully encoded in this lightgbm function, and we skip that encoding if we just call X.values directly in L213.
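The effect can be seen in a minimal sketch that uses only pandas and NumPy. The toy columns are made up for illustration, and the encoding loop is only a rough stand-in for the kind of preparation described above, not LightGBM's actual code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: one numeric column and one categorical column.
X = pd.DataFrame(
    {
        "age": [25.0, 32.0, 47.0],
        "city": pd.Categorical(["paris", "tokyo", "paris"]),
    }
)

# Eager conversion: the mixed-dtype frame collapses to an object array,
# so the category dtype and its integer codes are lost.
raw = X.values
print(raw.dtype)  # object

# Casting that array to float fails because the labels are plain strings.
try:
    raw.astype(np.float64)
    cast_ok = True
except ValueError:
    cast_ok = False

# Rough stand-in (an assumption, not LightGBM's code) for DataFrame-aware
# preparation: swap each categorical column for its integer codes first.
encoded = X.copy()
for col in encoded.select_dtypes(include="category").columns:
    encoded[col] = encoded[col].cat.codes
prepared = encoded.to_numpy(dtype=np.float64)
```

Keeping X as a DataFrame until it reaches the model leaves this encoding step to the library instead of discarding the category information up front.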
I will push a fix later this week.
Related issue slundberg#2144.
LightGBM errors out if we call explainer(X), but does not error if we call explainer.shap_values(X), even though explainer(X) itself calls explainer.shap_values(X) internally.
Reproducible example
Error message:
Expected result is to not throw an error.