TeamHG-Memex / eli5

A library for debugging/inspecting machine learning classifiers and explaining their predictions
http://eli5.readthedocs.io
MIT License

This is a bug on xgboost. #392

Open Columbine21 opened 3 years ago

Columbine21 commented 3 years ago

Description: The XGBClassifier explainer fails when n_estimators=1 (an extreme case). E.g. model_t = XGBClassifier(random_state=1111, max_depth=4, n_estimators=1)

show_prediction(model, test_input[tougue_correct_q[0]]) fails to run.

The error message is as follows:

[Screenshot 2020-10-21, 9:18:32 AM]

pjgao commented 3 years ago

Same error here. The problem lies in:

def _prediction_feature_weights(booster, dmatrix, n_targets,
                                feature_names, xgb_feature_names):
    """ For each target, return score and numpy array with feature weights
    on this prediction, following an idea from
    http://blog.datadive.net/interpreting-random-forests/
    """
    # XGBClassifier does not have pred_leaf argument, so use booster
    leaf_ids, = booster.predict(dmatrix, pred_leaf=True)

booster.predict(dmatrix, pred_leaf=True) returns a 1-D array when the XGBoost model contains only one tree, so `leaf_ids, = ...` no longer unpacks to an array of per-tree leaf ids.
This could be changed to:

def _prediction_feature_weights(booster, dmatrix, n_targets,
                                feature_names, xgb_feature_names):
    """ For each target, return score and numpy array with feature weights
    on this prediction, following an idea from
    http://blog.datadive.net/interpreting-random-forests/
    """
    # XGBClassifier does not have pred_leaf argument, so use booster
    leaf_ids, = booster.predict(dmatrix, pred_leaf=True).reshape(1, -1)

pjgao commented 3 years ago

Now the question is: does dmatrix contain only one sample?
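
To make the single-sample question concrete, here is a minimal sketch of a more defensive unpacking. It uses simulated NumPy arrays in place of the real booster.predict(dmatrix, pred_leaf=True) output, and the helper name unpack_leaf_ids and its n_trees parameter are hypothetical, not part of eli5 or xgboost. It assumes the behavior reported above (the tree axis is dropped when there is one tree). Reshaping to (-1, n_trees) instead of (1, -1) means a dmatrix with several samples raises an error rather than being silently read as several trees.

```python
import numpy as np


def unpack_leaf_ids(pred_leaf_output, n_trees):
    """Return the per-tree leaf ids for a single-sample prediction.

    pred_leaf_output stands in for booster.predict(dmatrix, pred_leaf=True);
    this helper is a hypothetical illustration, not eli5 code.
    """
    arr = np.asarray(pred_leaf_output)
    # Force shape (n_samples, n_trees), whether or not the tree axis was dropped.
    arr = arr.reshape(-1, n_trees)
    if arr.shape[0] != 1:
        raise ValueError("expected exactly one sample, got %d" % arr.shape[0])
    leaf_ids, = arr
    return leaf_ids


# Simulated outputs for one sample (no xgboost needed to see the shape problem).
one_tree = np.array([5])             # shape (1,): tree axis dropped
three_trees = np.array([[5, 2, 7]])  # shape (1, 3)

scalar, = one_tree  # the original unpacking yields a NumPy scalar here
print(np.ndim(scalar))
print(unpack_leaf_ids(one_tree, n_trees=1).shape)
print(unpack_leaf_ids(three_trees, n_trees=3).shape)
```

With reshape(1, -1), a two-sample, one-tree output such as np.array([5, 3]) would become shape (1, 2) and look like one sample with two trees; the sketch above rejects that input instead.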