doramir opened this issue 5 years ago
@doramir AFAIK this area is still largely unexplored. One of the proposed methods is in this paper: https://arxiv.org/abs/1809.03857 If you find something relevant on the topic, please share it as well.
Any development on this? I am also using XGBoost with lambdaMART as the objective to rank results for user queries, and I want to extract local feature importance with SHAP. Since SHAP does not do group-wise prediction, I took the approach of defining the explainer ONLY on the data containing the results from the query that includes the item I am trying to explain.
import shap

# data_for_one_query_df contains the results from the query that includes the item I am trying to explain
explainer = shap.TreeExplainer(model=model, data=data_for_one_query_df)

# get the SHAP values for item_to_explain_df, the single item I am trying to explain
shap_values = explainer.shap_values(item_to_explain_df)

# plot the local explanation
shap.force_plot(explainer.expected_value, shap_values, item_to_explain_df)
Do you think this approach is valid for local explanations?
Hello, I'm Jaspreet, one of the authors of the aforementioned paper. We recently published a couple of approaches for rankings: https://dl.acm.org/doi/pdf/10.1145/3351095.3375234 (for textual rankers) and https://arxiv.org/pdf/2004.13972.pdf (for more traditional learning to rank with manually constructed features).
We are in the process of releasing a package combining these approaches and the paper kretes mentioned. In the meantime, we are happy to provide any support on integrating these approaches into existing interpretability packages. Feel free to get in touch with me at singh@l3s.de
I'm using XGBoost with lambdaMART as the objective (rank:pairwise). The problem with this model is that its prediction is a ranking per group: the score it gives each item is only meaningful relative to the other items in the same group, and means nothing outside the group. As far as I know, SHAP does not do group-wise (listwise) prediction; it treats each item (row) as an individual rather than as part of a group, and tries to explain why the model gave that value to that item.
Is there a way to understand why the model gives this ranking to a group of items? Why it put item x before item y, and so on?
This problem is relevant to all ranking tasks, in XGBoost, LightGBM, and CatBoost alike.
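One pragmatic angle on "why x before y": within a group only score differences matter, and since SHAP values are additive, subtracting the two items' SHAP vectors decomposes the score margin f(x) - f(y) feature by feature (the shared base value cancels out). A small sketch with made-up SHAP vectors (in practice they would come from `explainer.shap_values(...)` on two items of the same query):

```python
import numpy as np

# Hypothetical SHAP vectors for two items from the same query.
shap_x = np.array([0.40, -0.10, 0.25])   # item ranked higher
shap_y = np.array([0.05,  0.15, 0.10])   # item ranked lower

# Both items share the same expected_value, so it cancels out and the
# per-feature SHAP difference decomposes the margin f(x) - f(y).
pairwise_attribution = shap_x - shap_y
margin = shap_x.sum() - shap_y.sum()

# The attributions sum exactly to the score gap between the two items;
# features with large positive entries are why x outranks y.
print(pairwise_attribution, margin)
```

This is not a listwise explanation of the whole ranking, but it directly answers the pairwise question for any two items in a group, and it works identically for XGBoost, LightGBM, and CatBoost rankers since all are supported by TreeExplainer.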