bgreenwell / fastshap

Fast approximate Shapley values in R
https://bgreenwell.github.io/fastshap/
112 stars 18 forks source link

is it possible to extract the intercept / bias / base value with fastshap? #33

Closed marboe123 closed 2 years ago

marboe123 commented 2 years ago

Hello,

Thank you for the nice package!

Is it also possible to extract the intercept (or ''bias'' or ''base value'') in some way with fastshap? Or is it possible to calculate these from the shap values as extracted with explain?

Thanks a lot!

bgreenwell commented 2 years ago

Hi @marboe123, thank you for the note. It is not currently possible to extract the bias term directly, but if you set adjust = TRUE, then you can ensure that the row sums correspond to f(x) - mean(response); from this, you can obtain the associated bias term if you wish. See ?fastshap::explain for deets on the adjust argument. Does that help with your question? Feel free to post a reproducible example if you need more help.

marboe123 commented 2 years ago

Hi Brandon @bgreenwell , thank you for your explanation.

I think I understand. Although can it be correct that there is no difference between setting adjust = TRUE and adjust = FALSE in the example below?

library(xgboost)
library(vip)
library(fastshap)
library(SHAPforxgboost) #to load the dataX

y_var <-  "diffcwv"
dataX <- as.matrix(dataXY_df[,-..y_var])

# hyperparameter tuning results
param_list <- list(objective = "reg:squarederror",  # For regression
                   eta = 0.02,
                   max_depth = 10,
                   gamma = 0.01,
                   subsample = 0.95
)
mod <- xgboost::xgboost(data = dataX, 
                        label = as.matrix(dataXY_df[[y_var]]), 
                        params = param_list, nrounds = 10,
                        verbose = FALSE, nthread = parallel::detectCores() - 2,
                        early_stopping_rounds = 8)

# Compute shapley values 
shap1 <- explain(mod, X = dataX, exact = TRUE, nsim = 1000, adjust = FALSE)
shap2 <- explain(mod, X = dataX, exact = TRUE, nsim = 1000, adjust = TRUE)

shap1 <- as.data.frame(shap1)
shap2 <- as.data.frame(shap2)

difference <- setdiff(shap1, shap2)
bgreenwell commented 2 years ago

Correct, the adjustment only applies to the sampling procedure that is used when no exact method exists.

marboe123 commented 2 years ago

I have tried to implement the f(x) - mean(response) below as you mentioned, although I do not succeed to obtain the associated bias term.

library(xgboost)
library(vip)
library(fastshap)
library(SHAPforxgboost) #to load the dataX

y_var <-  "diffcwv"
dataX <- as.matrix(dataXY_df[,-..y_var])

# hyperparameter tuning results
param_list <- list(objective = "reg:squarederror",  # For regression
                   eta = 0.02,
                   max_depth = 10,
                   gamma = 0.01,
                   subsample = 0.95
)
mod <- xgboost::xgboost(data = dataX, 
                        label = as.matrix(dataXY_df[[y_var]]), 
                        params = param_list, nrounds = 10,
                        verbose = FALSE, nthread = parallel::detectCores() - 2,
                        early_stopping_rounds = 8)

# Compute shapley values 
shap2 <- explain(mod, X = dataX, exact = TRUE, nsim = 1000, adjust = TRUE)

row_sums <- rowSums(shap2)
shap2 <- cbind(shap2, row_sums)

pred <- predict(mod, dataX)

meanpred <- mean(dataXY_df$diffcwv)
bgreenwell commented 2 years ago

Hi @marboe123, you're close, but it's the average training prediction, not mean response. I made some adjustments to your example to show how to get the bias term and comfirm the results by comparing directly to XGBoost:

library(xgboost)
library(fastshap)
library(SHAPforxgboost) #to load the dataX

y_var <-  "diffcwv"
dataX <- as.matrix(dataXY_df[,-..y_var])

# hyperparameter tuning results
param_list <- list(objective = "reg:squarederror",  # For regression
                   eta = 0.02,
                   max_depth = 10,
                   gamma = 0.01,
                   subsample = 0.95
)
mod <- xgboost(data = dataX, label = as.matrix(dataXY_df[[y_var]]), 
               params = param_list, nrounds = 10, verbose = FALSE, 
               nthread = parallel::detectCores() - 2, early_stopping_rounds = 8)

# Grab SHAP values directly from XGBoost
shap <- predict(mod, newdata = dataX, predcontrib = TRUE)

# Compute shapley values 
shap2 <- explain(mod, X = dataX, exact = TRUE, adjust = TRUE)

# Compute bias term; difference between predictions and sum of SHAP values
pred <- predict(mod, newdata = dataX)
head(bias <- pred - rowSums(shap2))

# Compare to output from XGBoost
head(shap[, "BIAS"])

# Check that SHAP values sum to the difference between pred and mean(pred)
head(cbind(rowSums(shap2), pred - mean(pred)))
marboe123 commented 2 years ago

Hi Brandon @bgreenwell , thanks a lot for your explanation and adjusting the code! The concept is much more clear to me now.