Base Score - Wrong scale

First of all, thank you for providing such an exciting utility.

In the code, the base score is assumed to be on the logit scale (for example, when defining tree_leafs in XGBScorecardConstructor.construct_scorecard). However, the following MRE suggests that the base score is instead on the probability scale:

import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from xgboost import DMatrix
from xbooster.constructor import XGBScorecardConstructor

import numpy as np
from scipy.special import expit

data = load_breast_cancer()
X, y = data.data, data.target

# Build and fit classifier with only two trees
model = xgb.XGBClassifier(n_estimators=2, eval_metric='logloss')
model.fit(X, y)

# Retrieve base score using constructor class
scorecard_constructor = XGBScorecardConstructor(
    model, X, y
)
bscore = scorecard_constructor.base_score

# Return scores using individual trees
individual_preds = [tree.predict(DMatrix(X), output_margin=True) for tree in model.get_booster()]
# Hacky way to recover the base score: each single-tree prediction includes it once, so
# (base_score + score_T1) + (base_score + score_T2) - (base_score + score_T1 + score_T2) = base_score
# NOTE: the result is on the logit (raw margin) scale
base_score_matrix = sum(individual_preds) - model.predict(X, output_margin=True)
# Transform into probability scale
base_score_matrix = expit(base_score_matrix)

# Assertion
check_arr = np.isclose(base_score_matrix, bscore)
assert np.all(check_arr), "Base score not in probability scale"

The trick is to use only two trees, so that the base score appears twice in the sum of their individual predictions. Subtracting the full model's prediction on the raw score (logit) scale then leaves exactly one copy of the base score, again on the logit scale. I then compare it against the base score stored in the XGBScorecardConstructor class. The two only match after transforming the former to the probability scale, which is why the assertion does not fail. I would suggest adding this check to the test suite at some point.
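To make the arithmetic behind the trick easier to follow, here is a minimal sketch in plain Python that does not touch xgboost or xbooster at all. The base score of 0.63 and the per-tree scores are made-up numbers chosen for illustration, and the names (base_score_prob, tree_1, tree_2) are hypothetical. It assumes, as in the MRE above, that each single-tree prediction carries the base margin once.

```python
import math


def logit(p):
    """Map a probability to the logit (raw margin) scale."""
    return math.log(p / (1.0 - p))


def expit(z):
    """Map a logit back to the probability scale."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical base score, stored on the probability scale
base_score_prob = 0.63
base_margin = logit(base_score_prob)

# Made-up leaf scores of two trees for three samples (logit scale)
tree_1 = [0.30, -0.20, 0.10]
tree_2 = [-0.05, 0.15, 0.25]

# Assumption: each single-tree prediction includes the base margin once
slice_1 = [base_margin + s for s in tree_1]
slice_2 = [base_margin + s for s in tree_2]

# The full model adds the base margin only once
full = [base_margin + a + b for a, b in zip(tree_1, tree_2)]

# Sum of single-tree predictions minus the full prediction leaves one base margin
recovered_margin = [s1 + s2 - f for s1, s2, f in zip(slice_1, slice_2, full)]

# Back on the probability scale, this matches the stored base score
recovered_prob = [expit(m) for m in recovered_margin]

print(recovered_margin)
print(recovered_prob)
```

If the base score were read as a logit directly (0.63 instead of logit(0.63)), the leaf values built from it would be offset by the difference between the two, which is the mismatch this issue is about.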

Happy to help if this indeed needs some fixing 😄
