Closed spedygiorgio closed 5 years ago
How do you integrate the init_score with prediction in regression (gamma, l1, l2) for now?
Closed in favor of being in #2302. We decided to keep all feature requests in one place.
Welcome to contribute this feature! Please re-open this issue (or post a comment if you are not a topic starter) if you are actively working on implementing this feature.
Actually, it does not seem possible to seamlessly include `init_score` in prediction. It would be nice if the `predict` method could handle an `init_score`, when given. For example, I am boosting a model on top of an a priori prediction. First, I use a general model to calculate base predictions on the raw scale:

```python
# calculate the initial raw score
base_raw_score = lgb_general_model.predict(X, raw_score=True)
```
Then I (re)create the train, validation, and test sets:

```python
from sklearn.model_selection import train_test_split

# recreate the train, validation, and test sets; the raw scores are split alongside X and y
X_train, X_tmp, y_train, y_tmp, rw_train, rw_tmp = train_test_split(
    X, y, base_raw_score, test_size=0.3, stratify=y)
X_valid, X_test, y_valid, y_test, rw_valid, rw_test = train_test_split(
    X_tmp, y_tmp, rw_tmp, test_size=0.5, stratify=y_tmp)
del X_tmp, y_tmp, rw_tmp
```
Then I create the lgb Datasets and tune the model:

```python
import lightgbm as lgb

# define the lgb Dataset(s); each one carries its own init_score
lgb_full_categorical_predictors = (
    binarized_predictors_generic + categorical_predictors_generic
    + binarized_predictors_30 + categorical_predictors_30)
lgb_train_30 = lgb.Dataset(
    data=X_train.values, label=y_train.values,
    feature_name=X_train.columns.tolist(),
    categorical_feature=lgb_full_categorical_predictors,
    free_raw_data=False, init_score=rw_train)
lgb_valid_30 = lgb.Dataset(
    data=X_valid.values, label=y_valid.values, reference=lgb_train_30,
    feature_name=X_valid.columns.tolist(),
    categorical_feature=lgb_full_categorical_predictors,
    free_raw_data=False, init_score=rw_valid)
lgb_full_30 = lgb.Dataset(
    data=X.values, label=y.values, reference=lgb_train_30,
    feature_name=X.columns.tolist(),
    categorical_feature=lgb_full_categorical_predictors,
    free_raw_data=False, init_score=base_raw_score)

# tune the model
lgb_model_30 = lgb.train(
    params=lgb_general_params, train_set=lgb_train_30,
    valid_sets=[lgb_valid_30], early_stopping_rounds=10)
```
Then I calculate predictions (using the raw scores):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# function to get probabilities from raw scores
def sigmoid(x):
    """Map raw scores (log-odds) to probabilities with the logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

# predict using the raw score, adding back the initial raw score
raw_temp = lgb_model_30.predict(X_test, raw_score=True) + rw_test
proba = sigmoid(raw_temp)

# calculate performance
my_roc_auc_score = roc_auc_score(y_test, proba)
```
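
For the binary case, the manual step above can be written out as follows. The symbols $s_0(x)$ (the initial raw score from the general model) and $f(x)$ (the raw output of the boosted model) are introduced here only for illustration:

$$
\hat{p}(x) = \sigma\bigl(s_0(x) + f(x)\bigr), \qquad \sigma(z) = \frac{e^{z}}{1 + e^{z}} = \frac{1}{1 + e^{-z}}
$$

The sum has to be formed on the raw scale: in general $\sigma(s_0 + f) \neq \sigma(s_0) + \sigma(f)$, so adding the two models' probabilities would not give the same result.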
If `init_score` could be provided as an additional parameter to the `lgb_model_30.predict` method, I would avoid having to know the right transformation (what it is for gamma regression, Poisson regression, Box-Cox ones, ...) and having to perform the calculation on the probability scale manually.

Is it possible to integrate `init_score` into the `predict` method?
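
To make the request concrete, here is a minimal sketch of the kind of wrapper I currently have to write by hand. Everything in it is an assumption for illustration: `predict_with_init_score` and `INVERSE_LINKS` are hypothetical names, not part of LightGBM, and the objective-to-transformation table reflects my understanding (logistic for `binary` with the default `sigmoid=1.0`, `exp` for `poisson` and `gamma`, identity for l1/l2 regression).

```python
import numpy as np

# Hypothetical lookup table (not a LightGBM API): objective name -> function
# mapping a raw score to the output scale. Assumes default objective settings,
# e.g. sigmoid=1.0 for the binary objective.
INVERSE_LINKS = {
    "binary": lambda s: 1.0 / (1.0 + np.exp(-s)),  # logistic function
    "poisson": np.exp,                             # log link
    "gamma": np.exp,                               # log link
    "regression": lambda s: s,                     # l2, identity
    "regression_l1": lambda s: s,                  # l1, identity
}


def predict_with_init_score(model, X, init_score, objective):
    """Add init_score to the model's raw scores, then map to the output scale.

    `model` is any object exposing predict(X, raw_score=True), such as a
    trained lgb.Booster. `objective` must be a key of INVERSE_LINKS.
    """
    raw = np.asarray(model.predict(X, raw_score=True), dtype=float)
    total = raw + np.asarray(init_score, dtype=float)
    return INVERSE_LINKS[objective](total)
```

With the objects from my example, the manual prediction step would become `proba = predict_with_init_score(lgb_model_30, X_test, rw_test, "binary")`. Having `predict` accept `init_score` directly would remove the need for the user to maintain such a table.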
This issue is related to #1778 and #1969