loft-br / xgboost-survival-embeddings

Improving XGBoost survival analysis with embeddings and debiased estimators
https://loft-br.github.io/xgboost-survival-embeddings/
Apache License 2.0

Ensembling of predictions #59

Open andwurl opened 1 year ago

andwurl commented 1 year ago

Dear xgbse-team,

What would be the correct way to ensemble predictions? Let's say I have 5 StackedWeibull models and would like to ensemble their predictions on a test dataset. Should I average the interval predictions?

Thank you very much.

davivieirab commented 1 year ago

Hello @andwurl. Are those 5 models trained on the same dataset (or on bootstrap samples from it), i.e. as a bagging estimator? If that is the case, I propose you use our abstraction XGBSEBootstrapEstimator. It is a bagging estimator that receives a base model and the number of models to train; since we already have it implemented, we suggest using it. If you want to know more: the aggregation of predictions on a test dataset is an average of the survival curves predicted by the individual models, with confidence intervals taken from the empirical quantiles across them.

You can find examples of how to use XGBSEBootstrapEstimator in our "how_xgbse_works" notebook.

Code example (read the beginning of the notebook for the necessary import statements and the constants/parameters used below). The example uses XGBSEDebiasedBCE as the base_model, but the same approach works with XGBSEStackedWeibull:

from xgbse import XGBSEBootstrapEstimator, XGBSEDebiasedBCE

# base model as BCE
base_model = XGBSEDebiasedBCE(PARAMS_XGB_AFT, PARAMS_LR)

# bootstrap meta estimator
bootstrap_estimator = XGBSEBootstrapEstimator(base_model, n_estimators=20)

# fitting the meta estimator
bootstrap_estimator.fit(
    X_train,
    y_train,
    validation_data=(X_valid, y_valid),
    early_stopping_rounds=10,
    time_bins=TIME_BINS,
)

# predicting: mean survival curves plus upper/lower confidence
# bounds aggregated across the bootstrap models
mean, upper_ci, lower_ci = bootstrap_estimator.predict(X_test, return_ci=True)
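
If your 5 StackedWeibull models were already trained separately (so you can't simply refit through XGBSEBootstrapEstimator), below is a minimal sketch of doing the same aggregation by hand. It assumes all models were fitted with the same time_bins, and it reuses the notebook's PARAMS_XGB_AFT, TIME_BINS, and train/test splits; the five-model loop and the 95% band width are illustrative choices, not library API:

import numpy as np
from xgbse import XGBSEStackedWeibull

# hypothetical setup: five independently trained models
# on the same data and the same time grid
models = [
    XGBSEStackedWeibull(PARAMS_XGB_AFT).fit(X_train, y_train, time_bins=TIME_BINS)
    for _ in range(5)
]

# each predict() returns a DataFrame of survival probabilities
# (rows: test samples, columns: the shared time bins)
preds = [model.predict(X_test) for model in models]

# ensemble by elementwise averaging of the survival curves
mean_surv = sum(preds) / len(preds)

# optional uncertainty bands from empirical quantiles across the models,
# analogous to what return_ci=True gives you on the bootstrap estimator
stacked = np.stack([p.values for p in preds])    # (n_models, n_samples, n_bins)
upper_ci = np.percentile(stacked, 97.5, axis=0)
lower_ci = np.percentile(stacked, 2.5, axis=0)

Averaging the predicted survival curves themselves (rather than any internal model parameters) works because every model predicts on the common TIME_BINS grid, which is also why the models must share the same time_bins.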