Ability to pass separate training data sets in for xgboost model vs survival model

I'm wondering whether you've considered letting the user pass in separate training sets for the xgboost model and the survival model?

For example, in XGBSEStackedWeibull, the current state is this:

1. Train xgboost on X_train, y_train
2. Predict back on X_train using the model from (1), resulting in risk scores
3. Train Weibull AFT model with risk scores from (2) and y_train

I'm proposing this:

1. Train xgboost on X_train, y_train
2. Predict risk scores of X_train_2 using the model from (1)
3. Train Weibull AFT model using risk scores from (2) and y_train_2

The rationale for using different datasets for the two models is that it reduces the chance of overfitting. I've found that the risk scores from step 2 show a tighter relationship with y_train than really exists, because we are predicting back on the same data the xgboost model was trained on and then relating those predictions to the original outcome variable, y_train.
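
A toy illustration of the effect, using only the standard library. This is not xgboost or xgbse code: a 1-nearest-neighbour memoriser stands in for an overfit first-stage model, and the data, `make_data`, `predict_1nn`, and `pearson` are all made up for the example. The point is only that scores predicted back on the training data track the outcome far more closely than scores predicted on a separate set.

```python
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def predict_1nn(train_x, train_y, query_x):
    """Return the outcome of the nearest training point for each query point."""
    preds = []
    for q in query_x:
        nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - q))
        preds.append(train_y[nearest])
    return preds


def make_data(n):
    """Outcome with a weak dependence on x and a lot of noise."""
    x = [random.random() for _ in range(n)]
    y = [0.3 * xi + random.gauss(0, 1) for xi in x]
    return x, y


random.seed(0)
x_1, y_1 = make_data(300)  # plays the role of X_train, y_train
x_2, y_2 = make_data(300)  # plays the role of X_train_2, y_train_2

# Current workflow: score the same data the first-stage model was fit on.
in_sample = pearson(predict_1nn(x_1, y_1, x_1), y_1)
# Proposed workflow: score a separate set.
held_out = pearson(predict_1nn(x_1, y_1, x_2), y_2)

print(f"in-sample correlation: {in_sample:.3f}")
print(f"held-out correlation:  {held_out:.3f}")
```

A second-stage model fit on the in-sample scores would learn from that inflated relationship, which is the concern described above.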

Thanks for the awesome package
