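Our second-stage model currently uses "naive double cross-validation": its CV folds are chosen without regard to the first-stage model's folds, which leads to indirect target leakage. We should base the second-stage model's CV folds on the folds from model 1!
At a minimum, we should do something like this: train the second-stage model on the out-of-sample predictions for folds 1-9 and use it to predict for fold 10, and likewise for every other fold.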
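A minimal sketch of the fold reuse proposed above, using only the standard library. Everything here is illustrative and not taken from our codebase: the helper names (`assign_folds`, `out_of_fold_predictions`, `fit_linear`, `predict_linear`), the toy one-feature data, and the least-squares line that stands in for both models are all assumptions. The point is only that both stages are driven by the same `fold_ids`.

```python
import random
from statistics import mean


def assign_folds(n_rows, n_folds, seed=0):
    """Return one fold id per row, shuffled, with near-equal fold sizes."""
    fold_ids = [i % n_folds for i in range(n_rows)]
    random.Random(seed).shuffle(fold_ids)
    return fold_ids


def out_of_fold_predictions(X, y, fold_ids, fit, predict):
    """For each fold k, fit on rows outside k and predict the rows inside k."""
    oof = [None] * len(y)
    for k in sorted(set(fold_ids)):
        train = [i for i, f in enumerate(fold_ids) if f != k]
        test = [i for i, f in enumerate(fold_ids) if f == k]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i, p in zip(test, predict(model, [X[i] for i in test])):
            oof[i] = p
    return oof


def fit_linear(xs, ys):
    """Hypothetical stand-in model: a one-feature least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (t - my) for x, t in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def predict_linear(model, xs):
    slope, intercept = model
    return [slope * x + intercept for x in xs]


# Toy data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(42)
X = [rng.uniform(0, 10) for _ in range(100)]
y = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in X]

# One fold assignment, shared by both stages.
fold_ids = assign_folds(len(X), n_folds=10)

# Stage 1: out-of-fold predictions from the raw feature.
stage1_oof = out_of_fold_predictions(X, y, fold_ids, fit_linear, predict_linear)

# Stage 2: the SAME fold_ids, so the model that scores fold k is trained
# only on stage-1 predictions for the other nine folds.
stage2_oof = out_of_fold_predictions(stage1_oof, y, fold_ids, fit_linear, predict_linear)
```

Note that this is only the "at least" version: the stage-1 predictions for folds 1-9 still come from models whose training data included fold 10, so a fully nested scheme would be stricter.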