Currently, the measure of generalization error used in `summary()` is the Pearson correlation between the pooled cross-validated predictions and the true values. However, scikit-learn warns against doing exactly this:
> **Note on inappropriate usage of cross_val_predict**
>
> The result of `cross_val_predict` may be different from those obtained using `cross_val_score` as the elements are grouped in different ways. The function `cross_val_score` takes an average over cross-validation folds, whereas `cross_val_predict` simply returns the labels (or probabilities) from several distinct models undistinguished. Thus, `cross_val_predict` is not an appropriate measure of generalization error.
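To make the distinction concrete, here is a small illustrative sketch (synthetic data, hypothetical model choice) contrasting the two approaches: scoring the pooled out-of-fold predictions from `cross_val_predict` versus averaging the per-fold scores from `cross_val_score`. The two numbers are generally close but not identical.

```python
import numpy as np
from sklearn.model_selection import cross_val_score, cross_val_predict, KFold
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic regression data for illustration only.
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 2))
y = X @ np.array([2.0, 1.0]) + rng.normal(scale=1.0, size=60)

cv = KFold(n_splits=5, shuffle=True, random_state=42)
model = LinearRegression()

# Pooled: score all out-of-fold predictions at once.
pooled = r2_score(y, cross_val_predict(model, X, y, cv=cv))

# Averaged: score each fold separately, then take the mean.
averaged = cross_val_score(model, X, y, cv=cv).mean()

print(pooled, averaged)  # similar, but computed over different groupings
```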
A sounder method is to compute the generalization error separately for each fold and then average the per-fold values. Pearson correlations, however, may need special treatment before averaging (e.g. the Fisher z-transformation), since correlations are not additive.
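As a sketch of what that per-fold approach could look like (synthetic data; the Fisher z-transformation via `arctanh`/`tanh` is one common way to average correlations, not necessarily what this project would adopt):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression

# Synthetic regression data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.5, size=100)

# Compute a Pearson correlation per fold instead of pooling predictions.
fold_rs = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    fold_rs.append(np.corrcoef(pred, y[test_idx])[0, 1])

# Average in Fisher z-space, then transform back to a correlation.
mean_r = np.tanh(np.mean(np.arctanh(fold_rs)))
print(mean_r)
```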
Of course, the current method simply follows the original paper, so we will leave this issue open here; it may not be appropriate to implement another measure for now.