casper-hansen / Nested-Cross-Validation

Nested cross-validation for unbiased predictions. Can be used with Scikit-Learn, XGBoost, Keras and LightGBM, or any other estimator that implements the scikit-learn interface.
MIT License

How does nested-cv compare to native nested CV in sklearn? #13

Open ezeeetm opened 4 years ago

ezeeetm commented 4 years ago

First of all, thank you very much for the work on this library; it's much needed and looks very well maintained.

I'm just wondering: what are the functional differences between using the nested-cv library and doing native nested CV in sklearn (like this: https://scikit-learn.org/stable/auto_examples/model_selection/plot_nested_cross_validation_iris.html)?

Here's the relevant part of that example, for convenience (it runs inside a loop over the trial index `i`):

# Choose cross-validation techniques for the inner and outer loops,
# independently of the dataset.
# E.g "GroupKFold", "LeaveOneOut", "LeaveOneGroupOut", etc.
inner_cv = KFold(n_splits=4, shuffle=True, random_state=i)
outer_cv = KFold(n_splits=4, shuffle=True, random_state=i)

# Non_nested parameter search and scoring
clf = GridSearchCV(estimator=svm, param_grid=p_grid, cv=inner_cv)
clf.fit(X_iris, y_iris)
non_nested_scores[i] = clf.best_score_

# Nested CV with parameter optimization
nested_score = cross_val_score(clf, X=X_iris, y=y_iris, cv=outer_cv)
nested_scores[i] = nested_score.mean()
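
In case it helps anyone reproduce the comparison, here is a minimal self-contained sketch of the excerpt above. The surrounding setup (the `p_grid` values, the `NUM_TRIALS` count, and the loop itself) is my assumption about what the excerpt needs in order to run, not a verbatim copy of the linked example, and none of it uses nested-cv.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)

# Assumed setup: an RBF SVM and a small illustrative hyperparameter grid.
p_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}
svm = SVC(kernel="rbf")

NUM_TRIALS = 3  # assumption: kept small so the sketch runs quickly
non_nested_scores = np.zeros(NUM_TRIALS)
nested_scores = np.zeros(NUM_TRIALS)

for i in range(NUM_TRIALS):
    # Independent splitters for the inner (tuning) and outer (evaluation) loops.
    inner_cv = KFold(n_splits=4, shuffle=True, random_state=i)
    outer_cv = KFold(n_splits=4, shuffle=True, random_state=i)

    # Non-nested: tune and score on the same folds, so the score is optimistic.
    clf = GridSearchCV(estimator=svm, param_grid=p_grid, cv=inner_cv)
    clf.fit(X_iris, y_iris)
    non_nested_scores[i] = clf.best_score_

    # Nested: the whole grid search is refit inside each outer training fold,
    # and scored on an outer test fold it never saw during tuning.
    nested_score = cross_val_score(clf, X=X_iris, y=y_iris, cv=outer_cv)
    nested_scores[i] = nested_score.mean()

print("non-nested mean accuracy:", non_nested_scores.mean())
print("nested mean accuracy:    ", nested_scores.mean())
```

The key point is that `cross_val_score` receives the `GridSearchCV` object itself, so hyperparameter selection is repeated inside every outer fold. What I'd like to understand is what nested-cv does beyond this.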