In nested_cv.py, nested_cv_param_search(): add the train and test indices to cols and results_dict:
cols = [
"param_search",
"cv_results",
"best_estimator",
"conf_matrix",
"predicted_labels",
"true_labels",
"train_indices", #add this line
"test_indices" #add this line
]
results_dict["train_indices"].append(train) #add this line
results_dict["test_indices"].append(test) #add this line
results_dict["predicted_labels"].append(cv_obj.predict(x_test))
results_dict["true_labels"].append(y_test)
results_dict["cv_results"].append(cv_obj.cv_results_)
results_dict["best_estimator"].append(cv_obj.best_estimator_)
results_dict["conf_matrix"].append(confusion_matrix(y_test, cv_obj.predict(x_test), normalize=None))
In sklearn_pipeline_permuter.py, metric_summary(): get the test indices and add them to df_metric:
for param_key, param_value in self.param_searches.items():
...
test_indices = np.array(param_value["test_indices"], dtype="object").ravel()
...
df_metric["test_indices"] = [test_indices]
for key in param_values:
    if "test" in key:
        if "test_indices" in key:
            continue
Optional: find a more elegant way to exclude test_indices from the metric calculation ;)
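One possible cleaner approach (a sketch only, with a hypothetical `param_values` dict standing in for the real one): filter the metric keys up front with a comprehension instead of branching inside the loop.

```python
# Hypothetical stand-in for the real param_values mapping
param_values = {
    "mean_test_accuracy": [0.9, 0.8],
    "mean_test_f1": [0.85, 0.8],
    "test_indices": [[0, 1], [2, 3]],
    "fit_time": [0.1, 0.2],
}

# Select metric keys once, excluding test_indices, then loop over them
metric_keys = [
    key for key in param_values
    if "test" in key and "test_indices" not in key
]
```

The exclusion logic then lives in one place, and the loop body can deal purely with metric aggregation.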