mad-lab-fau / BioPsyKit

A Python package for the analysis of biopsychological data.
MIT License

Classification - add test-indices to summary #22

Closed: katharina-j-fau closed this issue 2 years ago

katharina-j-fau commented 2 years ago
1. In `nested_cv.py`, `nested_cv_param_search()`: add the train and test indices to `cols` and `results_dict`:

```python
cols = [
    "param_search",
    "cv_results",
    "best_estimator",
    "conf_matrix",
    "predicted_labels",
    "true_labels",
    "train_indices",  # add this line
    "test_indices",   # add this line
]
```

```python
results_dict["train_indices"].append(train)  # add this line
results_dict["test_indices"].append(test)    # add this line
results_dict["predicted_labels"].append(cv_obj.predict(x_test))
results_dict["true_labels"].append(y_test)
results_dict["cv_results"].append(cv_obj.cv_results_)
results_dict["best_estimator"].append(cv_obj.best_estimator_)
results_dict["conf_matrix"].append(confusion_matrix(y_test, cv_obj.predict(x_test), normalize=None))
```
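To illustrate the data flow of step 1, here is a minimal, self-contained sketch (plain Python, no scikit-learn; the fold splitting is a made-up stand-in for the real nested CV loop) of how the extended `results_dict` accumulates one train/test index pair per outer fold:

```python
# Hypothetical sketch: results_dict gains "train_indices"/"test_indices"
# entries that grow by one list per outer CV fold.
cols = [
    "param_search", "cv_results", "best_estimator", "conf_matrix",
    "predicted_labels", "true_labels",
    "train_indices",  # new entry
    "test_indices",   # new entry
]
results_dict = {key: [] for key in cols}

samples = list(range(10))          # stand-in sample indices
n_folds = 5
fold_size = len(samples) // n_folds
for fold in range(n_folds):
    # contiguous split as a stand-in for the real CV splitter
    test = samples[fold * fold_size:(fold + 1) * fold_size]
    train = [i for i in samples if i not in test]
    results_dict["train_indices"].append(train)
    results_dict["test_indices"].append(test)

print(len(results_dict["test_indices"]))  # one entry per fold
```

Storing the indices fold-by-fold keeps them aligned with the existing per-fold entries (`predicted_labels`, `true_labels`, `conf_matrix`).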
2. In `sklearn_pipeline_permuter.py`, `metric_summary()`: get the test indices and add them to `df_metric`:

```python
for param_key, param_value in self.param_searches.items():
    ...
    test_indices = np.array(param_value["test_indices"], dtype="object").ravel()
    ...
    df_metric["test_indices"] = [test_indices]

    for key in param_values:
        if "test" in key:
            if "test_indices" in key:
                continue
```
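The intent of the `np.ravel(...)` step above is to turn the per-fold test indices into one flat sequence before storing it in `df_metric`. A plain-Python sketch of the same idea (the fold lists here are made up for illustration):

```python
# Hypothetical sketch: flatten fold-wise test indices into one flat list,
# mirroring the intent of np.array(..., dtype="object").ravel().
fold_test_indices = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]  # one list per outer fold
test_indices = [idx for fold in fold_test_indices for idx in fold]
print(test_indices)  # flat list of all held-out sample indices
```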
3. Optional: find a more elegant way to exclude `test_indices` from the metric calculation ;)
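One possible direction for step 3 (a sketch only, with made-up `param_values` content): maintain an explicit set of non-metric columns and filter against it, instead of the nested `"test"` / `"test_indices"` string checks:

```python
# Hypothetical sketch: a name set of non-metric columns replaces the
# nested substring checks when selecting metric keys.
NON_METRIC_COLS = {"test_indices", "train_indices"}

param_values = {
    "mean_test_accuracy": [0.80, 0.90],
    "std_test_accuracy": [0.05, 0.04],
    "test_indices": [[0, 1], [2, 3]],
    "fit_time": [0.10, 0.20],
}
metric_keys = [k for k in param_values if "test" in k and k not in NON_METRIC_COLS]
print(metric_keys)  # only the actual test metrics survive the filter
```

This keeps the exclusion list in one place, so adding further bookkeeping columns later only requires extending `NON_METRIC_COLS`.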
richrobe commented 2 years ago

Added in cc3cd86a454fe62a4374990eebc046561db3e983