I think it would be handy to have a classification report in the DEBUG output.
It should be printed only for classification experiments, to help grasp the current experiment quickly, e.g. (after the F1 score):
DEBUG: modelrunner: run: 0 epoch: 0: result: test: 0.649 UAR
DEBUG: modelrunner: plotting confusion matrix to train_dev_test_label_svm_os_scale-standard_0_000_cnf
DEBUG: reporter: Saved confusion plot to ./tmp/results/exp_svd_os_svm/./images/run_0/train_dev_test_label_svm_os_scale-standard_0_000_cnf.png
DEBUG: reporter: Best score at epoch: 0, UAR: .648, (+-.581/.711), ACC: .721
DEBUG: reporter: labels: ['n', 'p']
DEBUG: reporter: auc: 0.649, pauc: 0.513 from epoch: 0
DEBUG: reporter: result per class (F1 score): [0.505, 0.806] from epoch: 0
              precision    recall  f1-score   support
     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3
    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5
DEBUG: experiment: Done, used 3.287 seconds
DONE
Scikit-learn reference: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html
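To make the proposal concrete, here is a minimal sketch of how the report could be written to the DEBUG log. The helper name `log_classification_report` and its parameters (`truths`, `preds`, `labels`, `logger`) are hypothetical and not taken from the existing code base; only `sklearn.metrics.classification_report` is the real scikit-learn call. The sample data is the example from the scikit-learn documentation.

```python
import logging

from sklearn.metrics import classification_report


def log_classification_report(truths, preds, labels, logger):
    """Hypothetical helper: log a per-class report at DEBUG level.

    truths / preds: true and predicted class indices.
    labels: display names, one per class, in index order.
    """
    report = classification_report(
        truths,
        preds,
        target_names=labels,
        zero_division=0,  # report 0.0 instead of warning for classes never predicted
    )
    # The report is a multi-line string, so start it on a fresh line.
    logger.debug("classification report:\n%s", report)
    return report


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, format="%(levelname)s: %(name)s: %(message)s")
    # Example data from the scikit-learn documentation.
    y_true = [0, 1, 2, 2, 2]
    y_pred = [0, 0, 2, 2, 1]
    log_classification_report(
        y_true, y_pred, ["class 0", "class 1", "class 2"], logging.getLogger("reporter")
    )
```

Since the report is only meaningful for classification, the call would presumably be guarded by whatever check the reporter already uses to distinguish classification from regression experiments.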
What do you think, @felixbur?