Adds query_confusion_matrix to metrics.py. This function returns TP, FP, FN, TN as a struct. If all_metrics is set to True, it will return all 25 confusion matrix metrics as a struct.
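For context, here is a minimal pure-Python sketch of the TP/FP/FN/TN tally this function performs (the helper name and dict return are illustrative only, not the actual implementation):

```python
# Illustrative sketch: binarize scores at a threshold, then tally the
# four confusion-matrix cells. The real query_confusion_matrix returns
# these as a struct column rather than a Python dict.
def confusion_matrix(actual, score, threshold):
    pred = [s >= threshold for s in score]
    tp = sum(a and p for a, p in zip(actual, pred))          # actual positive, predicted positive
    fp = sum((not a) and p for a, p in zip(actual, pred))    # actual negative, predicted positive
    fn = sum(a and (not p) for a, p in zip(actual, pred))    # actual positive, predicted negative
    tn = sum((not a) and (not p) for a, p in zip(actual, pred))  # actual negative, predicted negative
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}

# confusion_matrix([True, False, True, False], [0.9, 0.8, 0.3, 0.1], 0.5)
# → {"tp": 1, "fp": 1, "fn": 1, "tn": 1}
```

The remaining metrics (TPR, FPR, precision, and so on) are all ratios derived from these four counts, which is why all_metrics can be computed from the same pass.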
Some notes:
I created a new tests file, test_metrics.py. I think it might be nice to factor some of the tests in test_correctness out into separate files. That's just a personal preference though; let me know if I should just add the tests to test_correctness.
I am not too sure what the best names for all of the metrics are. Let me know if you want any of them changed.
I think it makes the most sense to accept two boolean series instead of a score and a threshold. I went with score + threshold for now to mimic query_binary_metrics. I also thought it might be confusing to have the pred argument refer to either a score or a predicted label in different functions.