How does ROCAUC work in score_array()?

Seems like there's something wrong with score_array() in the classification case.

https://github.com/materialsproject/matbench/blob/c3b910e4f06b79eea1a8a6c7b67ea5a605948306/matbench/data_ops.py#L83-L123

accuracy comes before rocauc in CLF_METRICS:

CLF_METRICS = ["accuracy", "balanced_accuracy", "f1", "rocauc"]

That means that when accuracy is scored first, this branch converts float predictions to labels:

# Other clf metrics always be converted to labels
elif metric in CLF_METRICS:
    if isinstance(pred_array[0], float):
        pred_array = homogenize_clf_array(pred_array, to_labels=True)

in which case, once rocauc is reached afterwards, the float check in

if metric == "rocauc":
    # Both arrays must be in probability form
    # if pred. array is given in probabilities
    if isinstance(pred_array[0], float):
        true_array = homogenize_clf_array(true_array, to_probs=True)

will never be true, so you'd be computing ROCAUC from true labels vs. predicted labels instead of predicted probabilities? Maybe I'm missing something?
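
Here is a minimal, self-contained sketch of what I suspect is happening. It does not use matbench itself: `to_labels`, `roc_auc`, and `rocauc_input` are hypothetical stand-ins I wrote for illustration, the 0.5 threshold is an assumption, and the loop only mimics the two branches quoted above (assuming `pred_array` is reassigned inside the loop over metrics, as the snippet suggests).

```python
CLF_METRICS = ["accuracy", "balanced_accuracy", "f1", "rocauc"]


def to_labels(probs, threshold=0.5):
    """Hypothetical stand-in for homogenize_clf_array(..., to_labels=True)."""
    return [p >= threshold for p in probs]


def roc_auc(y_true, y_score):
    """Pairwise ROC AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, y_score) if t]
    neg = [s for t, s in zip(y_true, y_score) if not t]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def rocauc_input(pred_array, metrics):
    """Mimic the quoted branches and return what the rocauc branch receives."""
    for metric in metrics:
        if metric == "rocauc":
            return pred_array
        elif metric in CLF_METRICS:
            # Same check as in the quoted code: floats are turned into labels,
            # and the converted array is carried over to the next metric.
            if isinstance(pred_array[0], float):
                pred_array = to_labels(pred_array)
    return None


y_true = [True, True, False, False]
y_prob = [0.9, 0.6, 0.55, 0.1]

seen_by_rocauc = rocauc_input(y_prob, CLF_METRICS)
auc_from_probs = roc_auc(y_true, y_prob)
auc_from_labels = roc_auc(y_true, seen_by_rocauc)

print(seen_by_rocauc)   # labels, no longer floats
print(auc_from_probs)   # AUC on the original probabilities
print(auc_from_labels)  # AUC on the thresholded labels
```

In this toy case the probabilities rank every positive above every negative, but after thresholding one negative ties with the positives, so the two AUC values differ. If the real loop behaves like this sketch, scoring rocauc on its own (or before the other metrics) would give a different number than scoring it after accuracy.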
