ThilinaRajapakse / simpletransformers

Transformers for Information Retrieval, Text Classification, NER, QA, Language Modelling, Language Generation, T5, Multi-Modal, and Conversational AI
https://simpletransformers.ai/
Apache License 2.0

What's the difference between metric "auroc" and sklearn.metrics.roc_auc_score? #1239

Closed hjzhang1018 closed 2 years ago

hjzhang1018 commented 3 years ago

I used ClassificationModel to train on my dataset. However, when I called model.eval_model() on my test dataset, I got different scores for the built-in "auroc" metric and for sklearn.metrics.roc_auc_score. My code looks like this:

model = ClassificationModel()
model.eval_model(
    eval_df=test_df,
    output_dir=model_dir,
    roc_auc=sklearn.metrics.roc_auc_score,
)

I got results like this:

({'auprc': 0.5154601291728336, 'auroc': 0.8368392799265343, 'eval_loss': 4.512793393797144, 'fn': 20, 'fp': 3379, 'roc_auc': 0.6193144987035437, 'tn': 1249, 'tp': 620})

So I wonder what the difference is between "auroc" and "roc_auc". The former seems to be a default metric, but I couldn't find any documentation for it. I'm also not clear about "auprc": does it correspond to sklearn.metrics.precision_recall_curve? Can anyone explain this to me? I appreciate your help!

mar-volk commented 2 years ago

Hi! I was wondering about the same issue and I found what makes the difference. The built-in auroc value is calculated from the floating-point scores, which is the right way to compute an ROC AUC score. In contrast, your roc_auc value is calculated from the binary (thresholded) output of the classifier.
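To see the effect concretely, here is a minimal standalone sketch (the labels and scores are made up for illustration, not taken from this issue) comparing roc_auc_score on probability scores versus on hard 0/1 predictions:

import numpy as np
from sklearn.metrics import roc_auc_score

labels = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.6, 0.4, 0.9])  # positive-class probabilities
preds = (scores >= 0.5).astype(int)      # hard 0/1 predictions after thresholding at 0.5

print(roc_auc_score(labels, scores))  # 0.75 -- uses the full ranking of the scores
print(roc_auc_score(labels, preds))   # 0.5  -- the ranking information is lost by thresholding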

If you want to calculate the ROC AUC score via an extra metric, you need to pass it under a keyword argument whose name starts with "prob_". Then the complete model output is passed to your function. The following should work:

import sklearn.metrics
from scipy.special import softmax

def custom_roc_auc_score(labels, model_outputs):
    # Convert the raw model outputs to positive-class probabilities before scoring
    return sklearn.metrics.roc_auc_score(labels, softmax(model_outputs, axis=1)[:, 1])

model.eval_model(
    eval_df,
    prob_roc_auc=custom_roc_auc_score,
)
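As for auprc, it is the area under the precision-recall curve rather than the curve itself; sklearn.metrics.precision_recall_curve only returns the curve points, so you still need sklearn.metrics.auc to reduce them to one number. A minimal sketch of an analogous custom metric, assuming a binary task and using the same "prob_" mechanism as above (the name prob_auprc is just an example, and this may not match the library's built-in auprc exactly):

import sklearn.metrics
from scipy.special import softmax

def custom_auprc(labels, model_outputs):
    # Positive-class probabilities from the raw model outputs
    probs = softmax(model_outputs, axis=1)[:, 1]
    precision, recall, _ = sklearn.metrics.precision_recall_curve(labels, probs)
    return sklearn.metrics.auc(recall, precision)

model.eval_model(
    eval_df,
    prob_auprc=custom_auprc,  # "prob_" prefix so the full model outputs are passed in
)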
stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.