pygda-team / pygda

PyGDA is a Python library for Graph Domain Adaptation
https://pygda.readthedocs.io/en/stable/
MIT License
17 stars 2 forks

about AUC #3

Open FinchNie opened 3 weeks ago

FinchNie commented 3 weeks ago

Hi! Thanks for your great work.

It seems there may be an error in the AUC calculation. Since logits has shape [n, 2], the score passed to the AUC function should be a single column, either logits[:, 1] or logits[:, 0], depending on which class is treated as positive. torch.max(logits, dim=1)[0] returns the larger of the two values in each row, so the score refers to class 0 for some samples and class 1 for others, which might not be a correct AUC input in this context.

# evaluate the performance
logits, labels = model.predict(target_data)

maxvalue, maxindex = torch.max(logits, dim=1)

preds = logits.argmax(dim=1)

mi_f1 = eval_micro_f1(labels, preds)
ma_f1 = eval_macro_f1(labels, preds)

if args.source in {'DE', 'EN', 'ES', 'FR', 'PT', 'RU'}:
    auc = eval_roc_auc(labels, maxvalue)
else:
    auc = 0.0

cszhangzhen commented 3 weeks ago

Hi!

Thanks for pointing this out.

You are right. This is a typo. It should be maxindex.

Thanks for your feedback.

FinchNie commented 3 weeks ago

Thank you for your reply!

To clarify, the AUC calculation ideally requires a continuous score to assess the model’s performance across various thresholds, rather than using a discrete label output. In this case, logits[:, 1] would serve as the continuous score representing the confidence for the positive class. Therefore, using maxindex (which contains 0 or 1 values) might not provide the most accurate AUC measurement, as it essentially reduces the calculation to a single threshold.
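
The difference can be seen on a small hand-made example. The sketch below is not PyGDA code: `roc_auc` is a hypothetical pure-Python helper that computes AUC as the fraction of (positive, negative) pairs ranked correctly, with ties counting one half, and the logits and labels are made up for illustration. Plain lists stand in for tensors so that it runs without PyTorch.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a correct pair. Hypothetical helper for illustration only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Made-up two-class logits, one row per sample: [class-0 score, class-1 score].
logits = [
    [2.0, -1.0],
    [0.5, 0.2],
    [-0.3, 0.4],
    [-2.0, 3.0],
    [0.1, 0.6],
    [1.5, -0.5],
]
labels = [0, 1, 1, 1, 0, 0]

# Row-wise maximum, analogous to torch.max(logits, dim=1)[0].
max_value = [max(row) for row in logits]
# Row-wise argmax, analogous to maxindex: hard 0/1 predictions.
max_index = [row.index(max(row)) for row in logits]
# Class-1 column, analogous to logits[:, 1]: a continuous score.
class1_score = [row[1] for row in logits]

auc_max_value = roc_auc(labels, max_value)
auc_max_index = roc_auc(labels, max_index)
auc_class1 = roc_auc(labels, class1_score)

print(auc_max_value, auc_max_index, auc_class1)
```

On this toy data the three inputs give three different AUC values: 3/9 for the row maximum, 6/9 for the hard 0/1 predictions, and 7/9 for the class-1 column. The row maximum does worst here because a confident class-0 prediction gets the same kind of high score as a confident class-1 prediction.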

Thank you again for considering this!

cszhangzhen commented 3 weeks ago

Thanks for your reply! You are right. We should use logits[:, 1] instead of maxindex for more accurate evaluation.

Thanks for your suggestion! We will update the scripts later.
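
One detail may be worth checking when making that change. The thread does not say whether `model.predict` returns raw logits or normalised (softmax or log-softmax) outputs, so the following rests on an assumption. If the values are raw logits, ranking samples by logits[:, 1] alone can differ from ranking them by the softmax probability of class 1, which depends on the difference between the two columns. The sketch below uses made-up numbers and hypothetical helper names (`positive_class_probability`, `ranking`). It is not taken from PyGDA.

```python
import math


def positive_class_probability(logits):
    """Softmax probability of class 1 for each [class-0, class-1] logit pair."""
    probs = []
    for l0, l1 in logits:
        # A two-class softmax reduces to a sigmoid of the logit difference l1 - l0.
        probs.append(1.0 / (1.0 + math.exp(l0 - l1)))
    return probs


def ranking(scores):
    """Sample indices sorted from lowest to highest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i])


# Made-up raw logits: row 0 has the larger class-1 logit,
# but row 1 is far more confident in class 1 relative to class 0.
logits = [[5.0, 4.0], [0.0, 3.0]]

raw_class1 = [row[1] for row in logits]           # analogous to logits[:, 1]
prob_class1 = positive_class_probability(logits)  # analogous to softmax(logits, dim=1)[:, 1]

same_order = ranking(raw_class1) == ranking(prob_class1)
print(same_order)
```

Here the two scores order the samples differently, so they could yield different AUC values. If the outputs are already row-wise softmax or log-softmax values, logits[:, 1] is a monotone function of the class-1 probability and both choices give the same AUC.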