DCGM / lm-evaluation-harness

A framework for few-shot evaluation of language models.
https://www.eleuther.ai
MIT License

MCAuroc is NaN? #3

Closed MFajcik closed 6 months ago

MFajcik commented 6 months ago

For instance:

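| Tasks | Version | Filter | n-shot | Metric | Value | | Stderr |
|---|---|---|---|---|---|---|---|
| - prompt-0 | 1 | none | 10 | acc | 0.40130000000000000 | ± | 0.0399 |
| | | none | 10 | macro_f1 | 0.16588263245426468 | ± | N/A |
| | | none | 10 | macro_f1_ci | 0.1639/0.2949 | ± | N/A |
| | | none | 10 | avg_mcauroc | NaN | ± | N/A |
| | | none | 10 | avg_mcauroc_ci | 0.2921/0.7447 | ± | N/A |

MFajcik commented 6 months ago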

The NaN was caused by some classes not being predicted at all, which left their per-class AUC undefined and propagated NaN into the average. The issue was "fixed" by setting the AUC of such classes to 0.
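
To illustrate how one undefined per-class AUC turns the whole average into NaN, and what the workaround does, here is a minimal pure-Python sketch. It is not the harness's actual code: `binary_auc`, `avg_mcauroc`, and the `fill` parameter are hypothetical names, and the sketch assumes the metric is a macro average of one-vs-rest AUCs. It also models the degenerate case as a class with no positive examples in the evaluated sample, which is an assumption about where the NaN comes from.

```python
import math

def binary_auc(labels, scores):
    """Pairwise (Mann-Whitney) AUC; NaN when positives or negatives are missing."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return math.nan  # AUC is undefined for a degenerate class
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def avg_mcauroc(gold, probs, num_classes, fill=None):
    """Macro average of one-vs-rest AUCs (hypothetical helper).

    If `fill` is given, undefined per-class AUCs are replaced by it,
    mirroring the workaround described in this thread (fill=0.0).
    """
    aucs = []
    for c in range(num_classes):
        labels = [1 if g == c else 0 for g in gold]
        scores = [p[c] for p in probs]
        auc = binary_auc(labels, scores)
        if math.isnan(auc) and fill is not None:
            auc = fill
        aucs.append(auc)
    return sum(aucs) / len(aucs)

# Toy data (made up for illustration): class 2 never occurs.
gold = [0, 1, 0, 1]
probs = [
    [0.7, 0.2, 0.1],
    [0.3, 0.6, 0.1],
    [0.6, 0.3, 0.1],
    [0.4, 0.5, 0.1],
]

naive = avg_mcauroc(gold, probs, num_classes=3)              # NaN propagates
patched = avg_mcauroc(gold, probs, num_classes=3, fill=0.0)  # degenerate class counts as 0
print(naive, patched)
```

Filling with 0 keeps the average finite but pulls it down for every degenerate class, which may be why the fix is described in quotes above.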