flairNLP / flair

A very simple framework for state-of-the-art Natural Language Processing (NLP)
https://flairnlp.github.io/flair/

[Bug]: Support metric is reported incorrectly for multi-class classifier model evaluation #3509

Closed · MattGPT-ai closed 1 month ago

MattGPT-ai commented 4 months ago

Describe the bug

When a classifier model is evaluated with evaluate, the returned classification report fills the support field of the 'micro avg' entry with the accuracy value instead of the sample count whenever the model is not a multi-label classifier.

To Reproduce

import flair

# `trainer`, `test_set`, and `label_type` come from an already trained
# single-label classification setup (not shown here).
result = trainer.model.evaluate(test_set, gold_label_type=label_type)

# Prints the accuracy instead of the integer sample count.
print(result.classification_report['micro avg']['support'])

Expected behavior

The support field should report the integer number of evaluated samples.
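
A minimal sketch of the expected check, assuming the `result` and `test_set` objects from the "To Reproduce" snippet above (both come from the reporter's setup and are not defined in this issue):

# `result` and `test_set` are the objects from the reproduction snippet.
micro_support = result.classification_report['micro avg']['support']

# For a single-label classifier, the micro-average support should equal the
# number of evaluated samples; with this bug it holds the accuracy instead.
assert micro_support == len(test_set)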

Logs and Stack traces

No response

Screenshots

No response

Additional Context

No response

Environment

Versions:

Flair: 0.13.1
PyTorch: 2.3.1+cu121
Transformers: 4.31.0
GPU: True

fkdosilovic commented 1 month ago

@alanakbik Looks like this bug can be closed.