huggingface / evaluate

🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
https://huggingface.co/docs/evaluate
Apache License 2.0
2.01k stars 259 forks source link

Cannot use f1/recall/precision arguments in `CombinedEvaluations.compute` #234

Open fcakyon opened 2 years ago

fcakyon commented 2 years ago

This works:

metric=evaluate.load('f1')
metric.compute(references=[0, 1, 0, 1, 0], predictions=[0, 0, 1, 1, 0], average=None)

This won't work:

metric=evaluate.combine(["f1"])
metric.compute(references=[0, 1, 0, 1, 0], predictions=[0, 0, 1, 1, 0], average=None)

Reason: average is not included in f1 score features: https://github.com/huggingface/evaluate/blob/eaf34a7d04e7ab3e6155a046f6d7fda01d9ead84/metrics/f1/f1.py#L112

and CombinedEvaluations.compute ignore if argument is not included in features: https://github.com/huggingface/evaluate/blob/eaf34a7d04e7ab3e6155a046f6d7fda01d9ead84/src/evaluate/module.py#L858

Is this expected or a bug?

fcakyon commented 2 years ago

@lvwerra do you have any opinion on that?

lvwerra commented 2 years ago

Yes, this is a current limitation of combine: you can't pass any settings to compute only the features. Rather than fixing this in combine we aim to enable changing the settings when the metric is loaded in #169. This should be coming in the next few weeks.

m-movahhedinia commented 2 years ago

@lvwerra Do you mind if I ask whether you have an estimate on when this issue will be closed? I reviewed #188 and noticed that it seems to be passing for pip release. Is there a chance we can git it in the next couple of days?

falcaopetri commented 2 years ago

@lvwerra I also need to pass custom kwargs to my metrics. My current workaround is to overwrite CombinedEvaluations .compute in a child class and remove https://github.com/huggingface/evaluate/blob/eaf34a7d04e7ab3e6155a046f6d7fda01d9ead84/src/evaluate/module.py#L858

This works in my specific use case because all my (custom) metrics accept **kwargs in their compute method, which means they will just ignore extra args.

ernestchu commented 1 year ago

Has this been resolved?

lvwerra commented 1 year ago

Not yet. There is an issue with the sync mechanism between the hub and the library which is why we had to roll back #169. Merging it will break pre 0.3.0 installs so we need to wait for sufficient adaptation.