huggingface / evaluate

🤗 Evaluate: A library for easily evaluating machine learning models and datasets.
https://huggingface.co/docs/evaluate
Apache License 2.0

ValueError: Predictions and/or references don't match the expected format. #563

Open antopost opened 4 months ago

antopost commented 4 months ago

I get this error when computing mean IoU with the Hugging Face example. Flattening the arrays, as this issue suggests, does not solve the problem.

Steps to reproduce:

import numpy as np
import evaluate

mean_iou = evaluate.load("mean_iou")
predicted = np.array([[2, 2, 3], [8, 2, 4], [3, 255, 2]])
ground_truth = np.array([[1, 2, 2], [8, 2, 1], [3, 255, 1]])
results = mean_iou.compute(predictions=predicted, references=ground_truth, num_labels=10, ignore_index=255)

Using evaluate 0.4.1 and numpy 1.26.1.

Full error:

Traceback (most recent call last):
  File "/home/anba/catkin_ws/src/tas_dev/dev/anba/SAM/test.py", line 7, in <module>
    results = mean_iou.compute(predictions=predicted, references=ground_truth, num_labels=10, ignore_index=255)
  File "/home/anba/anaconda3/envs/SAM/lib/python3.10/site-packages/evaluate/module.py", line 450, in compute
    self.add_batch(**inputs)
  File "/home/anba/anaconda3/envs/SAM/lib/python3.10/site-packages/evaluate/module.py", line 541, in add_batch
    raise ValueError(error_msg) from None
ValueError: Predictions and/or references don't match the expected format.
Expected format: {'predictions': Sequence(feature=Sequence(feature=Value(dtype='uint16', id=None), length=-1, id=None), length=-1, id=None), 'references': Sequence(feature=Sequence(feature=Value(dtype='uint16', id=None), length=-1, id=None), length=-1, id=None)},
Input predictions: [  2   2   3   8   2   4   3 255   2],
Input references: [  1   2   2   8   2   1   3 255   1]

antopost commented 4 months ago

Fixed it. The error message gives some clues: the NumPy arrays must have dtype uint16, and they must be passed inside a list. That said, I don't understand why this is required, and the example code in the docs should work as written.

Here's the fixed example:

import numpy as np
import evaluate

mean_iou = evaluate.load("mean_iou")
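# Cast the segmentation maps to uint16, the dtype named in the error message.
predicted = np.array([[2, 2, 3], [8, 2, 4], [3, 255, 2]], dtype=np.uint16)
ground_truth = np.array([[1, 2, 2], [8, 2, 1], [3, 255, 1]], dtype=np.uint16)
# Wrap each 2-D map in a list so it is passed as a batch containing one image.
results = mean_iou.compute(predictions=[predicted], references=[ground_truth], num_labels=10, ignore_index=255)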
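
For anyone who hits this with their own data, the two adjustments (uint16 dtype, list of 2-D maps) can be wrapped in a small helper. This is only a sketch: `as_mean_iou_batch` is a hypothetical name, not part of evaluate, and it assumes that a list of 2-D uint16 arrays is what the metric wants, which is inferred from the "Expected format" line in the error message rather than from the library's source.

```python
import numpy as np


def as_mean_iou_batch(seg_maps):
    """Hypothetical helper: return a list of 2-D uint16 segmentation maps.

    Accepts either a single 2-D array or an iterable of 2-D arrays.
    """
    # A single 2-D map becomes a batch of one, mirroring the fix above.
    if isinstance(seg_maps, np.ndarray) and seg_maps.ndim == 2:
        seg_maps = [seg_maps]

    batch = []
    for seg_map in seg_maps:
        seg_map = np.asarray(seg_map)
        if seg_map.ndim != 2:
            raise ValueError(f"expected 2-D maps, got shape {seg_map.shape}")
        # Refuse values that would silently wrap around when cast to uint16.
        if seg_map.min() < 0 or seg_map.max() > np.iinfo(np.uint16).max:
            raise ValueError("label values do not fit in uint16")
        batch.append(seg_map.astype(np.uint16))
    return batch


predictions = as_mean_iou_batch(np.array([[2, 2, 3], [8, 2, 4], [3, 255, 2]]))
references = as_mean_iou_batch(np.array([[1, 2, 2], [8, 2, 1], [3, 255, 1]]))

# These can then be passed exactly as in the fixed example:
# results = mean_iou.compute(predictions=predictions, references=references,
#                            num_labels=10, ignore_index=255)
```

The range check is a design choice of this sketch: NumPy's `astype` does not raise on out-of-range integers, so checking first avoids label ids being changed without notice.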