ValueError: Predictions and/or references don't match the expected format.

I get this error when computing mean IoU with the Hugging Face example. Flattening the arrays, as this issue suggests, does not solve the problem.

Steps to reproduce:

import numpy as np
import evaluate

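mean_iou = evaluate.load("mean_iou")
# One 3x3 segmentation map each; 255 is the label passed as ignore_index.
predicted = np.array([[2, 2, 3], [8, 2, 4], [3, 255, 2]])
ground_truth = np.array([[1, 2, 2], [8, 2, 1], [3, 255, 1]])
# This call raises the ValueError shown in the traceback.
results = mean_iou.compute(predictions=predicted, references=ground_truth, num_labels=10, ignore_index=255)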

Using evaluate 0.4.1 and numpy 1.26.1.

Full error:

Traceback (most recent call last):
  File "/home/anba/catkin_ws/src/tas_dev/dev/anba/SAM/test.py", line 7, in <module>
    results = mean_iou.compute(predictions=predicted, references=ground_truth, num_labels=10, ignore_index=255)
  File "/home/anba/anaconda3/envs/SAM/lib/python3.10/site-packages/evaluate/module.py", line 450, in compute
    self.add_batch(**inputs)
  File "/home/anba/anaconda3/envs/SAM/lib/python3.10/site-packages/evaluate/module.py", line 541, in add_batch
    raise ValueError(error_msg) from None
ValueError: Predictions and/or references don't match the expected format.
Expected format: {'predictions': Sequence(feature=Sequence(feature=Value(dtype='uint16', id=None), length=-1, id=None), length=-1, id=None), 'references': Sequence(feature=Sequence(feature=Value(dtype='uint16', id=None), length=-1, id=None), length=-1, id=None)},
Input predictions: [  2   2   3   8   2   4   3 255   2],
Input references: [  1   2   2   8   2   1   3 255   1]
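
A possible workaround, based only on reading the expected format above: each example is described as a sequence of sequences (a 2D map), so `compute` probably wants a batch of 2D segmentation maps, not a single 2D array and not a flattened one. A single 2D array would be iterated row by row, so each "example" would be 1D. The sketch below wraps a lone map in a list before the call. The helper names `nesting_depth`, `to_batch` and `compute_mean_iou` are hypothetical and not part of evaluate, and the call has not been verified against evaluate 0.4.1.

```python
def nesting_depth(obj):
    """Return how many sequence levels wrap the scalars (0 for a scalar)."""
    ndim = getattr(obj, "ndim", None)  # numpy arrays expose their rank directly
    if ndim is not None:
        return ndim
    if isinstance(obj, (list, tuple)):
        # Look at the first element only; assumes a regularly shaped input.
        return 1 + (nesting_depth(obj[0]) if obj else 0)
    return 0


def to_batch(seg_map):
    """Wrap a single 2D map in a list; pass a batch of 2D maps through unchanged."""
    depth = nesting_depth(seg_map)
    if depth == 2:
        return [seg_map]
    if depth == 3:
        return seg_map
    raise ValueError(
        f"expected a 2D map or a batch of 2D maps, got nesting depth {depth}"
    )


def compute_mean_iou(predicted, ground_truth, num_labels=10, ignore_index=255):
    """Hypothetical wrapper: batch the inputs, then call the metric."""
    import evaluate  # imported lazily so the helpers above work without it

    metric = evaluate.load("mean_iou")
    return metric.compute(
        predictions=to_batch(predicted),
        references=to_batch(ground_truth),
        num_labels=num_labels,
        ignore_index=ignore_index,
    )
```

With the arrays from the steps to reproduce, `compute_mean_iou(predicted, ground_truth)` would pass `[predicted]` and `[ground_truth]` to the metric. A flattened array has depth 1, so `to_batch` rejects it up front, which matches the observation that flattening does not help.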

Source: huggingface/evaluate, issue #563