'BaseModelOutput' object has no attribute '_OrderedDict__map' when using Wav2Vec 2.0

joeyism commented 2 years ago

System Info

- `transformers` version: 4.18.0
- Platform: Linux-5.4.0-109-generic-x86_64-with-glibc2.27
- Python version: 3.8.8
- Huggingface_hub version: 0.5.1
- PyTorch version (GPU?): 1.10.1+cu102 (False)
- Tensorflow version (GPU?): 2.5.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: no
- Using distributed or parallel set-up in script?: no

Who can help?

@patrickvonplaten, @anton-l

Information

[X] The official example scripts
[ ] My own modified scripts

Tasks

[X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
[ ] My own task or dataset (give details below)

Reproduction

Here is the code

import soundfile as sf
import torch
from datasets import load_dataset
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# load pretrained model
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

librispeech_samples_ds = load_dataset("patrickvonplaten/librispeech_asr_dummy", "clean", split="validation")

# load audio
audio_input, sample_rate = sf.read(librispeech_samples_ds[0]["file"])

# pad input values and return pt tensor
input_values = processor(audio_input, sampling_rate=sample_rate, return_tensors="pt").input_values

# INFERENCE

# retrieve logits & take argmax
logits = model(input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)

# transcribe
transcription = processor.decode(predicted_ids[0])

This snippet is taken from the official documentation.

Expected behavior

The script should run without throwing an error. Instead, I get the following error:

AttributeError: 'BaseModelOutput' object has no attribute '_OrderedDict__map'

joeyism commented 2 years ago

I was able to reproduce this in plain Python, without `transformers`, on Python 3.8.8:

from collections import OrderedDict
from dataclasses import dataclass

class A(OrderedDict):
    def __post_init__(self):
        self["a"] = 1

@dataclass
class B(A):
    some_val = None

b = B()

Here A stands in for ModelOutput and B for BaseModelOutput. Instantiating B throws the same kind of error:

AttributeError: 'B' object has no attribute '_OrderedDict__map'
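
A possible explanation, which is an assumption and not something confirmed in this thread: the `__init__` generated by `@dataclass` does not call the parent class's `__init__`, so any private state the parent builds there is missing by the time `__post_init__` runs. The pure-Python fallback of `OrderedDict` creates its `__map` attribute in `__init__`, which would match the error message. The sketch below shows the same mechanism with a hypothetical stand-in class (`NeedsInit` and `_index` are made up for illustration), so it fails the same way regardless of which `OrderedDict` implementation the interpreter uses.

```python
from dataclasses import dataclass


class NeedsInit(dict):
    """Hypothetical stand-in for a mapping that builds private state in __init__."""

    def __init__(self, *args, **kwargs):
        # Private bookkeeping, loosely analogous to OrderedDict's internal map.
        self._index = {}
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        # Raises AttributeError if __init__ never ran on this instance.
        self._index[key] = True
        super().__setitem__(key, value)

    def __post_init__(self):
        self["a"] = 1


@dataclass
class Child(NeedsInit):
    # No annotation, so this is a plain class attribute, not a dataclass field.
    some_val = None


# The dataclass-generated __init__ calls __post_init__ directly and
# skips NeedsInit.__init__, so _index has not been created yet.
try:
    Child()
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

If this is indeed the mechanism, the failure depends on the parent class needing its own `__init__` to run, which would explain why the same code can work on one interpreter build and fail on another.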

joeyism commented 2 years ago

I updated Python to 3.8.13 and the error went away. Environment after the update:

- `transformers` version: 4.8.2
- Platform: Linux-5.4.0-109-generic-x86_64-with-glibc2.27
- Python version: 3.8.13
- PyTorch version (GPU?): 1.10.1+cu102 (False)
- Tensorflow version (GPU?): 2.5.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: No
- Using distributed or parallel set-up in script?: No
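
Since upgrading the interpreter resolved the error, it may be worth checking which `OrderedDict` implementation a given Python installation is using. The following is a minimal diagnostic sketch and not an official check: it assumes that the C-accelerated `OrderedDict` exposes `__setitem__` as a slot wrapper without a `__code__` attribute, while the pure-Python fallback defines it as an ordinary function. The helper name `uses_pure_python_ordereddict` is hypothetical.

```python
import sys
from collections import OrderedDict


def uses_pure_python_ordereddict() -> bool:
    """Return True if OrderedDict.__setitem__ is an ordinary Python function.

    Python functions carry a __code__ attribute; the slot wrappers of the
    C-accelerated OrderedDict do not.
    """
    return hasattr(OrderedDict.__setitem__, "__code__")


print("Python:", sys.version.split()[0])
print("Pure-Python OrderedDict in use:", uses_pure_python_ordereddict())
```

If this reports that the pure-Python implementation is in use, the interpreter build itself may be the thing to look at, for example by reinstalling or upgrading Python as was done above.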

huggingface / transformers

'BaseModelOutput' object has no attribute '_OrderedDict__map' when using Wav2Vec 2.0 #17144
