Error on running run_emotion.py

bagustris commented 2 months ago

Thanks @TideDancer for this great project,

When I ran run_emotion.py, after extracting all the wav files in IEMOCAP and replacing the paths in the CSV files, I got the following error.

...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5024/5024 [33:16<00:00,  2.52ex/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 507/507 [02:57<00:00,  2.86ex/s]
05/20/2024 16:04:03 - WARNING - __main__ -   Updated 5533 transcript(s) using 'librispeech' orthography rules.
Traceback (most recent call last):
  File "/home/bagus/miniconda3/envs/interspeech_2021/lib/python3.8/site-packages/transformers/feature_extraction_utils.py", line 158, in convert_to_tensors
    tensor = as_tensor(value)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_emotion.py", line 648, in <module>
    main()
  File "run_emotion.py", line 530, in main
    train_dataset = train_dataset.map(
  File "/home/bagus/miniconda3/envs/interspeech_2021/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 1289, in map
    update_data = does_function_return_dict(test_inputs, test_indices)
  File "/home/bagus/miniconda3/envs/interspeech_2021/lib/python3.8/site-packages/datasets/arrow_dataset.py", line 1260, in does_function_return_dict
    function(*fn_args, indices, **fn_kwargs) if with_indices else function(*fn_args, **fn_kwargs)
  File "run_emotion.py", line 515, in prepare_dataset
    batch["input_values"] = processor(
  File "/home/bagus/miniconda3/envs/interspeech_2021/lib/python3.8/site-packages/transformers/models/wav2vec2/processing_wav2vec2.py", line 117, in __call__
    return self.current_processor(*args, **kwargs)
  File "/home/bagus/miniconda3/envs/interspeech_2021/lib/python3.8/site-packages/transformers/models/wav2vec2/feature_extraction_wav2vec2.py", line 186, in __call__
    padded_inputs = self.pad(
  File "/home/bagus/miniconda3/envs/interspeech_2021/lib/python3.8/site-packages/transformers/feature_extraction_sequence_utils.py", line 225, in pad
    return BatchFeature(batch_outputs, tensor_type=return_tensors)
  File "/home/bagus/miniconda3/envs/interspeech_2021/lib/python3.8/site-packages/transformers/feature_extraction_utils.py", line 73, in __init__
    self.convert_to_tensors(tensor_type=tensor_type)
  File "/home/bagus/miniconda3/envs/interspeech_2021/lib/python3.8/site-packages/transformers/feature_extraction_utils.py", line 164, in convert_to_tensors
    raise ValueError(
ValueError: Unable to create tensor, you should probably activate padding with 'padding=True' to have batched tensors with the same length.

Do you have any clue about this? I suspect the error is caused by my versions of transformers and datasets differing from yours.
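For reference, the final error message says the batch could not be converted to a tensor because its sequences differ in length, and the detected shape `(2,)` in the first error suggests a batch of two sequences. Below is a minimal pure-Python sketch of what padding to a common length means. The helper names `is_rectangular` and `pad_batch` are hypothetical and are not part of transformers or run_emotion.py; this only illustrates the idea behind `padding=True` and is not a confirmed fix for this issue.

```python
def is_rectangular(batch):
    """Return True when every sequence in the batch has the same length."""
    return len({len(seq) for seq in batch}) <= 1


def pad_batch(batch, pad_value=0.0):
    """Right-pad every sequence with pad_value up to the longest length."""
    max_len = max((len(seq) for seq in batch), default=0)
    return [list(seq) + [pad_value] * (max_len - len(seq)) for seq in batch]


# Two made-up "waveforms" of different lengths (illustrative values only).
ragged = [[0.1, 0.2, 0.3], [0.4, 0.5]]
print(is_rectangular(ragged))   # False: lengths 3 and 2 cannot form one tensor

padded = pad_batch(ragged)
print(is_rectangular(padded))   # True: both rows now have length 3
```

A ragged batch like `ragged` is the kind of input that cannot be stacked into a single array, whereas `padded` can.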

TideDancer commented 1 month ago

Very likely. Can you try transformers==4.4.2, or an older version?

bagustris commented 1 month ago

@TideDancer

I installed the required packages per requirements.txt, and that setup gives the error above. I haven't tried older versions, though.

(interspeech_2021) pc060066:interspeech21_emotion(main)$ pip list | grep -E 'transformers|datasets'
datasets                          1.4.1
transformers                      4.4.2
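The same version check can be done from inside the Python environment, which avoids any doubt about which `pip` is on the path. The sketch below uses the standard-library `importlib.metadata` (available since Python 3.8). The helper name `installed_versions` is hypothetical, and including `numpy` and `torch` in the list is an assumption: the first ValueError in the traceback uses NumPy's wording, so its version may be worth recording too, but the thread does not confirm that it is the cause.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# The package list is illustrative; adjust it to match requirements.txt.
for name, version in installed_versions(
    ["transformers", "datasets", "numpy", "torch"]
).items():
    print(f"{name:<15} {version}")
```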

TideDancer / interspeech21_emotion

Error on running run_emotion.py #19