ValueError: the ground truth cannot be an empty

FriedaSmith commented 3 years ago

Hello. When I run bash run.sh, I get an error. The details are as follows:

2021-07-14 02:40:06.707607: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
07/14/2021 02:40:10 - WARNING - datasets.builder -   Using custom data configuration default-d1e8e83a11bc3400
07/14/2021 02:40:10 - WARNING - datasets.builder -   Reusing dataset csv (cache/csv/default-d1e8e83a11bc3400/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23)
07/14/2021 02:40:10 - WARNING - datasets.builder -   Using custom data configuration default-6fbd7f8ba1c8fd4d
07/14/2021 02:40:10 - WARNING - datasets.builder -   Reusing dataset csv (cache/csv/default-6fbd7f8ba1c8fd4d/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23)
Some weights of the model checkpoint at facebook/wav2vec2-base were not used when initializing Wav2Vec2ForCTCnCLS: ['quantizer.codevectors', 'project_hid.weight', 'project_q.weight', 'quantizer.weight_proj.bias', 'project_q.bias', 'project_hid.bias', 'quantizer.weight_proj.weight']
- This IS expected if you are initializing Wav2Vec2ForCTCnCLS from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing Wav2Vec2ForCTCnCLS from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of Wav2Vec2ForCTCnCLS were not initialized from the model checkpoint at facebook/wav2vec2-base and are newly initialized: ['lm_head.weight', 'cls_head.weight', 'cls_head.bias', 'lm_head.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
07/14/2021 02:40:13 - WARNING - datasets.arrow_dataset -   Loading cached processed dataset at cache/csv/default-d1e8e83a11bc3400/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23/cache-5daa63f5aff1bbe8.arrow
07/14/2021 02:40:13 - WARNING - datasets.arrow_dataset -   Loading cached processed dataset at cache/csv/default-6fbd7f8ba1c8fd4d/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23/cache-ca7eccd45e2dac1f.arrow
07/14/2021 02:40:13 - INFO - __main__ -   Split sizes: 811 train and 203 validation.
07/14/2021 02:40:13 - WARNING - __main__ -   Updated 0 transcript(s) using 'librispeech' orthography rules.
07/14/2021 02:40:13 - WARNING - datasets.arrow_dataset -   Loading cached processed dataset at cache/csv/default-d1e8e83a11bc3400/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23/cache-3845d802443c01c4.arrow
07/14/2021 02:40:13 - WARNING - datasets.arrow_dataset -   Loading cached processed dataset at cache/csv/default-d1e8e83a11bc3400/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23/cache-f7767c2ed2ffa99b.arrow
07/14/2021 02:40:13 - WARNING - datasets.arrow_dataset -   Loading cached processed dataset at cache/csv/default-d1e8e83a11bc3400/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23/cache-39f3e8cedc987665.arrow
07/14/2021 02:40:13 - WARNING - datasets.arrow_dataset -   Loading cached processed dataset at cache/csv/default-d1e8e83a11bc3400/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23/cache-ae2affcf84a33be1.arrow
07/14/2021 02:40:14 - WARNING - datasets.arrow_dataset -   Loading cached processed dataset at cache/csv/default-6fbd7f8ba1c8fd4d/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23/cache-a27c27042166c13e.arrow
07/14/2021 02:40:14 - WARNING - datasets.arrow_dataset -   Loading cached processed dataset at cache/csv/default-6fbd7f8ba1c8fd4d/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23/cache-eb02d2fe3588b7c0.arrow
07/14/2021 02:40:14 - WARNING - datasets.arrow_dataset -   Loading cached processed dataset at cache/csv/default-6fbd7f8ba1c8fd4d/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23/cache-c0e0087635773178.arrow
07/14/2021 02:40:14 - WARNING - datasets.arrow_dataset -   Loading cached processed dataset at cache/csv/default-6fbd7f8ba1c8fd4d/0.0.0/e138af468cb14e747fb46a19c787ffcfa5170c821476d20d5304287ce12bbc23/cache-a15dda7c9ee9b954.arrow
Using amp fp16 backend
The following columns in the training set  don't have a corresponding argument in `Wav2Vec2ForCTCnCLS.forward` and have been ignored: sampling_rate, speech, text, emotion.
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:481: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  cpuset_checked))
***** Running training *****
  Num examples = 811
  Num Epochs = 100
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 8
  Gradient Accumulation steps = 4
  Total optimization steps = 10100
  0% 0/10100 [00:00<?, ?it/s]/usr/local/lib/python3.7/dist-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values.
To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
/usr/local/lib/python3.7/dist-packages/transformers/trainer.py:1299: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  args.max_grad_norm,
{'loss': 172.4452, 'learning_rate': 9.965346534653466e-06, 'epoch': 0.49}
{'loss': 47.0051, 'learning_rate': 9.915841584158416e-06, 'epoch': 0.99}
  1% 100/10100 [01:39<3:08:18,  1.13s/it]The following columns in the evaluation set  don't have a corresponding argument in `Wav2Vec2ForCTCnCLS.forward` and have been ignored: sampling_rate, speech, text, emotion.
***** Running Evaluation *****
  Num examples = 203
  Batch size = 2
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)
[W pthreadpool-cpp.cc:90] Warning: Leaking Caffe2 thread-pool after fork. (function pthreadpool)

  0% 0/102 [00:00<?, ?it/s]
  2% 2/102 [00:00<00:05, 19.59it/s]
  4% 4/102 [00:00<00:05, 18.20it/s]
  6% 6/102 [00:00<00:05, 17.92it/s]
  8% 8/102 [00:00<00:06, 15.40it/s]
 10% 10/102 [00:00<00:06, 13.99it/s]
 12% 12/102 [00:00<00:07, 12.42it/s]
 14% 14/102 [00:01<00:06, 13.22it/s]
 16% 16/102 [00:01<00:06, 13.66it/s]
 18% 18/102 [00:01<00:06, 14.00it/s]
 20% 20/102 [00:01<00:07, 10.98it/s]
 22% 22/102 [00:02<00:11,  6.75it/s]
 24% 24/102 [00:02<00:10,  7.80it/s]
 25% 26/102 [00:02<00:08,  8.78it/s]
 27% 28/102 [00:02<00:07, 10.01it/s]
 29% 30/102 [00:02<00:06, 11.17it/s]
 31% 32/102 [00:02<00:06, 10.35it/s]
 33% 34/102 [00:03<00:06, 10.04it/s]
 35% 36/102 [00:03<00:07,  8.85it/s]
 37% 38/102 [00:03<00:07,  8.35it/s]
 40% 41/102 [00:03<00:06, 10.02it/s]
 42% 43/102 [00:04<00:05,  9.84it/s]
 44% 45/102 [00:04<00:05, 10.72it/s]
 46% 47/102 [00:04<00:04, 12.17it/s]
 48% 49/102 [00:04<00:05,  9.58it/s]
 50% 51/102 [00:04<00:04, 10.55it/s]
 52% 53/102 [00:04<00:04, 12.10it/s]
 54% 55/102 [00:05<00:04, 10.89it/s]
 56% 57/102 [00:05<00:03, 11.59it/s]
 58% 59/102 [00:05<00:03, 10.83it/s]
 60% 61/102 [00:05<00:03, 11.53it/s]
 62% 63/102 [00:05<00:03, 12.09it/s]
 64% 65/102 [00:06<00:03, 10.13it/s]
 66% 67/102 [00:06<00:03, 11.28it/s]
 68% 69/102 [00:06<00:02, 12.78it/s]
 70% 71/102 [00:06<00:02, 13.98it/s]
 73% 74/102 [00:06<00:01, 15.03it/s]
 75% 76/102 [00:07<00:03,  8.10it/s]
 77% 79/102 [00:07<00:02,  9.27it/s]
 79% 81/102 [00:07<00:01, 10.69it/s]
 81% 83/102 [00:07<00:01, 10.91it/s]
 84% 86/102 [00:07<00:01, 12.45it/s]
 86% 88/102 [00:07<00:01, 12.83it/s]
 88% 90/102 [00:08<00:00, 13.95it/s]
 90% 92/102 [00:08<00:00, 13.91it/s]
 92% 94/102 [00:08<00:01,  5.90it/s]
 94% 96/102 [00:09<00:00,  6.75it/s]
 96% 98/102 [00:09<00:00,  7.81it/s]
 98% 100/102 [00:09<00:00,  8.33it/s]Traceback (most recent call last):
  File "run_emotion.py", line 547, in <module>
    main()
  File "run_emotion.py", line 543, in main
    trainer.train(resume_from_checkpoint=checkpoint)
  File "/usr/local/lib/python3.7/dist-packages/transformers/trainer.py", line 1325, in train
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch)
  File "/usr/local/lib/python3.7/dist-packages/transformers/trainer.py", line 1426, in _maybe_log_save_evaluate
    metrics = self.evaluate()
  File "/usr/local/lib/python3.7/dist-packages/transformers/trainer.py", line 2031, in evaluate
    metric_key_prefix=metric_key_prefix,
  File "/usr/local/lib/python3.7/dist-packages/transformers/trainer.py", line 2260, in evaluation_loop
    metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
  File "run_emotion.py", line 519, in compute_metrics
    wer = wer_metric.compute(predictions=ctc_pred_str, references=ctc_label_str)
  File "/usr/local/lib/python3.7/dist-packages/datasets/metric.py", line 402, in compute
    output = self._compute(predictions=predictions, references=references, **kwargs)
  File "/root/.cache/huggingface/modules/datasets_modules/metrics/wer/d630b0e978819dda4b232fbce9934c6221a04bb2fcea1bfe8e7cb177339b3d86/wer.py", line 103, in _compute
    measures = compute_measures(reference, prediction)
  File "/usr/local/lib/python3.7/dist-packages/jiwer/measures.py", line 188, in compute_measures
    truth, hypothesis, truth_transform, hypothesis_transform
  File "/usr/local/lib/python3.7/dist-packages/jiwer/measures.py", line 244, in _preprocess
    raise ValueError("the ground truth cannot be an empty")
ValueError: the ground truth cannot be an empty

  1% 100/10100 [01:50<3:04:43,  1.11s/it]

TideDancer commented 3 years ago

Hmm... it looks like it is not reading the transcription. Can you share a couple of rows from your training.tsv file so that I can check whether it is compatible with the code? Most likely the code couldn't find a correct way to load the labels.
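One way to check this before training is to scan the csv for rows whose transcript is missing or blank, since jiwer raises this error when a reference string is empty. The sketch below is only an illustration: the `text` and `emotion` column names come from the log above, while the `file` column, the sample rows, and the `find_empty_transcripts` helper are assumptions, not part of the repository.

```python
import csv
import io


def find_empty_transcripts(csv_text, text_column="text"):
    """Return 1-based data-row numbers whose transcript is missing or blank."""
    reader = csv.DictReader(io.StringIO(csv_text))
    empty_rows = []
    for row_number, row in enumerate(reader, start=1):
        transcript = row.get(text_column)
        # A short row yields None; a present-but-blank field yields "" or spaces.
        if transcript is None or not transcript.strip():
            empty_rows.append(row_number)
    return empty_rows


# Hypothetical sample data; the real files may use different columns.
sample = (
    "file,emotion,text\n"
    "a.wav,hap,hello there\n"
    "b.wav,sad,\n"
    "c.wav,ang,   \n"
)
print(find_empty_transcripts(sample))  # [2, 3]
```

Any row reported here would produce an empty ground truth at evaluation time, so it should be fixed or dropped before running the script.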

FriedaSmith commented 3 years ago

The structure of the iemocap folder is as follows: [image: iemocap]

Part of iemocap_01F.train.csv is as follows: [image: iemocap_01F train]

TideDancer commented 3 years ago

It looks like a csv file reading issue. I think you need to add quotes (") around the last column. I can see that some rows have quotes but some do not. Without quotes, any comma in the text is treated as a column separator, so the field is split into two columns. Another thing: I think you can remove all the cc ([BREATH], [LAUGHTER], etc.). I didn't use any of them, and I'm not sure what would happen if you leave them there. Let me know if this works.
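To make the quoting problem concrete, here is a minimal sketch using Python's standard csv module. It shows an unquoted transcript being split at its comma, then writes the same row with every field quoted after stripping bracketed tags. The row contents, the column order, and the `clean_transcript` and `write_row` helpers are assumptions for illustration, not the code used in the repository.

```python
import csv
import io
import re

# An unquoted transcript containing a comma is parsed as an extra column.
unquoted = "a.wav,hap,Well, I guess so\n"
print(len(next(csv.reader(io.StringIO(unquoted)))))  # 4 fields instead of 3


def clean_transcript(text):
    """Drop bracketed tags such as [BREATH] and collapse repeated whitespace."""
    without_tags = re.sub(r"\[[^\]]*\]", " ", text)
    return " ".join(without_tags.split())


def write_row(fields):
    """Serialize one row with every field wrapped in double quotes."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(fields)
    return buffer.getvalue()


line = write_row(["a.wav", "hap", clean_transcript("Well, [LAUGHTER] I guess so")])
print(line, end="")
```

Reading `line` back with `csv.reader` yields three fields again, because the comma now sits inside a quoted field.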

FriedaSmith commented 3 years ago

I removed all the cc ([BREATH], [LAUGHTER], etc.) and added quotes (") around the last column, but I still get the same error.

100% 97/97 [00:08<00:00, 12.06it/s]Traceback (most recent call last):
  File "run_emotion.py", line 547, in <module>
    main()
  File "run_emotion.py", line 543, in main
    trainer.train(resume_from_checkpoint=checkpoint)
  File "/usr/local/lib/python3.7/dist-packages/transformers/trainer.py", line 1325, in train
    self._maybe_log_save_evaluate(tr_loss, model, trial, epoch)
  File "/usr/local/lib/python3.7/dist-packages/transformers/trainer.py", line 1426, in _maybe_log_save_evaluate
    metrics = self.evaluate()
  File "/usr/local/lib/python3.7/dist-packages/transformers/trainer.py", line 2031, in evaluate
    metric_key_prefix=metric_key_prefix,
  File "/usr/local/lib/python3.7/dist-packages/transformers/trainer.py", line 2260, in evaluation_loop
    metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
  File "run_emotion.py", line 519, in compute_metrics
    wer = wer_metric.compute(predictions=ctc_pred_str, references=ctc_label_str)
  File "/usr/local/lib/python3.7/dist-packages/datasets/metric.py", line 402, in compute
    output = self._compute(predictions=predictions, references=references, **kwargs)
  File "/root/.cache/huggingface/modules/datasets_modules/metrics/wer/d630b0e978819dda4b232fbce9934c6221a04bb2fcea1bfe8e7cb177339b3d86/wer.py", line 103, in _compute
    measures = compute_measures(reference, prediction)
  File "/usr/local/lib/python3.7/dist-packages/jiwer/measures.py", line 188, in compute_measures
    truth, hypothesis, truth_transform, hypothesis_transform
  File "/usr/local/lib/python3.7/dist-packages/jiwer/measures.py", line 244, in _preprocess
    raise ValueError("the ground truth cannot be an empty")
ValueError: the ground truth cannot be an empty

  1% 100/9600 [01:50<2:55:30,  1.11s/it]

TideDancer commented 3 years ago

I see. I think I can prepare the csv files and upload them later, after testing, for you to run.

TideDancer commented 3 years ago

I just uploaded the csv files. You need to replace the 'path_to_wavs' string in the files with the actual path where your wavs are stored. For example, if you store the wavs at /wav_path/, just run:

for f in iemocap/*.csv; do sed -i 's/\/path_to_wavs/\/wav_path/' "$f"; done

(use the absolute path here).

I have tested them, and they should run. Let me know if this works.
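For anyone without sed (for example on Windows), the same substitution can be sketched in Python. This is only a rough equivalent under stated assumptions: the `/path_to_wavs` placeholder comes from the comment above, while the `replace_wav_root` helper and its parameters are hypothetical. Like the sed command without the g flag, it replaces only the first occurrence on each line.

```python
from pathlib import Path


def replace_wav_root(csv_dir, new_root, placeholder="/path_to_wavs"):
    """Rewrite every *.csv in csv_dir, swapping placeholder for new_root.

    Returns the names of the files that were actually modified.
    """
    changed = []
    for csv_path in sorted(Path(csv_dir).glob("*.csv")):
        original = csv_path.read_text(encoding="utf-8")
        lines = original.splitlines(keepends=True)
        # Replace only the first match per line, mirroring sed without /g.
        updated = "".join(line.replace(placeholder, new_root, 1) for line in lines)
        if updated != original:
            csv_path.write_text(updated, encoding="utf-8")
            changed.append(csv_path.name)
    return changed
```

Calling `replace_wav_root("iemocap", "/wav_path")` would then update the files in place; as with the sed command, pass an absolute path.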

FriedaSmith commented 3 years ago

The .csv files you uploaded work fine. Thank you very much for your help.

Coding511 commented 2 years ago

How are these csv files created? What is the datasets module in the code?

import datasets

Could anyone help me with this?

TideDancer commented 1 year ago

> How are these csv files created? What is the datasets module in the code?
>
> import datasets
>
> Could anyone help me with this?

@Coding511, the csv files are generated by parsing the IEMOCAP dataset. Once you obtain it, it should be easy to write a script that generates csv files like mine, or you can just use mine.

The datasets package is Hugging Face Datasets; see https://huggingface.co/docs/datasets/index.

TideDancer / interspeech21_emotion

ValueError: the ground truth cannot be an empty #2