clamsproject / aapb-evaluations

Collection of evaluation codebases
Apache License 2.0

asr_eval not returning the same number of files #38

Closed jarumihooi closed 3 months ago

jarumihooi commented 10 months ago

Bug Description

The exact cause of this issue is not yet known. While familiarizing myself with this subtask (asr_eval), running the eval code did not return the expected number of files.

Error messages appear to be missing when files are skipped; such messages would help account for some of the missing files. Even with them, some files are still unaccounted for.

Two

Reproduction steps

  1. Download asr_eval, load env for requirements.
  2. Manually download golds from aapb-collaboration/master to a local dir.
  3. Run batch_asr_eval.py on preds@whisper-wrapper-base@aapb-collaboration-21/ (or on the other preds also)

Expected behavior

  1. Expect 16 files to be generated in the results dir; only 10 are produced.
  2. This code could be updated to use from clams_utils.aapb import goldretriever, so that it behaves consistently with other evals such as fa_eval.
  3. System output/error should show clearly when the creation of an eval for a GUID fails, so that the number of output files can be accounted for.
    • (Note: running from the terminal may be what is hiding the errors.)
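
The third point could be approached along the following lines. This is only a minimal sketch, not the code in batch_asr_eval.py: `run_batch`, `summarize`, and the `evaluate_one` callback are hypothetical names introduced here for illustration. The idea is that every GUID ends up with a recorded outcome, so the number of written and skipped files always adds up to the number of inputs.

```python
from collections import Counter


def run_batch(guids, evaluate_one):
    """Run evaluate_one(guid) for every GUID and record why any of them failed.

    evaluate_one is a hypothetical callback standing in for the per-file
    evaluation; it is expected to raise on failure.
    """
    outcomes = {}
    for guid in guids:
        try:
            evaluate_one(guid)
            outcomes[guid] = "ok"
        except Exception as exc:  # keep going, but never skip silently
            outcomes[guid] = f"{type(exc).__name__}: {exc}"
    return outcomes


def summarize(outcomes):
    """Print one line per failure plus totals, and return the tally."""
    tally = Counter("ok" if v == "ok" else "failed" for v in outcomes.values())
    for guid, status in outcomes.items():
        if status != "ok":
            print(f"SKIPPED {guid}: {status}")
    print(f"{tally['ok']} written, {tally['failed']} skipped, {len(outcomes)} total")
    return tally
```

With a summary like this, a run that writes fewer files than expected would at least list each skipped GUID together with the exception that caused it.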

Screenshots

$ python3 batch_asr_eval.py --hyp-dir preds@whisper-wrapper-base@aapb-collaboration-21/ --gold-dir 21/
Processing file:  cpb-aacip-507-154dn40c26.whisper-base.mmif cpb-aacip-507-154dn40c26-transcript.txt
Processing file:  cpb-aacip-507-1v5bc3tf81.whisper-base.mmif cpb-aacip-507-1v5bc3tf81-transcript.txt
Processing file:  cpb-aacip-507-4746q1t25k.whisper-base.mmif cpb-aacip-507-4746q1t25k-transcript.txt
Processing file:  cpb-aacip-507-4t6f18t178.whisper-base.mmif cpb-aacip-507-4t6f18t178-transcript.txt
Processing file:  cpb-aacip-507-6h4cn6zk04.whisper-base.mmif cpb-aacip-507-6h4cn6zk04-transcript.txt
Processing file:  cpb-aacip-507-6w96689725.whisper-base.mmif cpb-aacip-507-6w96689725-transcript.txt
Processing file:  cpb-aacip-507-7659c6sk7z.whisper-base.mmif cpb-aacip-507-7659c6sk7z-transcript.txt
Processing file:  cpb-aacip-507-9882j68s35.whisper-base.mmif cpb-aacip-507-9882j68s35-transcript.txt
Processing file:  cpb-aacip-507-cf9j38m509.whisper-base.mmif cpb-aacip-507-cf9j38m509-transcript.txt
list index out of range
Processing file:  cpb-aacip-507-n29p26qt59.whisper-base.mmif cpb-aacip-507-n29p26qt59-transcript.txt
Processing file:  cpb-aacip-507-nk3610wp6s.whisper-base.mmif cpb-aacip-507-nk3610wp6s-transcript.txt
list index out of range
Processing file:  cpb-aacip-507-pc2t43js98.whisper-base.mmif cpb-aacip-507-pc2t43js98-transcript.txt

This run has only 10 items out of 20. There are 20 items in the preds, and 20 in the golds.

(swtdetector_env) kay@Zeffi:/mnt/c/Users/J/path/aapb-evaluations-run/asr_eval$ diff <(ls 21/ | cut -c 1-24 | sort -u) <(ls /mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/wer_results | cut -c 1-24 | sort -u)
9d8
< cpb-aacip-507-cf9j38m509
11d9
< cpb-aacip-507-nk3610wp6s
13,21d10
< cpb-aacip-507-r785h7cp0z
< cpb-aacip-507-vm42r3pt6h
< cpb-aacip-507-zk55d8pd1h
< cpb-aacip-507-zw18k75z4h
< cpb-aacip-525-028pc2v94s
< cpb-aacip-525-3b5w66b279
< cpb-aacip-525-9g5gb1zh9b
< cpb-aacip-525-bg2h70914g
< list.csv
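
The same comparison can be done in Python, which may be handier where process substitution is not available. This is a rough sketch: `guid_prefixes` and `missing_guids` are hypothetical helper names, and the 24-character width simply mirrors the `cut -c 1-24` used above.

```python
GUID_LEN = 24  # same width as `cut -c 1-24` in the shell command


def guid_prefixes(filenames, width=GUID_LEN):
    """Reduce each filename to its first `width` characters, dropping duplicates."""
    return {name[:width] for name in filenames}


def missing_guids(gold_names, result_names):
    """Return prefixes present among the golds but absent from the results, sorted."""
    return sorted(guid_prefixes(gold_names) - guid_prefixes(result_names))
```

Calling `missing_guids(os.listdir("21"), os.listdir("wer_results"))` (after `import os`) would list the same left-hand entries as the diff. As with the diff, non-GUID names such as list.csv also show up, because the comparison only looks at filename prefixes.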

Additional context

Three kinds of errors together account for the missing files:

  1. No system output shown; a few files are simply missing.
  2. Traceback error for list index out of range: This is caused by the mmif module attempting to process the golds data.
  3. StopIteration error (which only appears in system output after code changes), reported when a file does not have a matching gold; these are the 4 files expected to be missing.

It seems that one of the try/except clauses might be misaligned, so that no error is shown. Now there are 14 correctly output files, 2 list index errors, and 4 expected StopIteration errors.
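
One way to make the StopIteration case explicit is to give `next()` a default value and collect the unmatched files, instead of relying on an exception. The sketch below is not the repository's code: `match_gold` and `pair_files` are hypothetical names, and it assumes the GUID is the part of the hypothesis filename before the first dot, as in cpb-aacip-507-154dn40c26.whisper-base.mmif.

```python
def match_gold(hyp_name, gold_names):
    """Return the gold file whose name starts with the hypothesis GUID, or None."""
    guid = hyp_name.split(".")[0]  # assumption: GUID is everything before the first dot
    # next() with a default returns None instead of raising StopIteration
    return next((g for g in gold_names if g.startswith(guid)), None)


def pair_files(hyp_names, gold_names):
    """Pair each .mmif hypothesis with its gold; collect hypotheses with no gold."""
    pairs, unmatched = [], []
    for hyp in sorted(hyp_names):
        if not hyp.endswith(".mmif"):
            continue  # ignore non-MMIF files such as README.md
        gold = match_gold(hyp, gold_names)
        if gold is None:
            unmatched.append(hyp)
        else:
            pairs.append((hyp, gold))
    return pairs, unmatched
```

The `unmatched` list can then be printed once at the end of the run, so the files without a gold are reported together rather than being lost inside a try/except.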

(swtdetector_env) kay@Zeffi:/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval$ py batch_asr_eval.py --hyp-dir preds@whisper-wrapper-base@aapb-collaboration-21/ --gold-dir 21
[Errno 2] No such file or directory: 'wer_results'
21
21
Processing file:  cpb-aacip-507-154dn40c26.whisper-base.mmif cpb-aacip-507-154dn40c26-transcript.txt
Processing file:  cpb-aacip-507-1v5bc3tf81.whisper-base.mmif cpb-aacip-507-1v5bc3tf81-transcript.txt
Processing file:  cpb-aacip-507-4746q1t25k.whisper-base.mmif cpb-aacip-507-4746q1t25k-transcript.txt
Processing file:  cpb-aacip-507-4t6f18t178.whisper-base.mmif cpb-aacip-507-4t6f18t178-transcript.txt
Processing file:  cpb-aacip-507-6h4cn6zk04.whisper-base.mmif cpb-aacip-507-6h4cn6zk04-transcript.txt
Processing file:  cpb-aacip-507-6w96689725.whisper-base.mmif cpb-aacip-507-6w96689725-transcript.txt
Processing file:  cpb-aacip-507-7659c6sk7z.whisper-base.mmif cpb-aacip-507-7659c6sk7z-transcript.txt
Processing file:  cpb-aacip-507-9882j68s35.whisper-base.mmif cpb-aacip-507-9882j68s35-transcript.txt
Processing file:  cpb-aacip-507-cf9j38m509.whisper-base.mmif cpb-aacip-507-cf9j38m509-transcript.txt
Traceback (most recent call last):
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/batch_asr_eval.py", line 40, in batch_run_wer
    wer_result_exact_case = calculateWer(hyp_file_path, gold_file_path, True)
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/asr_eval.py", line 40, in calculateWer
    hyp = process_text(get_text_from_mmif(hyp_file), not exact_case)
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/asr_eval.py", line 15, in get_text_from_mmif
    annotation_dict = view_dict['annotations'][0]
IndexError: list index out of range
wer_exception list index out of range
Processing file:  cpb-aacip-507-n29p26qt59.whisper-base.mmif cpb-aacip-507-n29p26qt59-transcript.txt
Processing file:  cpb-aacip-507-nk3610wp6s.whisper-base.mmif cpb-aacip-507-nk3610wp6s-transcript.txt
Traceback (most recent call last):
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/batch_asr_eval.py", line 40, in batch_run_wer
    wer_result_exact_case = calculateWer(hyp_file_path, gold_file_path, True)
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/asr_eval.py", line 40, in calculateWer
    hyp = process_text(get_text_from_mmif(hyp_file), not exact_case)
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/asr_eval.py", line 15, in get_text_from_mmif
    annotation_dict = view_dict['annotations'][0]
IndexError: list index out of range
wer_exception list index out of range
Processing file:  cpb-aacip-507-pc2t43js98.whisper-base.mmif cpb-aacip-507-pc2t43js98-transcript.txt
Traceback (most recent call last):
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/batch_asr_eval.py", line 32, in batch_run_wer
    gold_file = next(x for x in gold_files if x.startswith(id))
StopIteration
No matching file found for cpb-aacip-507-pr7mp4wf25.whisper-base.mmif
Processing file:  cpb-aacip-507-r785h7cp0z.whisper-base.mmif cpb-aacip-507-r785h7cp0z-transcript.txt
Traceback (most recent call last):
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/batch_asr_eval.py", line 32, in batch_run_wer
    gold_file = next(x for x in gold_files if x.startswith(id))
StopIteration
No matching file found for cpb-aacip-507-v11vd6pz5w.whisper-base.mmif
Traceback (most recent call last):
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/batch_asr_eval.py", line 32, in batch_run_wer
    gold_file = next(x for x in gold_files if x.startswith(id))
StopIteration
No matching file found for cpb-aacip-507-v40js9j432.whisper-base.mmif
Traceback (most recent call last):
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/batch_asr_eval.py", line 32, in batch_run_wer
    gold_file = next(x for x in gold_files if x.startswith(id))
StopIteration
No matching file found for cpb-aacip-507-vd6nz81n6r.whisper-base.mmif
Processing file:  cpb-aacip-507-vm42r3pt6h.whisper-base.mmif cpb-aacip-507-vm42r3pt6h-transcript.txt
Processing file:  cpb-aacip-507-zk55d8pd1h.whisper-base.mmif cpb-aacip-507-zk55d8pd1h-transcript.txt
Processing file:  cpb-aacip-507-zw18k75z4h.whisper-base.mmif cpb-aacip-507-zw18k75z4h-transcript.txt
Traceback (most recent call last):
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/batch_asr_eval.py", line 32, in batch_run_wer
    gold_file = next(x for x in gold_files if x.startswith(id))
StopIteration
No matching file found for README.md

This is the second error, the one from mmif:

Traceback (most recent call last):
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/batch_asr_eval.py", line 40, in batch_run_wer
    wer_result_exact_case = calculateWer(hyp_file_path, gold_file_path, True)
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/asr_eval.py", line 40, in calculateWer
    hyp = process_text(get_text_from_mmif(hyp_file), not exact_case)
  File "/mnt/c/Users/J/Documents/WORK/Clams/aapb-evaluations-run/asr_eval/asr_eval.py", line 15, in get_text_from_mmif
    annotation_dict = view_dict['annotations'][0]
IndexError: list index out of range
wer_exception list index out of range

1192119703jzx commented 3 months ago

The bug is fixed in the new evaluate.py from the asr_eval_update branch. The IndexError: list index out of range is caused by the fact that certain files in the pred data for whisper base contain only an error message and no annotation content. The cause of the No matching file error is unclear, but it is also fixed in the rewritten evaluate.py.
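
For reference, a guard against views without annotations could look roughly like the sketch below. It is not the code in evaluate.py: `first_annotation` is a hypothetical helper that works on the raw MMIF JSON, and it assumes that a failed view records its problem under an `error` key in the view metadata while leaving `annotations` empty.

```python
import json


def first_annotation(mmif_json):
    """Return (annotation, error) for a MMIF JSON string.

    annotation is the first annotation of the first view that has any;
    if no view has annotations, annotation is None and error carries the
    last view-level error found (or None if there was none).
    """
    data = json.loads(mmif_json)
    error = None
    for view in data.get("views", []):
        annotations = view.get("annotations", [])
        if annotations:
            return annotations[0], None
        # assumption: an error view stores its details under metadata["error"]
        error = view.get("metadata", {}).get("error", error)
    return None, error
```

A caller can then skip the file and print the returned error, instead of failing on `view_dict['annotations'][0]` with an IndexError.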