asappresearch / slue-toolkit

A toolkit for the Spoken Language Understanding Evaluation (SLUE) benchmark. Refer to the paper https://arxiv.org/abs/2111.10367 for more details. Official website: https://asappresearch.github.io/slue-toolkit/
MIT License

Clean up and fix E2E NER baselines #12

Closed by fwu-asapp 2 years ago

fwu-asapp commented 2 years ago
sshon-asapp commented 2 years ago

Everything looks good to me. If @ankitapasad agrees, I can merge.

ankitapasad commented 2 years ago

I have added a few comments; the rest all looks good to me. Thanks for cleaning it up!

@fwu-asapp Since you have added create_dict, we can also delete slue_toolkit/label_map_files/ner.dict.ltr.txt, right? In that case the whole directory slue_toolkit/label_map_files can go as well, if we are doing away with the other pkl files.

fwu-asapp commented 2 years ago

Thanks, I've confirmed that the generated dictionary is the same as the one there, so I removed it.
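
A check like this can be sketched in a few lines. This is only an illustration, not the comparison that was actually run: the helper names `read_entries` and `same_dictionary` are hypothetical, and it assumes both dictionaries are plain-text files with one entry per line.

```python
from pathlib import Path


def read_entries(path):
    """Return the non-empty lines of a dictionary file, stripped of whitespace."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def same_dictionary(generated_path, reference_path):
    """Return True if both files list the same entries in the same order."""
    return read_entries(generated_path) == read_entries(reference_path)
```

Calling `same_dictionary` with the path of the freshly generated dictionary and the path of slue_toolkit/label_map_files/ner.dict.ltr.txt would return True only when the two files agree entry by entry, ignoring blank lines and surrounding whitespace.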

sshon-asapp commented 2 years ago

Merged, and deleted the branch since it was a temporary cleanup branch.

pushkalkatara commented 2 years ago

Hi, I think this commit breaks the download_dataset.sh bash script, since it still relies on the pickle file.

I'm getting this error while using the script:

creating segmented audios in datasets/slue-voxceleb/dev: 100%|█| 954/954 [00:
creating segmented audios in datasets/slue-voxceleb/fine-tune: 100%|█| 5729/5
creating segmented audios in datasets/slue-voxceleb/test: 100%|█| 4052/4052 [
Traceback (most recent call last):
  File "slue_toolkit/prepare/prepare_voxpopuli.py", line 118, in <module>
    fire.Fire()
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "slue_toolkit/prepare/prepare_voxpopuli.py", line 93, in create_manifest
    wrd_str, ltr_str = data_utils.prep_e2e_ner_files(
  File "/root/pushkal/slue/slue-toolkit/slue_toolkit/prepare/data_utils.py", line 161, in prep_e2e_ner_files
    entity_to_spl_char = load_pkl(
  File "/root/pushkal/slue/slue-toolkit/slue_toolkit/prepare/data_utils.py", line 7, in load_pkl
    with open(fname, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'slue_toolkit/label_map_files/raw_entity_to_spl_char.pkl'
Traceback (most recent call last):
  File "slue_toolkit/prepare/create_dict.py", line 18, in <module>
    fire.Fire(create_dict)
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "slue_toolkit/prepare/create_dict.py", line 7, in create_dict
    with open(input) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'manifest/slue-voxpopuli/fine-tune.ltr'
Traceback (most recent call last):
  File "slue_toolkit/prepare/create_dict.py", line 18, in <module>
    fire.Fire(create_dict)
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/root/miniconda3/envs/slue/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "slue_toolkit/prepare/create_dict.py", line 7, in create_dict
    with open(input) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'manifest/slue-voxpopuli/fine-tune.wrd'
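
The three tracebacks above appear to be one failure cascading: the missing pickle file stops create_manifest, so the fine-tune.ltr and fine-tune.wrd manifests are presumably never written, and create_dict then fails on each of them. A small pre-flight check makes that easier to see. This is a minimal sketch, not part of the toolkit: the helper `missing_files` is hypothetical, and the paths are copied from the errors above.

```python
from pathlib import Path


def missing_files(paths):
    """Return the paths (as strings) that do not exist as regular files."""
    return [str(p) for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    # Paths taken from the FileNotFoundError messages in the traceback.
    required = [
        "slue_toolkit/label_map_files/raw_entity_to_spl_char.pkl",
        "manifest/slue-voxpopuli/fine-tune.ltr",
        "manifest/slue-voxpopuli/fine-tune.wrd",
    ]
    for path in missing_files(required):
        print(f"missing: {path}")
```

Run from the repository root, it lists whichever of the three files are absent, which separates the root cause (the pickle file) from the follow-on errors (the manifests).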

fwu-asapp commented 2 years ago

Hi @pushkalkatara, thanks for pointing it out. It should be fixed in PR #14.