facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
MIT License

'dict' object has no attribute 'task' while decoding the language model #4450

Open mukherjeesougata opened 2 years ago

mukherjeesougata commented 2 years ago

When I try to run inference with the fine-tuned model together with a transformer-based language model that I trained, I get this error:

INFO:__main__:| decoding with criterion ctc
INFO:__main__:| loading model(s) from /data/Sougata/vakyansh-wav2vec2-experimentation/checkpoints/finetuning/checkpoint_best.pt
INFO:fairseq.data.audio.raw_audio_dataset:loaded 3012, skipped 0 samples
INFO:__main__:| /data/Sougata/vakyansh-wav2vec2-experimentation/data/inference/IITDH_ENG test 3012 examples
Traceback (most recent call last):
  File "../../wav2vec/fairseq/examples/speech_recognition/infer.py", line 427, in <module>
    cli_main()
  File "../../wav2vec/fairseq/examples/speech_recognition/infer.py", line 423, in cli_main
    main(args)
  File "../../wav2vec/fairseq/examples/speech_recognition/infer.py", line 283, in main
    generator = build_generator(args)
  File "../../wav2vec/fairseq/examples/speech_recognition/infer.py", line 276, in build_generator
    return W2lFairseqLMDecoder(args, task.target_dictionary)
  File "/data/Sougata/odia_asr/vakyansh-wav2vec2-experimentation/wav2vec/fairseq/examples/speech_recognition/w2l_decoder.py", line 383, in __init__
    with open_dict(lm_args.task):
AttributeError: 'dict' object has no attribute 'task'

The command I am using is as follows:

python ../../wav2vec/fairseq/examples/speech_recognition/infer.py ${data_path} --task audio_pretraining --nbest 1 --path \
${parentdir}/checkpoints/finetuning/checkpoint_best.pt --gen-subset test --results-path ${result_path} --w2l-decoder fairseqlm \
--lm-model ${parentdir}/lm/${lm_name}/checkpoint_best.pt --lm-weight 2 --word-score -1 --sil-weight 0 --criterion ctc --labels ltr --max-tokens 4000000\
 --lexicon ${parentdir}/lm/${lm_name}/lexicon.lst --post-process letter --beam ${beam} \
--model-overrides "{'w2v_path':'${pretrained_model_path}'}"

I trained the language model by following this guide: https://towardsdatascience.com/implementing-transformer-for-language-modeling-ba5dd60389a2

I have already tried what is suggested in this issue: https://github.com/facebookresearch/fairseq/issues/3488

Environment:

SeunghyunSEO commented 2 years ago

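It looks like lm_args is a plain Python dict, not an OmegaConf object, so attribute access such as lm_args.task fails.

In my case, I just added the three lines below (the import and the type check) right after lm_args is assigned: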

        if "cfg" in checkpoint and checkpoint["cfg"] is not None:
            lm_args = checkpoint["cfg"]
        else:
            lm_args = convert_namespace_to_omegaconf(checkpoint["args"])

        from omegaconf import OmegaConf
        if type(lm_args) is dict:
            lm_args = OmegaConf.create(lm_args)

Then lm_args will be an OmegaConf object instead of a plain Python dict.
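To make the difference concrete, here is a minimal sketch of why the plain dict fails and what the conversion changes. It is not fairseq code: describe_failure and ensure_dictconfig are hypothetical helper names, and the lm_args contents are a made-up stand-in for checkpoint["cfg"]. It assumes omegaconf is installed; OmegaConf.create is the only omegaconf call it uses.

```python
def describe_failure(cfg):
    """Return the AttributeError message raised by cfg.task, or None if it works."""
    try:
        cfg.task
    except AttributeError as err:
        return str(err)
    return None


def ensure_dictconfig(cfg):
    """Wrap a plain dict in an OmegaConf config; return anything else unchanged.

    Hypothetical helper for illustration, not part of fairseq.
    """
    from omegaconf import OmegaConf  # imported lazily, as in the three lines above

    if type(cfg) is dict:
        cfg = OmegaConf.create(cfg)
    return cfg


# Made-up stand-in for checkpoint["cfg"]; a real checkpoint holds many more keys.
lm_args = {"task": {"data": "/path/to/lm/data"}}

# A plain dict only supports lm_args["task"], so attribute access fails
# with the same message as in the traceback.
print(describe_failure(lm_args))  # 'dict' object has no attribute 'task'
```

With omegaconf available, ensure_dictconfig(lm_args).task.data should return "/path/to/lm/data", and passing an object that is already an OmegaConf config returns it unchanged.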

By the way, I recommend using this new directory for decoding. It is almost the same as fairseq/examples/speech_recognition/infer.py, but it is the latest version of the decoder and it supports distributed inference (faster if you have enough GPUs). Try it!

mukherjeesougata commented 2 years ago

        from omegaconf import OmegaConf
        if type(lm_args) is dict:
            lm_args = OmegaConf.create(lm_args)

I have added the above 3 lines and checked. It gives the following error:

Traceback (most recent call last):
  File "examples/speech_recognition/infer.py", line 440, in <module>
    cli_main()
  File "examples/speech_recognition/infer.py", line 435, in cli_main
    main(args)
  File "examples/speech_recognition/infer.py", line 238, in main
    state=model_state,
  File "/data/Sougata/vakyansh-wav2vec2-experimentation/wav2vec/fairseq/fairseq/checkpoint_utils.py", line 370, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/data/Sougata/vakyansh-wav2vec2-experimentation/wav2vec/fairseq/fairseq/checkpoint_utils.py", line 293, in load_checkpoint_to_cpu
    old_primitive = _utils.is_primitive_type
AttributeError: module 'omegaconf._utils' has no attribute 'is_primitive_type'

I have also used this new directory for decoding. It shows the following error:

Traceback (most recent call last):
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/hydra/_internal/utils.py", line 198, in run_and_report
    return func()
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/hydra/_internal/utils.py", line 358, in <lambda>
    overrides=args.overrides,
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 136, in multirun
    return sweeper.sweep(arguments=task_overrides)
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/hydra/_internal/core_plugins/basic_sweeper.py", line 139, in sweep
    sweep_dir = Path(self.config.hydra.sweep.dir)
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 305, in __getattr__
    self._format_and_raise(key=key, value=None, cause=e)
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/base.py", line 101, in _format_and_raise
    type_override=type_override,
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/_utils.py", line 629, in format_and_raise
    _raise(ex, cause)
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex # set end OC_CAUSE=1 for full backtrace
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 303, in __getattr__
    return self._get_impl(key=key, default_value=DEFAULT_VALUE_MARKER)
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 368, in _get_impl
    key=key, value=node, default_value=default_value
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/basecontainer.py", line 65, in _resolve_with_default
    throw_on_resolution_failure=not has_default,
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/base.py", line 392, in _resolve_interpolation
    throw_on_resolution_failure=throw_on_resolution_failure,
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/base.py", line 347, in _resolve_simple_interpolation
    self._format_and_raise(key=inter_key, value=None, cause=e)
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/base.py", line 101, in _format_and_raise
    type_override=type_override,
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/_utils.py", line 694, in format_and_raise
    _raise(ex, cause)
  File "/data/anaconda3/envs/newenv/lib/python3.7/site-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.ValidationError: Environment variable 'PREFIX' not found
    full_key: hydra.sweep.PREFIX
    reference_type=SweepDir
    object_type=SweepDir