facebookresearch / fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

TypeError: object of type 'NoneType' has no len() #3480

Open roboticsai opened 3 years ago

roboticsai commented 3 years ago

I have trained a wav2vec2 model on a custom English-language dataset. I stopped the training partway through with Ctrl+C after one day. Training produced two checkpoints, checkpoint_best.pt and checkpoint_last.pt. I then fine-tuned checkpoint_best.pt on the same dataset using base_10m.yaml, and again stopped the fine-tuning manually after 10m. Now when I try to run inference with the fine-tuned checkpoint_best.pt, I get this error:

INFO:__main__:| decoding with criterion ctc
INFO:__main__:| loading model(s) from /root/dataset/custum_speech/models/checkpoint_best.pt
INFO:fairseq.data.audio.raw_audio_dataset:loaded 41, skipped 0 samples
INFO:__main__:| /root/dataset/custum_speech train 41 examples
Traceback (most recent call last):
  File "examples/speech_recognition/infer.py", line 427, in <module>
    cli_main()
  File "examples/speech_recognition/infer.py", line 423, in cli_main
    main(args)
  File "examples/speech_recognition/infer.py", line 283, in main
    generator = build_generator(args)
  File "examples/speech_recognition/infer.py", line 268, in build_generator
    return W2lViterbiDecoder(args, task.target_dictionary)
  File "/root/fairseq/examples/speech_recognition/w2l_decoder.py", line 115, in __init__
    super().__init__(args, tgt_dict)
  File "/root/fairseq/examples/speech_recognition/w2l_decoder.py", line 51, in __init__
    self.vocab_size = len(tgt_dict)
TypeError: object of type 'NoneType' has no len()

How do I solve this issue? Is there any way to modify the configuration files for training and fine-tuning so that I don't have to stop the process manually? Is it OK to force-stop the process manually?
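
On the second question: Hydra-based fairseq training stops on its own once optimization.max_update updates have run, so the run length can be capped in the config or overridden on the command line rather than interrupted with Ctrl+C. A minimal sketch, following the wav2vec 2.0 fine-tuning examples; the update count is illustrative and the paths are taken from the log below:

    # Cap fine-tuning at a fixed number of updates instead of stopping it by hand.
    fairseq-hydra-train \
        task.data=/root/dataset/custum_speech \
        model.w2v_path=/root/dataset/custum_speech/models/checkpoint_best.pt \
        optimization.max_update=20000 \
        --config-dir examples/wav2vec/config/finetuning \
        --config-name base_10m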

RuABraun commented 3 years ago

For anyone else who comes across this after updating fairseq: this can happen because, when using a fine-tuned model, one should set the task to audio_finetuning.
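
A minimal sketch of the corresponding inference command; the flags follow the wav2vec 2.0 README, and the paths and subset name are taken from the log above:

    python examples/speech_recognition/infer.py /root/dataset/custum_speech \
        --task audio_finetuning \
        --path /root/dataset/custum_speech/models/checkpoint_best.pt \
        --gen-subset train --w2l-decoder viterbi --criterion ctc \
        --labels ltr --post-process letter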

nguyenvulong commented 9 months ago

For those looking for code to change the task: in the snippet below I override the task from audio_pretraining to audio_finetuning.

    import os
    import fairseq.checkpoint_utils

    # Override the task recorded in the checkpoint: audio_pretraining -> audio_finetuning.
    model_override_rules = {'task': {'_name': 'audio_finetuning'}}
    cp_path = os.path.join(BASE_DIR, 'pretrained/w2v-large.pt')  # BASE_DIR: your model root
    model, cfg, task = fairseq.checkpoint_utils.load_model_ensemble_and_task(
        [cp_path], arg_overrides=model_override_rules)
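
As a quick sanity check (a sketch; it mirrors the len(tgt_dict) call in the traceback above), the task loaded with the override should now expose a target dictionary:

    # With the audio_finetuning override, target_dictionary is populated,
    # so the decoder's len(tgt_dict) call no longer fails.
    assert task.target_dictionary is not None
    vocab_size = len(task.target_dictionary)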