NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0

ValueError: use_language_model parameter has to be specified??? #498

Open MuruganR96 opened 5 years ago

MuruganR96 commented 5 years ago

I tried running speech-to-text inference with the w2l_plus_large pretrained model. Here is the command:

python3 run.py --mode=infer --config="/home/dell/Murugan_R/speech_recognition/OpenSeq2Seq/example_configs/speech2text/w2lplus_large_8gpus_mp.py" --logdir="/home/dell/Murugan_R/speech_recognition/model/w2l_plus_large/" --num_gpus=1 --use_horovod=False --decoder_params/infer_logits_to_pickle=True --infer_output_file=model_output.pickle --batch_size_per_gpu=1

I am getting the error below. How can I resolve this? ValueError: use_language_model parameter has to be specified

*** Restoring from the latest checkpoint
*** Loading model from /home/dell/Murugan_R/speech_recognition/model/w2l_plus_large/model.ckpt-109800
*** Inference config:
{'batch_size_per_gpu': 1,
 'data_layer': <class 'open_seq2seq.data.speech2text.speech2text.Speech2TextDataLayer'>,
 'data_layer_params': {'dataset_files': ['data/librispeech/librivox-test-clean.csv'],
                       'input_type': 'logfbank',
                       'num_audio_features': 64,
                       'shuffle': False,
                       'vocab_file': 'open_seq2seq/test_utils/toy_speech_data/vocab.txt'},
 'decoder': <class 'open_seq2seq.decoders.fc_decoders.FullyConnectedCTCDecoder'>,
 'decoder_params': {'infer_logits_to_pickle': True,
                    'initializer': <function xavier_initializer at 0x7f1aa152d620>},
 'dtype': 'mixed',
 'encoder': <class 'open_seq2seq.encoders.tdnn_encoder.TDNNEncoder'>,
 'encoder_params': {'activation_fn': <function <lambda> at 0x7f1abfa359d8>,
                    'convnet_layers': [{'dilation': [1],
                                        'dropout_keep_prob': 0.8,
                                        'kernel_size': [11],
                                        'num_channels': 256,
                                        'padding': 'SAME',
                                        'repeat': 1,
                                        'stride': [2],
                                        'type': 'conv1d'},
                                       {'dilation': [1],
                                        'dropout_keep_prob': 0.8,
                                        'kernel_size': [11],
                                        'num_channels': 256,
                                        'padding': 'SAME',
                                        'repeat': 3,
                                        'stride': [1],
                                        'type': 'conv1d'},
                                       {'dilation': [1],
                                        'dropout_keep_prob': 0.8,
                                        'kernel_size': [13],
                                        'num_channels': 384,
                                        'padding': 'SAME',
                                        'repeat': 3,
                                        'stride': [1],
                                        'type': 'conv1d'},
                                       {'dilation': [1],
                                        'dropout_keep_prob': 0.8,
                                        'kernel_size': [17],
                                        'num_channels': 512,
                                        'padding': 'SAME',
                                        'repeat': 3,
                                        'stride': [1],
                                        'type': 'conv1d'},
                                       {'dilation': [1],
                                        'dropout_keep_prob': 0.7,
                                        'kernel_size': [21],
                                        'num_channels': 640,
                                        'padding': 'SAME',
                                        'repeat': 3,
                                        'stride': [1],
                                        'type': 'conv1d'},
                                       {'dilation': [1],
                                        'dropout_keep_prob': 0.7,
                                        'kernel_size': [25],
                                        'num_channels': 768,
                                        'padding': 'SAME',
                                        'repeat': 3,
                                        'stride': [1],
                                        'type': 'conv1d'},
                                       {'dilation': [2],
                                        'dropout_keep_prob': 0.6,
                                        'kernel_size': [29],
                                        'num_channels': 896,
                                        'padding': 'SAME',
                                        'repeat': 1,
                                        'stride': [1],
                                        'type': 'conv1d'},
                                       {'dilation': [1],
                                        'dropout_keep_prob': 0.6,
                                        'kernel_size': [1],
                                        'num_channels': 1024,
                                        'padding': 'SAME',
                                        'repeat': 1,
                                        'stride': [1],
                                        'type': 'conv1d'}],
                    'data_format': 'channels_last',
                    'dropout_keep_prob': 0.7,
                    'initializer': <function xavier_initializer at 0x7f1aa152d620>,
                    'initializer_params': {'uniform': False},
                    'normalization': 'batch_norm'},
 'eval_steps': 2200,
 'iter_size': 1,
 'larc_params': {'larc_eta': 0.001},
 'load_model': '',
 'logdir': '/home/dell/Murugan_R/speech_recognition/model/w2l_plus_large/',
 'loss': <class 'open_seq2seq.losses.ctc_loss.CTCLoss'>,
 'loss_params': {},
 'loss_scaling': 'Backoff',
 'lr_policy': <function poly_decay at 0x7f1a9b9bbbf8>,
 'lr_policy_params': {'learning_rate': 0.05, 'power': 2.0},
 'num_checkpoints': 5,
 'num_epochs': 200,
 'num_gpus': 1,
 'optimizer': 'Momentum',
 'optimizer_params': {'momentum': 0.9},
 'print_loss_steps': 10,
 'print_samples_steps': 2200,
 'random_seed': 0,
 'regularizer': <function l2_regularizer at 0x7f1aa15bb268>,
 'regularizer_params': {'scale': 0.001},
 'save_checkpoint_steps': 1100,
 'save_summaries_steps': 100,
 'summaries': ['learning_rate',
               'variables',
               'gradients',
               'larc_summaries',
               'variable_norm',
               'gradient_norm',
               'global_gradient_norm'],
 'use_horovod': False,
 'use_xla_jit': False}
Traceback (most recent call last):
  File "run.py", line 104, in <module>
    main()
  File "run.py", line 79, in main
    args, base_config, config_module, base_model, hvd, checkpoint)
  File "/home/dell/Murugan_R/speech_recognition/OpenSeq2Seq/open_seq2seq/utils/utils.py", line 879, in create_model
    model = base_model(params=infer_config, mode=args.mode, hvd=hvd)
  File "/home/dell/Murugan_R/speech_recognition/OpenSeq2Seq/open_seq2seq/models/encoder_decoder.py", line 76, in __init__
    self._decoder = self._create_decoder()
  File "/home/dell/Murugan_R/speech_recognition/OpenSeq2Seq/open_seq2seq/models/speech2text.py", line 119, in _create_decoder
    return super(Speech2Text, self)._create_decoder()
  File "/home/dell/Murugan_R/speech_recognition/OpenSeq2Seq/open_seq2seq/models/encoder_decoder.py", line 102, in _create_decoder
    return self.params['decoder'](params=params, mode=self.mode, model=self)
  File "/home/dell/Murugan_R/speech_recognition/OpenSeq2Seq/open_seq2seq/decoders/fc_decoders.py", line 208, in __init__
    super(FullyConnectedCTCDecoder, self).__init__(params, model, name, mode)
  File "/home/dell/Murugan_R/speech_recognition/OpenSeq2Seq/open_seq2seq/decoders/fc_decoders.py", line 103, in __init__
    super(FullyConnectedTimeDecoder, self).__init__(params, model, name, mode)
  File "/home/dell/Murugan_R/speech_recognition/OpenSeq2Seq/open_seq2seq/decoders/decoder.py", line 81, in __init__
    check_params(params, self.get_required_params(), self.get_optional_params())
  File "/home/dell/Murugan_R/speech_recognition/OpenSeq2Seq/open_seq2seq/utils/utils.py", line 409, in check_params
    raise ValueError("{} parameter has to be specified".format(pm))
ValueError: use_language_model parameter has to be specified
vsl9 commented 5 years ago

Thank you for reporting the issue. Can you please try with the latest master?

ahmedalbahnasawy commented 4 years ago

The use_language_model parameter has to be specified in decoder_params. If it is True, also provide the path to your language model; if it is False, no path is needed.
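
To make that concrete, here is a minimal sketch of the decoder section of an OpenSeq2Seq speech2text config. Only use_language_model itself is confirmed by the error in this issue; the LM-related keys in the commented-out block follow the repo's example configs and may be named differently in other versions.

import tensorflow as tf
from open_seq2seq.decoders import FullyConnectedCTCDecoder

base_params = {
    # ... data_layer, encoder, loss, etc. stay as in the original config ...
    "decoder": FullyConnectedCTCDecoder,
    "decoder_params": {
        "initializer": tf.contrib.layers.xavier_initializer,
        # Required by the decoder's check_params(); False means plain greedy
        # CTC decoding with no external language model.
        "use_language_model": False,

        # To decode with a KenLM-style language model instead, flip the flag
        # and point the decoder at the LM files (key names taken from the
        # repo's example configs; they may differ between versions):
        # "use_language_model": True,
        # "decoder_library_path": "ctc_decoder_with_lm/libctc_decoder_with_kenlm.so",
        # "lm_path": "language_model/4-gram.binary",
        # "trie_path": "language_model/trie.binary",
        # "alphabet_config_path": "open_seq2seq/test_utils/toy_speech_data/vocab.txt",
        # "beam_width": 512,
        # "alpha": 2.0,
        # "beta": 1.5,
    },
}

With use_language_model set to either value, the check in open_seq2seq/utils/utils.py (check_params) should pass and inference can proceed.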