NVIDIA / OpenSeq2Seq

Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP
https://nvidia.github.io/OpenSeq2Seq
Apache License 2.0

Error when running Tacotron GST with interactive_inference #376

Closed oscarpang closed 5 years ago

oscarpang commented 5 years ago

It seems that Tacotron GST doesn't support interactive inference.

In model.py:395, when running in interactive mode, the code calls create_interactive_placeholders() instead of build_graph().

However, Tacotron GST requires "mel_spec" and "spec_length" in the input_tensors['source_tensors'] list, and those tensors are only built in data/text2speech.py:366, inside the build_graph() path.

So when running in interactive mode, the code raises an error at this line:

```
File "t2s_interactive_run.py", line 36, in <module>
    model_T2S, checkpoint_T2S = get_model(args_T2S, "T2S")
File "t2s_interactive_run.py", line 33, in get_model
    args, base_config, config_module, base_model, None)
File "/xxxx/OpenSeq2Seq/open_seq2seq/utils/utils.py", line 790, in create_model
    model.compile(checkpoint=checkpoint)
File "/xxxx/OpenSeq2Seq/open_seq2seq/models/model.py", line 411, in compile
    gpu_id=gpu_cnt
File "/xxxx/OpenSeq2Seq/open_seq2seq/models/encoder_decoder.py", line 157, in _build_forward_pass_graph
    encoder_output = self.encoder.encode(input_dict=encoder_input)
File "/xxxx/OpenSeq2Seq/open_seq2seq/encoders/encoder.py", line 138, in encode
    return self._encode(self._cast_types(input_dict))
File "/xxxx/OpenSeq2Seq/open_seq2seq/encoders/tacotron2_encoder.py", line 163, in _encode
    style_spec = input_dict['source_tensors'][2]
```
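To make the mismatch concrete, here is a plain-Python sketch (not code from the repo; the tensor ordering is inferred from the traceback, where the GST encoder reads `input_dict['source_tensors'][2]` as the style spectrogram):

```python
# Sketch of the mismatch between the two graph-building paths.
# Ordering is inferred from the traceback's `source_tensors[2]` access;
# these lists stand in for the actual TF tensors.

# build_graph() path: data/text2speech.py:366 adds the style inputs.
source_tensors_full = ["text", "text_length", "mel_spec", "spec_length"]

# create_interactive_placeholders() path: only the text inputs exist.
source_tensors_interactive = ["text", "text_length"]

# Works in the normal path:
style_spec = source_tensors_full[2]  # "mel_spec"

# In interactive mode the same index is out of range, so the encoder
# fails when it tries to fetch the style spectrogram:
# source_tensors_interactive[2]  # raises IndexError
```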

blisc commented 5 years ago

It does not support interactive inference; the Tacotron GST config does not define interactive_infer_params. That said, it should be fairly easy to add that functionality yourself if you want.
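As a hypothetical starting point (key names assumed by analogy with configs of models that do support interactive inference, not verified against the repo), adding the functionality would begin with an `interactive_infer_params` section in the Tacotron GST config:

```python
# Hypothetical sketch: an interactive_infer_params section for the
# Tacotron GST config. The keys mirror the infer_params pattern used
# elsewhere in OpenSeq2Seq configs; they are assumptions, not verified.
interactive_infer_params = {
    "data_layer_params": {
        # The GST encoder also expects a reference (style) spectrogram,
        # so the interactive data layer would additionally have to
        # create "mel_spec" and "spec_length" placeholders in
        # source_tensors, not just the text inputs.
        "dataset_files": [],
        "shuffle": False,
    },
}
```

The config section alone is not enough: create_interactive_placeholders() would also need to build the two extra placeholders so that the encoder's `source_tensors[2]` and `[3]` lookups succeed.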