as-ideas / TransformerTTS

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
https://as-ideas.github.io/TransformerTTS/

model cannot predict #80

Closed · sciai-ai closed this issue 3 years ago

sciai-ai commented 3 years ago

Hi, I trained a WaveRNN autoregressive model. Training seems fine and I can load the model at the latest checkpoint; however, upon running

```python
out = model.predict('Hello')
```

I get:

```
NotFoundError: No algorithm worked!
	 [[node Postnet/cnn_res_norm_7/conv1d_35/conv1d (defined at /home/jupyter/TransformerTTS/model/layers.py:36) ]] [Op:__inference__forward_decoder_66309]
```

The log also shows this retracing warning:

```
WARNING:tensorflow:8 out of the last 10 calls to <bound method AutoregressiveTransformer._forward_encoder of <model.models.AutoregressiveTransformer object at 0x7f01de78d250>> triggered tf.function retracing. Tracing is expensive and the excessive number of tracings could be due to (1) creating @tf.function repeatedly in a loop, (2) passing tensors with different shapes, (3) passing Python objects instead of tensors. For (1), please define your @tf.function outside of the loop. For (2), @tf.function has experimental_relax_shapes=True option that relaxes argument shapes that can avoid unnecessary retracing. For (3), please refer to https://www.tensorflow.org/guide/function#controlling_retracing and https://www.tensorflow.org/api_docs/python/tf/function for more details.
```

Full traceback:

```
NotFoundError                             Traceback (most recent call last)
<ipython-input> in <module>
----> 1 out = model.predict('Hello')
      2
      3 # Convert spectrogram to wav (with griffin lim)

~/TransformerTTS/model/models.py in predict(self, inp, max_length, encode, verbose)
    237         encoder_output, padding_mask, encoder_attention = self.forward_encoder(inp)
    238         for i in range(int(max_length // self.r) + 1):
--> 239             model_out = self.forward_decoder(encoder_output, output, padding_mask)
    240             output = tf.concat([output, model_out['final_output'][:1, -1:, :]], axis=-2)
    241             output_concat = tf.concat([tf.cast(output_concat, tf.float32), model_out['final_output'][:1, -self.r:, :]],

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
    826     tracing_count = self.experimental_get_tracing_count()
    827     with trace.Trace(self._name) as tm:
--> 828       result = self._call(*args, **kwds)
    829       compiler = "xla" if self._experimental_compile else "nonXla"
    830       new_tracing_count = self.experimental_get_tracing_count()

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
    886       # Lifting succeeded, so variables are initialized and we can run the
    887       # stateless function.
--> 888       return self._stateless_fn(*args, **kwds)
    889     else:
    890       _, _, _, filtered_flat_args = \

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in __call__(self, *args, **kwargs)
   2941          filtered_flat_args) = self._maybe_define_function(args, kwargs)
   2942     return graph_function._call_flat(
-> 2943         filtered_flat_args, captured_inputs=graph_function.captured_inputs)  # pylint: disable=protected-access
   2944
   2945   @property

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
   1917       # No tape is watching; skip to running the function.
   1918       return self._build_call_outputs(self._inference_function.call(
-> 1919           ctx, args, cancellation_manager=cancellation_manager))
   1920     forward_backward = self._select_forward_and_backward_functions(
   1921         args,

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/function.py in call(self, ctx, args, cancellation_manager)
    558               inputs=args,
    559               attrs=attrs,
--> 560               ctx=ctx)
    561         else:
    562           outputs = execute.execute_with_cancellation(

/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
     58     ctx.ensure_initialized()
     59     tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60                                         inputs, attrs, num_outputs)
     61   except core._NotOkStatusException as e:
     62     if name is not None:

NotFoundError: No algorithm worked!
	 [[node Postnet/cnn_res_norm_7/conv1d_35/conv1d (defined at /home/jupyter/TransformerTTS/model/layers.py:36) ]] [Op:__inference__forward_decoder_66309]

Function call stack:
_forward_decoder
```

The autoregressive config is:

```
# ARCHITECTURE
decoder_model_dimension: 256
encoder_model_dimension: 512
decoder_num_heads: [4, 4, 4, 4]  # the length of this defines the number of layers
encoder_num_heads: [4, 4, 4, 4]  # the length of this defines the number of layers
encoder_feed_forward_dimension: 1024
decoder_feed_forward_dimension: 1024
decoder_prenet_dimension: 256
encoder_prenet_dimension: 512
encoder_attention_conv_filters: 512
decoder_attention_conv_filters: 512
encoder_attention_conv_kernel: 3
decoder_attention_conv_kernel: 3
encoder_max_position_encoding: 1000
decoder_max_position_encoding: 10000
postnet_conv_filters: 256
postnet_conv_layers: 5
postnet_kernel_size: 5
encoder_dense_blocks: 4
decoder_dense_blocks: 4

# LOSSES
stop_loss_scaling: 8

# TRAINING
dropout_rate: 0.1
decoder_prenet_dropout_schedule:
  - [0, 0.]
  - [25_000, 0.]
  - [35_000, .5]
learning_rate_schedule:
  - [0, 1.0e-4]
head_drop_schedule:  # head-level dropout: how many heads to set to zero at training time
  - [0, 0]
  - [15_000, 1]
reduction_factor_schedule:
  - [0, 10]
  - [80_000, 5]
  - [150_000, 3]
  - [250_000, 1]
max_steps: 900_000
bucket_boundaries: [200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200]  # mel bucketing
bucket_batch_sizes: [64, 42, 32, 25, 21, 18, 16, 14, 12, 11, 1]
debug: false

# LOGGING
validation_frequency: 1_000
prediction_frequency: 10_000
weights_save_frequency: 10_000
train_images_plotting_frequency: 1_000
keep_n_weights: 2
keep_checkpoint_every_n_hours: 12
n_steps_avg_losses: [100, 500, 1_000, 5_000]  # command line display of average loss values for the last n steps
n_predictions: 2  # autoregressive predictions take time
prediction_start_step: 20_000
audio_start_step: 40_000
audio_prediction_frequency: 10_000  # converting to glim takes time
git_hash: e4ded5b
```