as-ideas / TransformerTTS

🤖💬 Transformer TTS: Implementation of a non-autoregressive Transformer based neural network for text to speech.
https://as-ideas.github.io/TransformerTTS/

inference error #114

Closed by sciai-ai 3 years ago

sciai-ai commented 3 years ago

Hi

I am getting this error when running inference with a model I trained:

    ValueError: Layer #1 (named "Encoder" in the current model) was found to correspond to layer Encoder in the save file. However the new layer Encoder expects 109 weights, but the saved weights have 99 elements.

Config:

    encoder_model_dimension: 384
    decoder_model_dimension: 384
    dropout_rate: 0.1
    decoder_num_heads: [2, 2, 2, 2, 2, 2] # the length of this defines the number of layers
    encoder_num_heads: [2, 2, 2, 2, 2, 2] # the length of this defines the number of layers
    encoder_max_position_encoding: 2000
    decoder_max_position_encoding: 10000
    encoder_dense_blocks: 0
    decoder_dense_blocks: 0
    duration_conv_filters: [256, 226]
    pitch_conv_filters: [256, 226]
    duration_kernel_size: 3
    pitch_kernel_size: 3
    predictors_dropout: 0.1
    mel_channels: 80
    phoneme_language: en-us
    with_stress: true
    model_breathing: true
    transposed_attn_convs: true
    encoder_attention_conv_filters: [1536, 384]
    decoder_attention_conv_filters: [1536, 384]
    encoder_attention_conv_kernel: 3
    decoder_attention_conv_kernel: 3
    encoder_feed_forward_dimension:
    decoder_feed_forward_dimension:
    debug: false
    wav_directory: /home/wavs
    metadata_path: /home/metadata.txt
    log_directory: /homelogs/
    train_data_directory: transformer_tts_data
    data_name: nsc
    audio_settings_name: MelGAN_default
    text_settings_name: Stress_Breathing
    aligner_settings_name: alinger_extralayer_layernorm
    tts_settings_name: tts_swap_conv_dims
    n_test: 100
    mel_start_value: .5
    mel_end_value: -.5
    max_mel_len: 1_200
    min_mel_len: 80
    bucket_boundaries: [200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200] # mel bucketing
    bucket_batch_sizes: [64, 42, 32, 25, 21, 18, 16, 14, 12, 6, 1]
    val_bucket_batch_size: [6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 1]
    sampling_rate: 22050
    n_fft: 1024
    hop_length: 256
    win_length: 1024
    f_min: 0
    f_max: 8000
    normalizer: MelGAN
    trim_silence_top_db: 60
    trim_silence: false
    trim_long_silences: true
    vad_window_length: 30
    vad_moving_average_width: 8
    vad_max_silence_length: 12
    vad_sample_rate: 16000
    norm_wav: true
    target_dBFS: -30
    int16_max: 32767
    learning_rate_schedule:
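
Keras raises this kind of error when the layer built at load time has a different number of weight tensors than the layer stored in the checkpoint. One possible cause, though not confirmed for this issue, is that the config used at inference differs from the one used for training. A minimal sketch of that check, assuming both configs have already been loaded into plain dicts: the helper `diff_configs` and the inference-side values are hypothetical and are not part of TransformerTTS.

```python
MISSING = "<missing>"  # placeholder for a key present in only one config


def diff_configs(train_cfg, infer_cfg):
    """Return {key: (train_value, infer_value)} for every key whose value differs."""
    diffs = {}
    for key in sorted(set(train_cfg) | set(infer_cfg)):
        train_value = train_cfg.get(key, MISSING)
        infer_value = infer_cfg.get(key, MISSING)
        if train_value != infer_value:
            diffs[key] = (train_value, infer_value)
    return diffs


# Values copied from the config posted in this issue.
train_cfg = {
    "encoder_model_dimension": 384,
    "encoder_num_heads": [2, 2, 2, 2, 2, 2],
    "encoder_dense_blocks": 0,
    "encoder_attention_conv_filters": [1536, 384],
}

# Hypothetical inference-side config, for illustration only:
# one architecture key was changed after training.
infer_cfg = dict(train_cfg, encoder_dense_blocks=1)

diffs = diff_configs(train_cfg, infer_cfg)
for key, (train_value, infer_value) in diffs.items():
    print(f"{key}: trained with {train_value!r}, loading with {infer_value!r}")
```

Any key reported here that affects the encoder architecture would be worth aligning with the training config before loading the weights again.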

sciai-ai commented 3 years ago

@cfrancesco your advice will be appreciated, thx!