I am getting this error while doing inference for a model I trained
ValueError: Layer #1 (named "Encoder" in the current model) was found to correspond to layer Encoder in the save file. However the new layer Encoder expects 109 weights, but the saved weights have 99 elements.
`
Config
[0, 1.0e-4]
max_steps: 100_000
validation_frequency: 5_000
prediction_frequency: 5_000
weights_save_frequency: 5_000
weights_save_starting_step: 5_000
train_images_plotting_frequency: 1_000
keep_n_weights: 5
keep_checkpoint_every_n_hours: 12
n_steps_avg_losses: [100, 500, 1_000, 5_000] # command line display of average loss values for the last n steps
prediction_start_step: 4_000
text_prediction:
Hi
I am getting this error while doing inference for a model I trained
ValueError: Layer #1 (named "Encoder" in the current model) was found to correspond to layer Encoder in the save file. However the new layer Encoder expects 109 weights, but the saved weights have 99 elements. ` Config
`encoder_model_dimension: 384 decoder_model_dimension: 384 dropout_rate: 0.1 decoder_num_heads: [2, 2, 2, 2, 2, 2] # the length of this defines the number of layers encoder_num_heads: [2, 2, 2, 2, 2, 2] # the length of this defines the number of layers encoder_max_position_encoding: 2000 decoder_max_position_encoding: 10000 encoder_dense_blocks: 0 decoder_dense_blocks: 0 duration_conv_filters: [256, 226] pitch_conv_filters: [256, 226] duration_kernel_size: 3 pitch_kernel_size: 3 predictors_dropout: 0.1 mel_channels: 80 phoneme_language: en-us with_stress: true model_breathing: true transposed_attn_convs: true encoder_attention_conv_filters: [1536, 384] decoder_attention_conv_filters: [1536, 384] encoder_attention_conv_kernel: 3 decoder_attention_conv_kernel: 3 encoder_feed_forward_dimension: decoder_feed_forward_dimension: debug: false wav_directory: /home/wavs metadata_path: /home/metadata.txt log_directory: /homelogs/ train_data_directory: transformer_tts_data data_name: nsc audio_settings_name: MelGAN_default text_settings_name: Stress_Breathing aligner_settings_name: alinger_extralayer_layernorm tts_settings_name: tts_swap_conv_dims n_test: 100 mel_start_value: .5 mel_end_value: -.5 max_mel_len: 1_200 min_mel_len: 80 bucket_boundaries: [200, 300, 400, 500, 600, 700, 800, 900, 1000, 1200] # mel bucketing bucket_batch_sizes: [64, 42, 32, 25, 21, 18, 16, 14, 12, 6, 1] val_bucket_batch_size: [6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 1] sampling_rate: 22050 n_fft: 1024 hop_length: 256 win_length: 1024 f_min: 0 f_max: 8000 normalizer: MelGAN trim_silence_top_db: 60 trim_silence: false trim_long_silences: true vad_window_length: 30 vad_moving_average_width: 8 vad_max_silence_length: 12 vad_sample_rate: 16000 norm_wav: true target_dBFS: -30 int16_max: 32767 learning_rate_schedule: