hxs7709 opened 6 years ago
This is a fix for this problem https://github.com/Rayhane-mamah/Tacotron-2/pull/237
Hello, have you solved the problem? @hxs7709
Hello, I met the same error. Can you tell me how you solved it? Thanks @gloriouskilka
Sorry, I haven't solved this issue.
No, @gloriouskilka's change didn't fix the problem; I just tried it. I think the error can be avoided by additionally commenting out the line below, but I'm not sure that is a good fix.
assert len(mels) == len(linears) == len(texts)
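For reference, instead of deleting the check entirely, one option might be to only compare the lists that were actually produced. This is just a minimal hypothetical sketch, assuming mels, linears and texts are the lists built right before the failing assert in tacotron/synthesizer.py; the helper name is made up and this is not the repo's code:

# Hypothetical sketch only -- not the actual repo code.
def check_synth_outputs(mels, texts, linears=None):
    # One mel spectrogram is always expected per input sentence.
    assert len(mels) == len(texts), (len(mels), len(texts))
    # Compare linear spectrograms only when some were actually returned,
    # instead of asserting on them unconditionally.
    if linears:
        assert len(linears) == len(mels), (len(linears), len(mels))

# Example: a run that returned mels but no linear spectrograms.
check_synth_outputs(mels=[[0.0]], texts=['hello world'], linears=None)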
The error log is listed below. Thank you.
[hxs@VM_0_7_centos Tacotron-2-Ray-newclips3-trainmodel-T2]$ python synthesize.py --model='Tacotron-2' --mode='live'
/home/hxs/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
synthesize.py:82: UserWarning: Requested a live evaluation with Tacotron-2, Wavenet will not be used!
  warn('Requested a live evaluation with Tacotron-2, Wavenet will not be used!')
Running End-to-End TTS Evaluation. Model: Tacotron-2
Synthesizing mel-spectrograms from text..
loaded model at logs-Tacotron-2/taco_pretrained/tacotron_model.ckpt-85000
Hyperparameters:
  allow_clipping_in_normalization: True
  attention_dim: 128
  attention_filters: 32
  attention_kernel: (31,)
  cbhg_conv_channels: 128
  cbhg_highway_units: 128
  cbhg_highwaynet_layers: 4
  cbhg_kernels: 8
  cbhg_pool_size: 2
  cbhg_projection: 256
  cbhg_projection_kernel_size: 3
  cbhg_rnn_units: 128
  cin_channels: 80
  cleaners: english_cleaners
  clip_for_wavenet: True
  clip_mels_length: True
  cross_entropy_pos_weight: 20
  cumulative_weights: True
  decoder_layers: 2
  decoder_lstm_units: 1024
  embedding_dim: 512
  enc_conv_channels: 512
  enc_conv_kernel_size: (5,)
  enc_conv_num_layers: 3
  encoder_lstm_units: 256
  fmax: 7600
  fmin: 55
  frame_shift_ms: None
  freq_axis_kernel_size: 3
  gate_channels: 256
  gin_channels: -1
  griffin_lim_iters: 60
  hop_size: 275
  input_type: raw
  kernel_size: 3
  layers: 20
  leaky_alpha: 0.4
  log_scale_min: -32.23619130191664
  log_scale_min_gauss: -16.11809565095832
  mask_decoder: False
  mask_encoder: True
  max_abs_value: 4.0
  max_iters: 2000
  max_mel_frames: 1000
  max_time_sec: None
  max_time_steps: 11000
  min_level_db: -100
  n_fft: 2048
  n_speakers: 5
  natural_eval: False
  normalize_for_wavenet: True
  num_freq: 1025
  num_mels: 80
  out_channels: 2
  outputs_per_step: 1
  postnet_channels: 512
  postnet_kernel_size: (5,)
  postnet_num_layers: 5
  power: 1.5
  predict_linear: True
  preemphasis: 0.97
  preemphasize: True
  prenet_layers: [256, 256]
  quantize_channels: 65536
  ref_level_db: 20
  rescale: True
  rescaling_max: 0.999
  residual_channels: 128
  sample_rate: 22050
  signal_normalization: True
  silence_threshold: 2
  skip_out_channels: 128
  smoothing: False
  split_on_cpu: True
  stacks: 2
  stop_at_any: True
  symmetric_mels: True
  tacotron_adam_beta1: 0.9
  tacotron_adam_beta2: 0.999
  tacotron_adam_epsilon: 1e-06
  tacotron_batch_size: 32
  tacotron_clip_gradients: True
  tacotron_data_random_state: 1234
  tacotron_decay_learning_rate: True
  tacotron_decay_rate: 0.5
  tacotron_decay_steps: 50000
  tacotron_dropout_rate: 0.5
  tacotron_final_learning_rate: 1e-05
  tacotron_gpu_start_idx: 0
  tacotron_initial_learning_rate: 0.001
  tacotron_num_gpus: 1
  tacotron_random_seed: 5339
  tacotron_reg_weight: 1e-07
  tacotron_scale_regularization: False
  tacotron_start_decay: 50000
  tacotron_swap_with_cpu: False
  tacotron_synthesis_batch_size: 1
  tacotron_teacher_forcing_decay_alpha: 0.0
  tacotron_teacher_forcing_decay_steps: 280000
  tacotron_teacher_forcing_final_ratio: 0.0
  tacotron_teacher_forcing_init_ratio: 1.0
  tacotron_teacher_forcing_mode: constant
  tacotron_teacher_forcing_ratio: 1.0
  tacotron_teacher_forcing_start_decay: 10000
  tacotron_test_batches: None
  tacotron_test_size: 0.05
  tacotron_zoneout_rate: 0.1
  train_with_GTA: False
  trim_fft_size: 512
  trim_hop_size: 128
  trim_silence: True
  trim_top_db: 23
  upsample_activation: LeakyRelu
  upsample_conditional_features: True
  upsample_scales: [5, 5, 11]
  upsample_type: 1D
  use_bias: True
  use_lws: False
  use_speaker_embedding: True
  wavenet_adam_beta1: 0.9
  wavenet_adam_beta2: 0.999
  wavenet_adam_epsilon: 1e-08
  wavenet_batch_size: 8
  wavenet_clip_gradients: False
  wavenet_data_random_state: 1234
  wavenet_decay_rate: 0.5
  wavenet_decay_steps: 300000
  wavenet_dropout: 0.05
  wavenet_ema_decay: 0.9999
  wavenet_gpu_start_idx: 0
  wavenet_init_scale: 1.0
  wavenet_learning_rate: 0.0001
  wavenet_lr_schedule: exponential
  wavenet_num_gpus: 1
  wavenet_random_seed: 5339
  wavenet_swap_with_cpu: False
  wavenet_synthesis_batch_size: 20
  wavenet_test_batches: None
  wavenet_test_size: 0.0441
  wavenet_warmup: 4000.0
  wavenet_weight_normalization: False
  win_size: 1100
Constructing model: Tacotron
WARNING:tensorflow:From /home/hxs/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/rnn.py:430: calling reverse_sequence (from tensorflow.python.ops.array_ops) with seq_dim is deprecated and will be removed in a future version.
Instructions for updating:
seq_dim is deprecated, use seq_axis instead
WARNING:tensorflow:From /home/hxs/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py:454: calling reverse_sequence (from tensorflow.python.ops.array_ops) with batch_dim is deprecated and will be removed in a future version.
Instructions for updating:
batch_dim is deprecated, use batch_axis instead
initialisation done /gpu:0
Initialized Tacotron model. Dimensions (? = dynamic shape):
  Train mode: False
  Eval mode: False
  GTA mode: False
  Synthesis mode: True
  Input: (?, ?)
  device: 0
  embedding: (?, ?, 512)
  enc conv out: (?, ?, 512)
  encoder out: (?, ?, 512)
  decoder out: (?, ?, 80)
  residual out: (?, ?, 512)
  projected residual out: (?, ?, 80)
  mel out: (?, ?, 80)
  linear out: (?, ?, 1025)