syang1993 / gst-tacotron

A tensorflow implementation of the "Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis"

Check failed: dnnReLUCreateBackward_F32 #40

Open miyoungvkim opened 4 years ago

miyoungvkim commented 4 years ago

Hello :D

I'm trying to use gst-tacotron with the Blizzard Challenge 2013 dataset.

When I try to train, I hit a "Check failed" error. (The same thing happens with the gst true option.)

So I'd like to ask about the following.

I only tried to run training, so I didn't change the base code. Could I get some ideas on how to solve this problem?

Here is my log


1. when I use gst_false option ..

gst-tacotron_gst_false# python train.py
/root/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:493: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/root/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:494: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/root/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:495: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/root/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:496: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/root/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:497: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/root/anaconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:502: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Checkpoint path: /data/workspace/blizzardchllenge/gst-tacotron_gst_false/logs-tacotron/model.ckpt
Loading training data from: /data/workspace/blizzardchllenge/gst-tacotron_gst_false/training/train.txt
Using model: tacotron
Hyperparameters:
  adam_beta1: 0.9
  adam_beta2: 0.999
  attention_depth: 256
  batch_size: 32
  cleaners: english_cleaners
  decay_learning_rate: True
  embed_depth: 256
  encoder_depth: 256
  frame_length_ms: 50
  frame_shift_ms: 12.5
  griffin_lim_iters: 60
  initial_learning_rate: 0.002
  max_iters: 1000
  min_level_db: -100
  num_freq: 1025
  num_gst: 10
  num_heads: 4
  num_mels: 80
  outputs_per_step: 2
  power: 1.5
  preemphasis: 0.97
  prenet_depths: [256, 128]
  ref_level_db: 20
  reference_depth: 128
  reference_filters: [32, 32, 64, 64, 128, 128]
  rnn_depth: 256
  sample_rate: 16000
  style_att_dim: 128
  style_att_type: mlp_attention
  style_embed_depth: 256
  use_cmudict: False
  use_gst: False
Loaded metadata for 9725 examples (20.13 hours)
Initialized Tacotron model. Dimensions:
  text embedding: 256
  style embedding: 128
  prenet out: 128
  encoder out: 384
  attention out: 256
  concat attn & out: 640
  decoder cell out: 256
  decoder out (2 frames): 160
  decoder out (1 frame): 80
  postnet out: 256
  linear out: 1025
Starting new training run at commit: None
Generated 32 batches of size 32 in 90.126 sec
Step 1 [139.284 sec/step, loss=0.87672, avg_loss=0.87672]
Step 2 [130.008 sec/step, loss=0.97632, avg_loss=0.92652]
Step 3 [141.618 sec/step, loss=0.98165, avg_loss=0.94490]
Step 4 [194.484 sec/step, loss=0.99856, avg_loss=0.95831]
Step 5 [177.694 sec/step, loss=0.95613, avg_loss=0.95788]
2019-12-03 09:52:09.674825: F tensorflow/core/kernels/mkl_relu_op.cc:328] Check failed: dnnReLUCreateBackward_F32(&mkl_context.prim_relu_bwd, __null, mkl_context.lt_grad, mkl_context.lt_grad, negative_slope) == E_SUCCESS (-1 vs. 0)
Aborted (core dumped)


2. when I use gst_true option ..

gst-tacotron_gst_true/training/train.txt
Using model: tacotron
Hyperparameters:
  adam_beta1: 0.9
  adam_beta2: 0.999
  attention_depth: 256
  batch_size: 32
  cleaners: english_cleaners
  decay_learning_rate: True
  embed_depth: 256
  encoder_depth: 256
  frame_length_ms: 50
  frame_shift_ms: 12.5
  griffin_lim_iters: 60
  initial_learning_rate: 0.002
  max_iters: 1000
  min_level_db: -100
  num_freq: 1025
  num_gst: 10
  num_heads: 4
  num_mels: 80
  outputs_per_step: 2
  power: 1.5
  preemphasis: 0.97
  prenet_depths: [256, 128]
  ref_level_db: 20
  reference_depth: 128
  reference_filters: [32, 32, 64, 64, 128, 128]
  rnn_depth: 256
  sample_rate: 16000
  style_att_dim: 128
  style_att_type: mlp_attention
  style_embed_depth: 256
  use_cmudict: False
  use_gst: True
Loaded metadata for 9725 examples (20.13 hours)
WARNING:tensorflow:From /data/workspace/blizzardchllenge/gst-tacotron_gst_true/models/multihead_attention.py:114: calling reduce_sum (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
Initialized Tacotron model. Dimensions:
  text embedding: 256
  style embedding: 256
  prenet out: 128
  encoder out: 512
  attention out: 256
  concat attn & out: 768
  decoder cell out: 256
  decoder out (2 frames): 160
  decoder out (1 frame): 80
  postnet out: 256
  linear out: 1025
Starting new training run at commit: None
Generated 32 batches of size 32 in 2.063 sec
Step 1 [190.721 sec/step, loss=0.87613, avg_loss=0.87613]
Step 2 [105.134 sec/step, loss=0.78472, avg_loss=0.83042]
Step 3 [87.687 sec/step, loss=0.86729, avg_loss=0.84271]
Step 4 [81.866 sec/step, loss=0.88327, avg_loss=0.85285]
Step 5 [73.656 sec/step, loss=0.85281, avg_loss=0.85284]
Step 6 [76.789 sec/step, loss=0.87447, avg_loss=0.85645]
2019-12-03 10:45:01.889230: F tensorflow/core/kernels/mkl_relu_op.cc:328] Check failed: dnnReLUCreateBackward_F32(&mkl_context.prim_relu_bwd, __null, mkl_context.lt_grad, mkl_context.lt_grad, negative_slope) == E_SUCCESS (-1 vs. 0)
Aborted (core dumped)


thank you :D

syang1993 commented 4 years ago

Hi, I guess this is caused by your environment: the failing check comes from MKL. This code was written for TensorFlow 1.6, which is a little out of date and very different from the current version of TensorFlow.
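
For anyone hitting the same abort, the MKL origin can be read off the fatal log line itself: the source file it names is mkl_relu_op.cc. The snippet below is only a rough sketch of that reading, not part of gst-tacotron or TensorFlow. The helper name parse_tf_log_line and the "file name starts with mkl_" heuristic are assumptions made for illustration, and the regular expression only covers the "date time: LEVEL file:line] message" shape seen in the logs above.

```python
import os
import re

# Matches lines shaped like the fatal message in the logs above:
#   2019-12-03 09:52:09.674825: F tensorflow/core/kernels/mkl_relu_op.cc:328] Check failed: ...
# Assumption: other TensorFlow builds may format their log lines differently.
LOG_LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>[\d:.]+): (?P<level>[IWEF]) "
    r"(?P<source>[^:\]]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def parse_tf_log_line(line):
    """Hypothetical helper: split one log line into its fields, or return None."""
    match = LOG_LINE_RE.match(line.strip())
    if match is None:
        return None
    info = match.groupdict()
    info["line"] = int(info["line"])
    # Heuristic (an assumption): kernel files whose name starts with "mkl_"
    # belong to the MKL code path.
    info["from_mkl_kernel"] = os.path.basename(info["source"]).startswith("mkl_")
    return info


if __name__ == "__main__":
    fatal = (
        "2019-12-03 09:52:09.674825: F tensorflow/core/kernels/mkl_relu_op.cc:328] "
        "Check failed: dnnReLUCreateBackward_F32(&mkl_context.prim_relu_bwd, __null, "
        "mkl_context.lt_grad, mkl_context.lt_grad, negative_slope) == E_SUCCESS (-1 vs. 0)"
    )
    parsed = parse_tf_log_line(fatal)
    print(parsed["level"], parsed["source"], parsed["from_mkl_kernel"])
```

If the flag comes back true, the crash is in the MKL ReLU kernel rather than in the model code, which is consistent with the failure appearing in both the gst_false and gst_true runs.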