Rayhane-mamah / Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation
MIT License

Fast evaluation question with no GPU currently on hand #11

Closed butterl closed 6 years ago

butterl commented 6 years ago

I'm using the default training dataset (LJSpeech-1.1) and have no GPU on my machine. The CPU training speed is about 7.182 sec/step. For a quick evaluation, how many steps do you think are needed to see a difference (a big enough performance gap) compared with other TTS systems?

And since the training is running, is it OK to pause (Ctrl+C) whenever I want to stop and evaluate?

Rayhane-mamah commented 6 years ago

Hello again @butterl, thanks for your interest in this project.

I am in the same spot as you for now, and being the impatient person I am, I didn't settle for a convergence speed slower than the current one. According to this comment, output should become audible in about 5k steps (natural synthesis with no teacher forcing!). Actually, as soon as the alignments become right (around 3k steps), the audio becomes audible. All steps after that correct pronunciation details (e.g. "I" and "Y" in the previously linked comment).

Since I don't have any GPU available at the moment, I didn't push the training further than 8k steps. Audio quality is slightly better than the early samples, but there are still some pronunciation issues.

So I can't really say for sure how many steps are needed, since at this point the model enters the "slow learning" phase. My Titan Xp should however be here by the end of the week; I'll tell you how it goes (and probably publish some pretrained LJSpeech and M-AILABS en_US models).

You can stop the training whenever you like (Ctrl+C) and come back to it later. For evaluation, the logs-Tacotron folder contains the training logs if you are concerned about audio quality, as well as the mel and alignment plots, which can help you judge the state of the model. If you want to do some synthesis, possibly on sentences totally different from those in the training data, you can run synthesize.py with --mode='eval' and it will evaluate on the test sentences defined in hparams.py.

If there is anything else I can assist you with, please let me know!

butterl commented 6 years ago

Thanks Rayhane for your reply! I will try it and dig more into it. I'm also interested in whether I could use a Chinese database to train the model (if it has the same format as LJ Speech) and test it on a phone (the stock TTS and TalkBack have a bad user experience).

I've applied for a free trial of Google Cloud (1 year & $300 free) and wonder if this project could be trained on a TPU, with some patches, for faster evaluation.

Cool work! :+1:

imdatceleste commented 6 years ago

@butterl: from what I could gather, you can use Chinese if you have the data. Make sure that you adapt your "symbols.py" in tacotron/utilities accordingly. Also, theoretically, this should work on Google TPUs; you need the right TensorFlow 1.6 on Google's TPU Cloud. From what I've seen in the code, there is no reason why it should not work. But make sure that your TPUs support the right data types.
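
For illustration only, here is a minimal sketch of what a Chinese-oriented symbols.py might contain, assuming a pinyin-with-tone-numbers front end (hypothetical token names; the real inventory depends entirely on how your corpus is transcribed):

# Hypothetical symbols.py sketch for a pinyin front end; not the repo's actual file.
_pad = '_'
_eos = '~'
_punctuation = list('!,.?; ')

# Tiny sample inventory; a real front end would list every syllable/tone it emits.
_pinyin = ['ni3', 'hao3', 'shi4', 'jie4', 'ma5']

symbols = [_pad, _eos] + _punctuation + _pinyin

# Lookup tables used by the text-to-sequence step.
symbol_to_id = {s: i for i, s in enumerate(symbols)}
id_to_symbol = {i: s for i, s in enumerate(symbols)}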

Rayhane-mamah commented 6 years ago

Hello again,

Like @imdatsolak said, you need to update your symbols.py to contain the correct tokens.

As for the Google TPU, I've never used it directly, so I don't really want to mislead you. Keep in mind that you need TensorFlow v1.6 for the implementation to work.

For the "test on phone" part, I suggest you mean that you want to use the final trained model in an app phone? Tensorflow has the mobile porting option indeed, but if you're planning to use the entire model (+wavenet) it might be not possible due to the computation amount and most importantly the generation speed (Wavenet is reported to generate 1sec of audio in 20 mins if I remember right?). So for a real time synthesis, you will have to pass on the Wavenet, maybe add the post processing network to predict linear frames directly and use that for your mobile app. I'll try adding this part as soon as possible.

butterl commented 6 years ago

@imdatsolak thanks for your instructions. I checked the code and found that @begeekmyfriend already has an implementation with Chinese; I will check his work and do some patching.

@Rayhane-mamah thanks again for your help. For the phone test, I want to run the trained model with TensorFlow Lite to improve TTS-like reading (e.g. TalkBack in Android is not so friendly for accessibility). About the generation speed you mention: does generating with WaveNet cost that much even with a GPU? I heard that some SoCs will open their NPU APIs, which I think would speed it up. I also appreciate your plan to build the post-processing network; Google's example seems to have done that already.

For TPU usage, I will try it in my spare time, but I'm not sure when I'll finish it; I will post here as soon as I have it booted up. :)

butterl commented 6 years ago

From my test (with the default LJSpeech-1.1) the loss value fluctuates a lot within a few steps (from 0.7x to 1.3x). Although the loss still converges slowly, this may affect the training speed a lot. (@Rayhane-mamah sorry, I'm not able to upload files from the office and the log is too big to attach as txt; I cut out one of the abnormal data sequences and the plot script to show the change.)

Store the data below as data.csv:

12250,0.74109,0.73348
12251,0.73853,0.73361
12252,0.72595,0.73335
12253,0.72529,0.73331
12254,0.74426,0.73333
12255,0.75689,0.73340
12256,0.75478,0.73353
12257,0.74107,0.73330
12258,0.75057,0.73328
12259,0.74394,0.73322
12260,0.73545,0.73298
12261,0.72712,0.73319
12262,0.72780,0.73331
12263,0.74001,0.73332
12264,0.76320,0.73352
12265,0.73523,0.73355
12266,0.74463,0.73351
12267,0.67656,0.73292
12268,0.74256,0.73289
12269,0.73237,0.73270
12270,0.72164,0.73252
12271,0.70983,0.73223
12272,0.74273,0.73225
12273,0.74767,0.73242
12274,0.74155,0.73220
12275,0.74957,0.73225
12276,0.66277,0.73150
12277,0.75349,0.73164
12278,0.74867,0.73194
12279,0.72937,0.73185
12280,0.76297,0.73196
12281,0.75345,0.73228
12282,0.73984,0.73235
12283,0.62488,0.73111
12284,0.75658,0.73122
12285,0.73920,0.73126
12286,0.75598,0.73152
12287,0.73710,0.73141
12288,0.74675,0.73286
12289,0.74327,0.73291
12290,0.74397,0.73289
12291,0.69565,0.73233
12292,0.77596,0.73253
12293,0.74384,0.73260
12294,0.73542,0.73281
12295,0.75320,0.73321
12296,0.73412,0.73310
12297,0.74946,0.73319
12298,0.74982,0.73442
12299,0.73836,0.73422
12300,0.71924,0.73403
12301,0.70941,0.73364
12302,0.73989,0.73366
12303,0.72781,0.73343
12304,0.75809,0.73357
12305,0.78042,0.73428
12306,0.91850,0.73621
12307,1.03897,0.73915
12308,1.36345,0.74548
12309,1.67591,0.75500
12310,1.93052,0.76694
12311,1.77411,0.77720
12312,1.84395,0.78816
12313,2.00760,0.80079
12314,1.79825,0.81129
12315,1.72120,0.82113
12316,1.73871,0.83121
12317,1.68478,0.84106
12318,1.61951,0.84997
12319,1.61236,0.85873
12320,1.57311,0.86704
12321,1.57022,0.87544
12322,1.52013,0.88327
12323,1.37010,0.89001
12324,1.94269,0.90202
12325,1.54544,0.91006
12326,1.68023,0.91934
12327,1.71184,0.92947
12328,1.59535,0.93803
12329,1.57655,0.94633
12330,1.51806,0.95427
12331,1.52102,0.96300
12332,1.48102,0.97054
12333,1.49190,0.97815
12334,1.47639,0.98558
12335,1.43791,0.99270
12336,1.45838,1.00003
12337,1.45684,1.00728
12338,1.43520,1.01424
12339,1.40723,1.02097
12340,1.40795,1.02826
12341,1.38779,1.03475
12342,1.40307,1.04138
12343,1.37772,1.04765
12344,1.35219,1.05368
12345,1.36081,1.06006
12346,1.35203,1.06618
12347,1.32067,1.07208
12348,1.34777,1.07814
12349,1.31385,1.08391
12350,1.31930,1.08969
12351,1.32886,1.09559
12352,1.32329,1.10156
12353,1.31333,1.10745
12354,1.32246,1.11323
12355,1.27946,1.11845
12356,1.27480,1.12365
12357,1.29324,1.12917
12358,1.25136,1.13418
12359,1.31275,1.13987
12360,1.26552,1.14517
12361,1.13800,1.14928
12362,1.30461,1.15505
12363,1.29316,1.16058
12364,1.27282,1.16568
12365,1.26426,1.17097
12366,1.30422,1.17656
12367,1.25095,1.18231
12368,1.24353,1.18732
12369,1.26752,1.19267
12370,1.28041,1.19826
12371,1.28820,1.20404
12372,1.25535,1.20917
12373,1.26001,1.21429
12374,1.25796,1.21945
12375,1.09220,1.22288
12376,1.27313,1.22898
12377,1.24978,1.23395
12378,1.24255,1.23888
12379,1.24906,1.24408
12380,1.22833,1.24873
12381,1.23014,1.25350
12382,1.22875,1.25839
12383,1.22675,1.26441
12384,1.22715,1.26911
12385,1.21221,1.27385
12386,1.23979,1.27868
12387,1.21472,1.28346
12388,1.18623,1.28785
12389,1.22113,1.29263
12390,1.21397,1.29733
12391,1.20445,1.30242
12392,1.20962,1.30676
12393,1.20389,1.31136
12394,1.21285,1.31613
12395,1.21185,1.32072
12396,1.20404,1.32542
12397,1.20917,1.33002
12398,1.18783,1.33440
12399,1.18989,1.33891
12400,1.18953,1.34361
12401,1.17633,1.34828
12402,1.21905,1.35307
12403,1.18392,1.35764
12404,1.18545,1.36191
12405,1.18680,1.36597
12406,1.16777,1.36847
12407,0.99445,1.36802
12408,1.17104,1.36610
12409,1.20440,1.36138
12410,1.19345,1.35401
12411,1.18887,1.34816
12412,1.19370,1.34166
12413,1.18620,1.33344
12414,1.20226,1.32748
12415,1.09833,1.32125
12416,1.18701,1.31574
12417,1.16918,1.31058
12418,1.15732,1.30596
12419,1.17181,1.30155
12420,1.16267,1.29745
12421,1.18618,1.29361
12422,1.14834,1.28989
12423,1.17757,1.28796
12424,1.15672,1.28010
12425,1.18112,1.27646
12426,1.14057,1.27106
12427,1.18143,1.26576
12428,1.12721,1.26108
12429,1.16463,1.25696
12430,1.19363,1.25372
12431,1.15771,1.25008
12432,1.15825,1.24685
12433,1.15697,1.24351
12434,1.15160,1.24026
12435,1.17041,1.23758
12436,1.15369,1.23454
12437,1.15004,1.23147
12438,1.13178,1.22843
12439,1.16746,1.22604
12440,1.14602,1.22342
12441,0.97995,1.21934
12442,1.16239,1.21693
12443,1.19131,1.21507
12444,1.15911,1.21314
12445,1.14445,1.21097
12446,1.15769,1.20903
12447,1.15379,1.20736
12448,1.16271,1.20551
12449,1.13413,1.20371
12450,1.15752,1.20209
12451,1.13410,1.20015
12452,1.14226,1.19834
12453,1.12871,1.19649
12454,1.13970,1.19466
12455,1.12228,1.19309
12456,1.12767,1.19162
12457,1.11473,1.18983
12458,1.12068,1.18853
12459,0.99048,1.18531
12460,1.09522,1.18360
12461,1.15067,1.18373
12462,1.15494,1.18223
12463,1.14321,1.18073
12464,1.14347,1.17944
12465,1.13313,1.17813
12466,1.11531,1.17624
12467,1.12661,1.17500
12468,1.14382,1.17400
12469,1.14388,1.17276
12470,1.13798,1.17134
12471,1.12359,1.16969
12472,1.10307,1.16817
12473,1.12793,1.16685
12474,1.11965,1.16547
12475,1.10161,1.16556
12476,1.14084,1.16424
12477,1.09388,1.16268
12478,1.13356,1.16159
12479,1.10994,1.16020
12480,1.12920,1.15920
12481,1.13988,1.15830
12482,1.12438,1.15726
12483,1.11985,1.15619
12484,1.11096,1.15503
12485,1.10061,1.15391
12486,1.09546,1.15247
12487,1.08794,1.15120
12488,1.11201,1.15046
12489,1.14520,1.14970
12490,1.01070,1.14767
12491,1.09514,1.14657
12492,1.10143,1.14549
12493,1.13188,1.14477
12494,1.12600,1.14390
12495,1.12776,1.14306
12496,1.10251,1.14205
12497,1.08993,1.14085
12498,1.10871,1.14006
12499,1.10097,1.13917
12500,1.11323,1.13841
12501,1.10705,1.13772
12502,1.08650,1.13639
12503,1.09355,1.13549
12504,1.11097,1.13474
12505,1.11256,1.13400
12506,1.09651,1.13329
12507,1.10049,1.13435
12508,1.09876,1.13363
12509,1.09803,1.13256
12510,1.07465,1.13137
12511,1.11090,1.13059
12512,1.12443,1.12990
12513,1.11813,1.12922
12514,0.94589,1.12666
12515,1.10457,1.12672
12516,1.06351,1.12548
12517,1.10221,1.12482
12518,1.07833,1.12403
12519,1.10617,1.12337
12520,1.04814,1.12222
12521,1.10726,1.12143
12522,1.08117,1.12076
12523,1.09914,1.11998
12524,1.09727,1.11938
12525,1.10100,1.11858
12526,1.08595,1.11804
12527,1.09544,1.11718
12528,1.08513,1.11676
12529,1.08223,1.11593
12530,1.08669,1.11486
12531,1.09504,1.11424
12532,1.09557,1.11361
12533,1.08303,1.11287
12534,1.07584,1.11211
12535,1.06939,1.11110
12536,1.09462,1.11051
12537,1.08427,1.10985
12538,1.09657,1.10950
12539,1.09960,1.10882
12540,1.07074,1.10807
12541,1.06049,1.10888
12542,1.10061,1.10826
12543,1.07886,1.10713
12544,1.07196,1.10626
12545,1.07438,1.10556
12546,1.07889,1.10477
12547,1.03520,1.10359
12548,1.09984,1.10296
12549,1.09872,1.10260
12550,1.08874,1.10192

Store the script below as plot.sh:

#!/bin/bash
# Plot loss (column 2) and average loss (column 3) against step (column 1).
TMP_FILE="data.csv"
gnuplot <<EOF
    set terminal png truecolor size 1280,720
    set output "output.png"
    set autoscale
    set title "Tacotron Loss vs Step"
    set xlabel 'Step number'
    set ylabel 'Loss value'
    set datafile separator ','
    plot '$TMP_FILE' u 1:2 title "loss" with point pointtype 1, '' u 1:3 title "avg loss" with line linetype 3 linewidth 2
EOF
display output.png

Run ./plot.sh to get the loss-vs-step picture.

I'm not sure whether my result is the so-called "gradient explosion" or whether this is normal behaviour. Do I need to retrain, or just wait for it to go down again?

Rayhane-mamah commented 6 years ago

@butterl, yes, the WaveNet generation speed is too slow even with a GPU because each audio step depends on all previous steps, so generation is done one step at a time, which can be extremely slow (way too far from real-time synthesis).
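
To make that sequential bottleneck concrete, here is a schematic sketch (plain Python with a dummy function, not the actual WaveNet code): every output sample needs one network call conditioned on the samples already generated, so one second of 22.05 kHz audio means 22,050 sequential forward passes.

import numpy as np

def fake_wavenet_step(history):
    # Stand-in for a full forward pass over the receptive field;
    # the real model is a deep dilated-convolution network.
    return 0.0 if len(history) == 0 else 0.99 * history[-1]

samples = []
for _ in range(22050):                            # one second at 22.05 kHz
    samples.append(fake_wavenet_step(samples))    # one network call per sample
audio = np.array(samples, dtype=np.float32)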

As for the loss reported, just by looking at the numbers I can say it's ugly... To be able to solve this I will need some logs.

If you used my repo for the training, you should have some stats saved for TensorBoard, among which is a gradient-values curve. Could you run tensorboard --logdir='logs-Tacotron' and screenshot that gradients curve?

It would also help to see some alignment plots and some mels. You can find them under logs-Tacotron too.

imdatceleste commented 6 years ago

Actually, the fluctuation at the beginning (12k steps) seems normal to me; look at the chart below: [screenshot: loss curve, 2018-04-10]

It depends a lot on the data you are using. How easy/difficult is it? The only weird thing is that it dropped quickly to 0.7 and then started climbing; that is normally a sign of overfitting. @butterl: how big is your data?

Rayhane-mamah commented 6 years ago

@imdatsolak, thanks for chiming in; your overall curve seems okay, especially if the reported curve is with 0 smoothing.

But the 0.7 to 1.1 jump looks like an explosion to me. It's interesting to think of it as overfitting. I'm also interested in information about your data, @butterl.

imdatceleste commented 6 years ago

@Rayhane-mamah, it is indeed with 0 smoothing :-)

butterl commented 6 years ago

@Rayhane-mamah @imdatsolak Thanks again for all your attention! The test data I use is LJSpeech-1.1 with no modifications; the code I am using is Rayhane's repo with no changes, and the arguments passed just follow the README.

The data attached is just a small piece of the whole run (one of the big jumps). The loss goes down even faster than imdatsolak's chart from step 0: around 7k it drops to 0.7x and hangs around 0.7-0.8, sometimes dipping to 0.6x.

The big jumps happened 4 times (0.7x to >1.4x), around steps 5000, 9500, 12000 and 16000. The small jumps happened 5 times (0.x to >1.0x), roughly one between each pair of big jumps.

Every jump goes up quickly and then slowly comes back down to 0.7-0.8 before the next jump happens. Some alignment plots (at steps before a jump) look good, with continuous and bright lines; some even look diagonal with thin lines, while some lines are wide but bright (I don't know which is better). The line even disappears after the jumps.

For the mel plots I don't know how to read them (when the alignment line is thin, the prediction looks closer to the real one; is there any metric to show the difference?).

Because I run the tests on an office server, the IT guys block the network, so I'm not able to upload files, and taking screenshots with a camera is also forbidden in the office :( Only raw data as text can be uploaded (at 1000 lines, GitHub will get stuck :( ).

Is there any raw text data that could help show what happened in the log dir? I can cut that out, and if I know how to compute it, I could write a script to plot it.

butterl commented 6 years ago

@Rayhane-mamah is the gradient values curve model/stats/max_gradient_norm? If so, the plot is really ugly: it goes up and down fast and the lines can hardly be seen in the small plot. Do I need to stop and train again from one of the good steps (where the alignment line is thin and bright) with train.py --step xxx? Or could I evaluate at a given step to compare the results?

imdatceleste commented 6 years ago

@butterl, can you post the model/loss chart?

Rayhane-mamah commented 6 years ago

@butterl, yes, from your description you're experiencing gradient explosions. You're using a batch size of 32, right? If you made any changes, it would be awesome to share the hparams.

I think I'm going to bring back gradient clipping then. Since the explosions didn't happen to me, I will also try to find the best initial states to reproduce from, but you will most likely have to run the training again.
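
For reference, global-norm gradient clipping in TensorFlow 1.x looks roughly like the sketch below (toy loss and variables so it runs on its own; this shows the general technique being discussed, not the exact code that later landed in the repo):

import tensorflow as tf

# Toy variable and loss so the sketch is self-contained.
w = tf.Variable([2.0, -3.0])
loss = tf.reduce_sum(tf.square(w))
global_step = tf.Variable(0, trainable=False)

optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)
gradients, variables = zip(*optimizer.compute_gradients(loss))
# Rescale the whole gradient vector when its global norm exceeds clip_norm,
# which caps the size of any single update and tames explosions.
clipped_gradients, _ = tf.clip_by_global_norm(gradients, clip_norm=1.0)
train_op = optimizer.apply_gradients(zip(clipped_gradients, variables),
                                     global_step=global_step)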

It would still really help to see what your max gradient norms look like; could you read off the initial value and the explosion values for us?

If you are not really in a rush, give me a few days to figure out the best solution for gradient explosions and I'll update the repo.

Thank you so much for reporting this! It helps me a lot!

butterl commented 6 years ago

@imdatsolak Sorry, I talked to the IT guys, but I'm only able to upload text, not files; IT blocks the file-upload path :( @Rayhane-mamah I'm using your repo without code changes; I'm not sure where to check the batch size (is it tacotron_batch_size: 32 in the log?).

butter@ubuntu16.04:~/code/ML/Tacotron-2$ python3 train.py --model='Tacotron'
Checkpoint path: logs-Tacotron/pretrained/model.ckpt
Loading training data from: training_data/train.txt
Using model: Tacotron
Hyperparameters:
  allow_clipping_in_normalization: True
  attention_dim: 128
  attention_filters: 32
  attention_kernel: (31,)
  cleaners: english_cleaners
  decoder_layers: 2
  decoder_lstm_units: 1024
  embedding_dim: 512
  enc_conv_channels: 512
  enc_conv_kernel_size: (5,)
  enc_conv_num_layers: 3
  encoder_lstm_units: 256
  fft_size: 1024
  fmax: 7600
  fmin: 125
  frame_shift_ms: None
  griffin_lim_iters: 60
  hop_size: 256
  impute_finished: False
  input_type: mulaw-quantize
  log_scale_min: -32.23619130191664
  mask_encoder: False
  mask_finished: False
  max_abs_value: 4.0
  max_iters: 1000
  mel_normalization: True
  min_level_db: -100
  num_mels: 80
  outputs_per_step: 5
  postnet_channels: 512
  postnet_kernel_size: (5,)
  postnet_num_layers: 5
  power: 1.55
  prenet_layers: [256, 256]
  quantize_channels: 256
  ref_level_db: 20
  rescale: True
  rescaling_max: 0.999
  sample_rate: 22050
  silence_threshold: 2
  smoothing: False
  stop_at_any: True
  symmetric_mels: True
  tacotron_adam_beta1: 0.9
  tacotron_adam_beta2: 0.999
  tacotron_adam_epsilon: 1e-06
  tacotron_batch_size: 32
  tacotron_decay_learning_rate: True
  tacotron_decay_rate: 0.4
  tacotron_decay_steps: 50000
  tacotron_dropout_rate: 0.5
  tacotron_final_learning_rate: 1e-05
  tacotron_initial_learning_rate: 0.001
  tacotron_reg_weight: 1e-06
  tacotron_teacher_forcing_ratio: 1.0
  tacotron_zoneout_rate: 0.1
  trim_silence: True
Loaded metadata for 13100 examples (23.94 hours)
WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py:497: calling conv1d (from tensorflow.python.ops.nn_ops) with data_format=NHWC is deprecated and will be removed in a future version.
Instructions for updating:
`NHWC` for data_format is deprecated, use `NWC` instead
Initialized Tacotron model. Dimensions:
  embedding:                (?, ?, 512)
  enc conv out:             (?, ?, 512)
  encoder out:              (?, ?, 512)
  decoder out:              (?, ?, 80)
  residual out:             (?, ?, 512)
  projected residual out:   (?, ?, 80)
  mel out:                  (?, ?, 80)
  <stop_token> out:         (?, ?)
Loading checkpoint logs-Tacotron/pretrained/model.ckpt-21000

The jump data downloaded from TensorBoard is below; it contains 2 big jumps.

Step,loss,max_gradient_norm,regularization_loss,stop_token_loss,regularization_loss
12050,0.611843228,0.112030216,0.071008533,0.003464922,0.071008533
12060,0.739137053,0.110948727,0.070713565,0.008316508,0.070713565
12100,0.747518241,0.144256592,0.070642903,0.00296355,0.070642903
12110,0.740205586,0.132016793,0.070546977,0.006828221,0.070546977
12130,0.750570953,0.120335609,0.070430093,0.006633542,0.070430093
12160,0.759309351,0.116708465,0.070357963,0.003854881,0.070357963
12170,0.72936821,0.13562651,0.070002913,0.001883939,0.070002913
12220,0.725343168,0.091937497,0.069804862,0.001851804,0.069804862
12250,0.753307283,0.102390818,0.069650367,0.002984749,0.069650367
12270,0.735786557,0.109961003,0.08495605,0.027086332,0.08495605
12340,1.370556593,0.13626653,0.086065911,0.033558819,0.086065911
12380,1.248179197,0.175905153,0.085974246,0.011645812,0.085974246
12400,1.17056334,0.085426986,0.085835867,0.009806577,0.085835867
12420,1.168106675,0.147485837,0.085756667,0.009758449,0.085756667
12430,1.138239145,0.06291227,0.085670494,0.016407847,0.085670494
12440,1.136827826,0.10009516,0.085365139,0.009582978,0.085365139
12480,1.1210742,0.085175715,0.085282207,0.011879581,0.085282207
12490,1.102598429,0.122031979,0.085198745,0.006547623,0.085198745
12500,1.082708001,0.075820699,0.085109487,0.008799719,0.085109487
12510,1.085464001,0.091476515,0.084755346,0.011541827,0.084755346
12550,1.067399621,0.119814202,0.08466024,0.017261721,0.08466024
12560,1.096436739,0.108818218,0.084471397,0.008451841,0.084471397
12580,1.066828251,0.07031548,0.084384784,0.008230443,0.084384784
12590,1.082838297,0.13125442,0.084119253,0.010717451,0.084119253
12620,1.081726789,0.090116605,0.084031031,0.015666889,0.084031031
12630,1.065943003,0.149129018,0.083677426,0.009643696,0.083677426
12670,1.059858203,0.091585115,0.083583906,0.008071805,0.083583906
12680,1.027458668,0.112211913,0.08339978,0.009419976,0.08339978
12700,1.058202744,0.148930088,0.083149202,0.007165685,0.083149202
12730,1.027272344,0.08455649,0.083060525,0.008071801,0.083060525
12740,1.026104212,0.079256609,0.082881995,0.003858527,0.082881995
12760,1.011251092,0.085232109,0.082795501,0.004411653,0.082795501
12770,1.011392474,0.09608084,0.082541533,0.008189212,0.082541533
12800,1.035112619,0.126745135,0.082456142,0.004751964,0.082456142
12810,0.999344945,0.08664085,0.082374282,0.006347341,0.082374282
12820,0.973199248,0.106476478,0.082144141,0.0142597,0.082144141
12850,0.834679008,0.116564892,0.082070634,0.006195488,0.082070634
12860,1.00530076,0.10727413,0.081965744,0.005334616,0.081965744
12880,0.981494009,0.168403625,0.082580239,0.003246528,0.082580239
12970,0.915475607,0.131064907,0.08262796,0.005490544,0.08262796
13030,0.876445234,0.094742544,0.082579195,0.006532505,0.082579195
13050,0.860801637,0.111322284,0.082681589,0.003549797,0.082681589
13080,0.93151629,0.189117998,0.082817182,0.005221263,0.082817182
13100,0.871587694,0.115424655,0.082806811,0.002572627,0.082806811
13110,0.859839797,0.109444052,0.082691923,0.007954158,0.082691923
13130,0.864843786,0.16708982,0.082636535,0.004949848,0.082636535
13140,0.83871007,0.088102542,0.082574256,0.002441357,0.082574256
13150,0.830569804,0.089837939,0.08244849,0.003939637,0.08244849
13170,0.838312089,0.096750237,0.082376868,0.004068728,0.082376868
13180,0.822465658,0.12148872,0.082174703,0.003512312,0.082174703
13220,0.831818581,0.107799463,0.082144164,0.004382603,0.082144164
13230,0.851913929,0.131651953,0.082094096,0.002879821,0.082094096
13240,0.823227286,0.102097981,0.082055949,0.005599013,0.082055949
13260,0.815093637,0.125923991,0.081944421,0.009794,0.081944421
13280,0.768396795,0.335885733,0.081874028,0.004078984,0.081874028
13290,0.823944688,0.108403482,0.081652381,0.004680538,0.081652381
13320,0.797723889,0.109808676,0.081643693,0.002971901,0.081643693
13330,0.824786067,0.13506344,0.081543662,0.003521613,0.081543662
13360,0.8102265,0.089439601,0.081476733,0.005942946,0.081476733
13370,0.781555533,0.102970414,0.081402853,0.009310364,0.081402853
13380,0.844770789,0.286836654,0.081238426,0.003072675,0.081238426
13400,0.816966891,0.162778854,0.080991939,0.005731466,0.080991939
13430,0.833023131,0.142372489,0.080858141,0.002230477,0.080858141
13450,0.818431139,0.203611702,0.080572657,0.004789582,0.080572657
13490,0.805221379,0.10118077,0.080490671,0.003316968,0.080490671
13500,0.793731093,0.104873769,0.080359705,0.00268034,0.080359705
13520,0.790014744,0.127511352,0.080192327,0.002834486,0.080192327
13540,0.788462698,0.089527175,0.080110937,0.003922335,0.080110937
13550,0.787808657,0.119172759,0.079786733,0.002705841,0.079786733
13600,0.792235076,0.100692987,0.079464279,0.002676543,0.079464279
13650,0.792463362,0.177826181,0.079253837,0.00737846,0.079253837

15590,0.708203316,0.115900449,0.068603694,0.00226788,0.068542115
15620,0.729309797,0.113264784,0.068542115,0.011683479,0.068117537
15630,0.616081476,0.153299287,0.068117537,0.003710024,0.067929842
15710,0.714654982,0.114791445,0.067929842,0.00419116,0.067881018
15750,0.711220622,0.106338121,0.067881018,0.006002362,0.067761898
15760,0.604212523,0.131355032,0.067761898,0.002348174,0.067716703
15780,0.736393631,0.326502442,0.067716703,0.002861926,0.067617752
15790,0.698725879,0.0972858,0.067617752,0.003370562,0.067564443
15810,0.722059429,0.106332257,0.067564443,0.003554587,0.067521259
15820,0.737670839,0.117765576,0.067521259,0.002839184,0.067429639
15830,0.722741961,0.125151366,0.067429639,0.004136955,0.067383021
15850,0.714020669,0.150361776,0.067383021,0.003151822,0.067320548
15860,0.720853686,0.096505545,0.067320548,0.006715457,0.067154162
15870,0.718049228,0.150164217,0.067154162,0.002279772,0.067020714
15900,0.732673407,0.097184338,0.067020714,0.005029981,0.066895396
15930,0.731613338,0.120067552,0.066895396,0.00618826,0.066797532
15960,0.634255528,0.140578613,0.066797532,0.002298817,0.066705465
15980,0.724768996,0.165089563,0.066705465,0.002176684,0.066586114
16000,0.715796232,0.175387964,0.066586114,0.002866265,0.066532493
16020,0.747617543,0.099989027,0.066532493,0.003406355,0.066477537
16030,0.708664834,0.108932577,0.066477537,0.003913892,0.066422559
16040,0.743779182,0.098615795,0.066422559,0.004818851,0.066307664
16050,0.715792,0.210577741,0.066307664,0.002957923,0.067704134
16070,0.72688669,0.23847352,0.067704134,0.039474022,0.07627628
16090,1.000308633,0.47720021,0.07627628,0.014406082,0.077275805
16120,1.253042102,0.20134142,0.077275805,0.022394232,0.077792622
16130,1.188810349,0.206852481,0.077792622,0.011859883,0.077799551
16150,1.105454922,0.118017778,0.077799551,0.011414431,0.077764615
16160,1.08352685,0.175189823,0.077764615,0.011442578,0.077717885
16170,1.122390747,0.276549995,0.077717885,0.008220224,0.077594154
16180,1.040466666,0.149369702,0.077594154,0.00889876,0.077525482
16200,1.041304946,0.125025764,0.077525482,0.009072677,0.077460811
16210,0.977941871,0.121666402,0.077460811,0.006492055,0.077379569
16220,1.011223674,0.102190353,0.077379569,0.012126213,0.077291399
16230,0.97561717,0.137912512,0.077291399,0.007127115,0.077127345
16240,1.009298325,0.168408766,0.077127345,0.009019298,0.077075161
16260,0.989142478,0.141399339,0.077075161,0.007226328,0.077240445
16270,0.946738422,0.174237818,0.077240445,0.005271308,0.077396065
16290,0.933232069,0.126569822,0.077396065,0.006312555,0.077390209
16320,0.8707304,0.169570521,0.077390209,0.003728667,0.077758826
16330,0.841205418,0.109692238,0.077758826,0.008199131,0.078195244
16350,0.966707587,0.132055283,0.078195244,0.005523407,0.078208193
16380,0.846445084,0.110721081,0.078208193,0.005482097,0.078140058
16390,0.858823776,0.135719448,0.078140058,0.002754007,0.078078449
16420,0.814995527,0.099246465,0.078078449,0.005281731,0.078032844
16430,0.807288051,0.139662236,0.078032844,0.002667563,0.078056589
16440,0.805120349,0.116086289,0.078056589,0.003179136,0.07803835
16460,0.817798376,0.09544038,0.07803835,0.002964768,0.078031875
16480,0.813851893,0.101294272,0.078031875,0.005263099,0.077885643
16490,0.794899344,0.107104316,0.077885643,0.003799665,0.077831604
16530,0.789055645,0.161296904,0.077831604,0.002449155,0.077621832
16540,0.800640821,0.104023106,0.077621832,0.005187268,0.077583879
16570,0.805860043,0.129499629,0.077583879,0.003462702,0.077338338
16580,0.816913962,0.196018636,0.077338338,0.003640001,0.077270277
16630,0.783132374,0.130431846,0.077270277,0.003855656,0.077037744
16640,0.782408416,0.114493221,0.077037744,0.008940279,0.076960973
16700,0.789887309,0.114049576,0.076960973,0.003087203,0.076895684
16720,0.771683455,0.132404074,0.076895684,0.003073983,0.076762132
16730,0.759126782,0.099398829,0.076762132,0.003220459,0.076695405
16750,0.756430328,0.11897333,0.076695405,0.008053106,0.076617695
16760,0.717152596,0.139100939,0.076617695,0.00336361,0.076426379

And I tried stopping and evaluating the 32 wavs: the echo is very loud (is it a mel or alignment issue?) but the tone seems good.

Rayhane-mamah commented 6 years ago

It's getting crowded in here! At first sight I'm not seeing gradient peaks...

I'll try reconstructing the plot later when I get home and look for gradient peaks. We'll see how it goes after that.

Sorry for the inconvenience, I'll come back to you as soon as possible.

Rayhane-mamah commented 6 years ago

Hello again,

After checking those plots, there is clearly some loss explosion around 16k+. I also expected the gradients to be way bigger than just 0.4... I am assuming that the "before" mel loss is the one that exploded.

I am actively trying to find a solution for this; I also want to point out that I couldn't reproduce it. I'll keep you informed.

Thank you for your patience.

Rayhane-mamah commented 6 years ago

Sorry, closed by accident.

butterl commented 6 years ago

@Rayhane-mamah thanks for your help again. I checked Google's paper; it seems they use WaveNet to generate the wav from the mel spectrogram, while the model here uses Griffin-Lim in the code (which seems to affect audio quality a lot). Could this (https://github.com/r9y9/wavenet_vocoder) be used for evaluation (using the mel files generated from eval as input, together with their pretrained WaveNet model)?

I checked the samples (https://twidddj.github.io/docs/vocoder/) from @twidddj's post. It seems that with WaveNet output the audio quality is very good compared to keithito's (he uses _griffin_lim too).

If you have some idea how to evaluate it, I'll do the test and share it.

Rayhane-mamah commented 6 years ago

With the Griffin-Lim inversion algorithm, we only reconstruct the phase roughly, and it usually gives noisy audio, especially if we are inverting mel spectrograms directly.

In my repo, I am only using Griffin-Lim to check pronunciation correctness and get a quick overview of the linguistic features the model has learned; WaveNet will improve audio quality afterwards. I am trying to release the WaveNet vocoder in this repo as soon as possible.
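
As a rough illustration of that pipeline (not the repo's own implementation), recent librosa versions (0.7+) expose both the mel-to-linear approximation and Griffin-Lim directly; the noisiness comes from the iterative phase estimate:

import numpy as np
import librosa
import soundfile as sf

sr = 22050
t = np.linspace(0, 2.0, int(sr * 2.0), endpoint=False)
y = 0.5 * np.sin(2 * np.pi * 220.0 * t)   # toy waveform standing in for model output

# Forward: waveform -> mel spectrogram (what Tacotron predicts).
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=1024, hop_length=256, n_mels=80)

# Inverse: approximate mel -> linear magnitude, then Griffin-Lim to estimate the phase.
lin = librosa.feature.inverse.mel_to_stft(mel, sr=sr, n_fft=1024)
wav = librosa.griffinlim(lin, n_iter=60, hop_length=256)

sf.write('griffin_lim_output.wav', wav, sr)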


Rayhane-mamah commented 6 years ago

Hello again @butterl,

So after doing some research, it turned out that regularization was causing those peaks.

Because the outputs belong to [-4, 4], the weights need to get somewhat "big" to be able to predict values in this range. L2 regularization, however, penalizes large-valued parameters, so every now and then the regularization loss was "exploding" and messing with the gradients.
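
As a rough sketch of that interaction (toy tensors in TF 1.x style, with placeholder loss values; not the exact repo code), the total loss adds an L2 penalty over all weights scaled by tacotron_reg_weight, so when the weights grow to cover the [-4, 4] output range the penalty term can swing the total loss on its own:

import tensorflow as tf

reg_weight = 1e-6   # corresponds to the tacotron_reg_weight hparam shown above

# Toy stand-ins for the model's trainable weights and the individual losses.
w1 = tf.Variable(tf.random_normal([512, 80]))
w2 = tf.Variable(tf.random_normal([512, 512]))
before_loss = tf.constant(0.05)        # decoder mel loss (placeholder value)
after_loss = tf.constant(0.04)         # postnet mel loss (placeholder value)
stop_token_loss = tf.constant(0.003)   # placeholder value

# The L2 penalty grows with the squared magnitude of the parameters, so large
# weights make this term (and its gradient) balloon.
regularization = reg_weight * tf.add_n([tf.nn.l2_loss(v) for v in [w1, w2]])

total_loss = before_loss + after_loss + stop_token_loss + regularization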

Things should be fixed now: [loss curve plot]

This is fixed in this commit (2b444e29bc3f949b831692563d62a80bc73440fb)

butterl commented 6 years ago

@Rayhane-mamah I tried the new version; the training speed slowed down from 7 sec/step to 11 sec/step on my CPU.

And I borrowed a GTX 1060 6GB on Windows 7 64-bit. tacotron_batch_size = 16 runs OK (OOM at 32), about 7 sec/step from the beginning, and the step time goes up over time (generating a batch takes 25 sec every 16 steps).

Rayhane-mamah commented 6 years ago

That slowdown is due to the feeder; you can use the previous version's feeder if you like, which will bring the speed back to normal. For GPU usage this is not a problem because the feeder is always placed on the CPU.


butterl commented 6 years ago

Already got 2 P100s, so closing this.