[SOLVED] It is not training

Jinex2012 commented 6 years ago

This is pretty weird.

The attention plot is also blank:

[Image: alignment_009k]

I restarted again, same issue 😕

The synthesised audio is blank. Each sentence produces an audio sample of 10 seconds of silence.

bprabhakar commented 6 years ago

Yup, facing the same issue. I'm trying to train on a different speech dataset though and the attention plot is blank even after 200k steps. Any help would be appreciated.

bprabhakar commented 6 years ago

I think the issue might be with the TensorFlow version. @Jinex2012, can you tell me which TF version you are using? It's failing for me on v1.8.

Jinex2012 commented 6 years ago

@bprabhakar I am using v1.5

NikhilReddy101995 commented 6 years ago

Hi, I am also facing the same issue. Did you guys find the cause? Any help would be appreciated.

Jinex2012 commented 6 years ago

Still haven't figured it out.

jemoal commented 6 years ago

Has anyone found a solution?

jemoal commented 6 years ago

@Jinex2012 and @bprabhakar, are you running the code with TensorFlow on CPU or GPU? I noticed that the author places the queues on the CPU (the get_batch() function in 'load_data.py'). I do not know whether this is causing the issue. I have modified the code to put the queues in GPU memory, but I get other errors that I am still trying to solve.

bprabhakar commented 6 years ago

So it was weird in my case. I was training using Tensorflow 1.8 (GPU) when I faced this issue. Basically when I tried printing the attention values that are being plotted here, they were becoming NaNs in a couple of steps. Very strangely, exactly the same piece of code worked perfectly fine when I switched to a different system with Tensorflow 1.7 (GPU). Perhaps some sorta numerical underflow/overflow?
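A minimal sketch of the kind of check described above, for anyone who wants to confirm that a blank plot is caused by NaNs rather than by the plotting code. The function names and the `alignments_by_step` dictionary are hypothetical and not part of the repository; the sketch assumes you can dump the attention matrix as rows of floats at each logged step. With NumPy arrays, `np.isnan(alignment).any()` does the same job.

```python
import math


def nan_fraction(alignment):
    """Return the fraction of NaN entries in a 2-D attention matrix.

    `alignment` is any nested sequence of numbers (rows of attention weights).
    An empty matrix is reported as 0.0.
    """
    values = [float(v) for row in alignment for v in row]
    if not values:
        return 0.0
    return sum(math.isnan(v) for v in values) / len(values)


def first_nan_step(alignments_by_step):
    """Return the earliest step whose alignment contains a NaN, or None.

    `alignments_by_step` is a hypothetical dict mapping a global step
    number to the attention matrix dumped at that step.
    """
    for step in sorted(alignments_by_step):
        if nan_fraction(alignments_by_step[step]) > 0:
            return step
    return None


if __name__ == "__main__":
    # Toy data: the second logged step already contains a NaN.
    history = {
        1: [[0.5, 0.5], [0.9, 0.1]],
        2: [[0.5, float("nan")], [0.9, 0.1]],
    }
    print("first step with NaN:", first_nan_step(history))
```

If the first NaN shows up within a few steps of starting, as reported here, the problem is more likely numerical or version-related than a matter of training longer.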

jemoal commented 6 years ago

Thanks @bprabhakar, I will try this.

jemoal commented 6 years ago

Hi again @bprabhakar. As in your case, the code works well with TF 1.7 (GPU). I think the problem is related to a specific kind of data used in the definition of 'conv1D'.

Thanks.

Bye, @jemoal

arogozhnikov commented 6 years ago

@bprabhakar thanks for noting. Indeed, I have 1.8 and it diverges for me whatever I do.

arogozhnikov commented 6 years ago

It also works with TensorFlow 1.9, but not with TensorFlow 1.8. I was not able to find anything relevant in the changelog.

Jinex2012 commented 6 years ago

Thanks for the update. Maybe @Kyubyong can add something in the README.md about telling people not to use tensorflow 1.8?
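Until such a note exists, a guard at the top of the training script could serve the same purpose. This is only a sketch: the function names and the `SUSPECT_SERIES` set are hypothetical, and flagging 1.8.x rests solely on the reports in this thread, which other commenters could not reproduce on every setup.

```python
import re
import warnings

# Assumption: only the 1.8.x series is flagged, based on reports in this thread.
SUSPECT_SERIES = {(1, 8)}


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '1.8.0' or '1.8.0-rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_suspect_tf_version(version):
    """Return True if the version belongs to a series reported as problematic."""
    return parse_major_minor(version) in SUSPECT_SERIES


def warn_if_suspect(version):
    """Emit a warning for a suspect version; return whether one was emitted."""
    if is_suspect_tf_version(version):
        warnings.warn(
            f"TensorFlow {version} was reported to produce NaN attention "
            "values with this code; 1.7 and 1.9 were reported to work."
        )
        return True
    return False


if __name__ == "__main__":
    # In a real script you would pass tf.__version__ instead of a literal.
    print(warn_if_suspect("1.8.0"))
```

Warning rather than aborting seems the safer choice here, since the thread does not establish that 1.8 fails everywhere or that other versions always work.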

Jinex2012 commented 6 years ago

I guess we can close the issue?

arogozhnikov commented 6 years ago

@Jinex2012 no, I would have been stuck for days if I hadn't seen this thread.

arogozhnikov commented 6 years ago

OK, the problem seems to be in conv2d_transposed, according to this thread: https://github.com/tensorflow/tensorflow/issues/19200

redoc700 commented 6 years ago

I updated TensorFlow from version 1.8 to version 1.9, but I am still getting a blank attention graph. Any help will be appreciated!

shamidreza commented 5 years ago

I tried versions 1.3 to 1.9, on both GPU and CPU; all produced a black (NaN-filled) attention graph. Any solution?

wanshun123 commented 5 years ago

Training for me is working using the latest version of tensorflow-gpu (1.12 as of November 2018) installed through conda (installing the latest versions via pip failed with core aborted or illegal instruction errors). I first uninstalled everything:

pip uninstall tensorflow
pip uninstall tensorboard
pip uninstall tensorflow-gpu

And installed afresh through conda as follows:

conda create -n tensorflow
conda install tensorflow-gpu -n tensorflow

gorkemgoknar commented 5 years ago

I am trying to test with some 10-50 wav files (Turkish text) and a reduced number of hidden units, but I am failing to train on custom values (on Mac CPU). I tried TensorFlow 1.8, 1.9 and 1.12 (CPU), but I still get an empty attention graph. When I check through synthesise, passing some test texts, the graph values are all NaN (I noticed librosa giving errors on istft). Has anyone been able to solve this problem (on CPU, not GPU)?

DavidC001 commented 4 years ago

@gorkemgoknar have you solved it? I am facing the same issue.

queries01 commented 4 years ago

@gorkemgoknar did you fix that error?

queries01 commented 4 years ago

I am training on CPU, but the attention plot comes out blank. How can I resolve it? I am on TF 1.15.0, and the machine is powerful: 48 cores and 128 GB of RAM.

gorkemgoknar commented 4 years ago

@queries01 Nope, I was not able to fix it. But using a CPU for this is a waste of time. I recommend you try Google Colab or Kaggle Kernels first (if you do not have a GPU).

queries01 commented 4 years ago

@gorkemgoknar Colab and Kaggle kernels are very slow. I now have a physical server, so I want to train on it :)

Kyubyong / dc_tts

[SOLVED] It is not training #21