TensorSpeech / TensorFlowTTS

TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for TensorFlow 2 (supports English, French, Korean, Chinese, and German; easy to adapt to other languages)
https://tensorspeech.github.io/TensorFlowTTS/
Apache License 2.0

Tacotron2 training error (Esperanto) #692

Closed: riproskaie closed this issue 3 years ago

riproskaie commented 3 years ago

I constructed a dataset of Esperanto speech under the name "koreto." I'd like to make it work on TensorFlowTTS. My forked repository is available here: https://github.com/riproskaie/TensorFlowTTS

I created/modified the following files:

Afterwards, I tried training Tacotron2 by running:
!python ./examples/tacotron2/train_tacotron2.py --train-dir ./dump_koreto/train/ --dev-dir ./dump_koreto/valid/ --outdir ./examples/tacotron2/exp/train.tacotron2.koreto.v1/ --config ./examples/tacotron2/conf/tacotron2.koreto.v1.yaml --use-norm 1 --mixed_precision 0 --resume ""

And then I get this error:

2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: hop_size = 256
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: format = npy
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: model_type = tacotron2
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: tacotron2_params = {'dataset': 'koreto', 'embedding_hidden_size': 512, 'initializer_range': 0.02, 'embedding_dropout_prob': 0.1, 'n_speakers': 1, 'n_conv_encoder': 5, 'encoder_conv_filters': 512, 'encoder_conv_kernel_sizes': 5, 'encoder_conv_activation': 'relu', 'encoder_conv_dropout_rate': 0.5, 'encoder_lstm_units': 256, 'n_prenet_layers': 2, 'prenet_units': 256, 'prenet_activation': 'relu', 'prenet_dropout_rate': 0.5, 'n_lstm_decoder': 1, 'reduction_factor': 1, 'decoder_lstm_units': 1024, 'attention_dim': 128, 'attention_filters': 32, 'attention_kernel': 31, 'n_mels': 80, 'n_conv_postnet': 5, 'postnet_conv_filters': 512, 'postnet_conv_kernel_sizes': 5, 'postnet_dropout_rate': 0.1, 'attention_type': 'lsa'}
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: batch_size = 8
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: remove_short_samples = True
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: allow_cache = True
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: mel_length_threshold = 32
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: is_shuffle = True
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: use_fixed_shapes = True
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: optimizer_params = {'initial_learning_rate': 0.001, 'end_learning_rate': 1e-05, 'decay_steps': 150000, 'warmup_proportion': 0.02, 'weight_decay': 0.001}
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: gradient_accumulation_steps = 1
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: var_train_expr = None
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: train_max_steps = 200000
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: save_interval_steps = 2000
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: eval_interval_steps = 500
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: log_interval_steps = 200
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: start_schedule_teacher_forcing = 200001
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: start_ratio_value = 0.5
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: schedule_decay_steps = 50000
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: end_ratio_value = 0.0
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: num_save_intermediate_results = 1
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: train_dir = ./dump_koreto/train/
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: dev_dir = ./dump_koreto/valid/
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: use_norm = True
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: outdir = ./examples/tacotron2/exp/train.tacotron2.koreto.v1/
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: config = ./examples/tacotron2/conf/tacotron2.koreto.v1.yaml
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: resume = 
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: verbose = 1
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: mixed_precision = False
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: pretrained = 
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: version = 0.0
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: max_mel_length = 2173
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: max_char_length = 191
2021-10-24 01:01:26,798 (ag_logging:146) WARNING: AutoGraph could not transform <bound method CharactorMelDataset._load_data of <tensorflow.python.eager.function.TfMethodTarget object at 0x000001EC039FC240>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module 'gast' has no attribute 'Index'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
2021-10-24 01:01:07.028137: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2021-10-24 01:01:13.952536: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2021-10-24 01:01:14.550550: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce GTX 1050 computeCapability: 6.1
coreClock: 1.493GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 104.43GiB/s
2021-10-24 01:01:14.550612: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2021-10-24 01:01:14.713712: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2021-10-24 01:01:14.758645: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2021-10-24 01:01:14.762322: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2021-10-24 01:01:14.799377: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2021-10-24 01:01:14.803781: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2021-10-24 01:01:14.865158: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2021-10-24 01:01:15.023366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2021-10-24 01:01:21.583264: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-24 01:01:21.941158: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1ec73a88f30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-10-24 01:01:21.941212: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-10-24 01:01:22.420068: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce GTX 1050 computeCapability: 6.1
coreClock: 1.493GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 104.43GiB/s
2021-10-24 01:01:22.420149: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2021-10-24 01:01:22.420182: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2021-10-24 01:01:22.420213: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2021-10-24 01:01:22.420243: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2021-10-24 01:01:22.420274: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2021-10-24 01:01:22.420305: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2021-10-24 01:01:22.420338: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2021-10-24 01:01:22.420458: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2021-10-24 01:01:23.495539: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-24 01:01:23.495569: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0 
2021-10-24 01:01:23.495586: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N 
2021-10-24 01:01:23.495996: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1329 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
2021-10-24 01:01:23.591156: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1ec10931780 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-10-24 01:01:23.591193: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce GTX 1050, Compute Capability 6.1
WARNING:tensorflow:AutoGraph could not transform <bound method CharactorMelDataset._load_data of <tensorflow.python.eager.function.TfMethodTarget object at 0x000001EC039FC240>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module 'gast' has no attribute 'Index'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
Traceback (most recent call last):
  File "./examples/tacotron2/train_tacotron2.py", line 513, in <module>
    main()
  File "./examples/tacotron2/train_tacotron2.py", line 428, in main
    * config["gradient_accumulation_steps"],
  File ".\examples\tacotron2\tacotron_dataset.py", line 186, in create
    tf.data.experimental.AUTOTUNE
  File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 1702, in map
    preserve_cardinality=True)
  File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 4084, in __init__
    use_legacy_function=use_legacy_function)
  File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 3371, in __init__
    self._function = wrapper_fn.get_concrete_function()
  File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py", line 2939, in get_concrete_function
    *args, **kwargs)
  File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py", line 2906, in _get_concrete_function_garbage_collected
    graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
  File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py", line 3213, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py", line 3075, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\func_graph.py", line 986, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 3364, in wrapper_fn
    ret = _wrapper_helper(*args)
  File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 3299, in _wrapper_helper
    ret = autograph.tf_convert(func, ag_ctx)(*nested_args)
  File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\autograph\impl\api.py", line 258, in wrapper
    raise e.ag_error_metadata.to_exception(e)
TypeError: in user code:

    .\examples\tacotron2\tacotron_dataset.py:185 None  *
        lambda items: self._load_data(items),
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\def_function.py:780 __call__  **
        result = self._call(*args, **kwds)
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\def_function.py:823 _call
        self._initialize(args, kwds, add_initializers_to=initializers)
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\def_function.py:697 _initialize
        *args, **kwds))
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py:2855 _get_concrete_function_internal_garbage_collected
        graph_function, _, _ = self._maybe_define_function(args, kwargs)
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py:3213 _maybe_define_function
        graph_function = self._create_graph_function(args, kwargs)
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py:3075 _create_graph_function
        capture_by_value=self._capture_by_value),
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\func_graph.py:986 func_graph_from_py_func
        func_outputs = python_func(*func_args, **func_kwargs)
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\def_function.py:600 wrapped_fn
        return weak_wrapped_fn().__wrapped__(*args, **kwds)
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py:3735 bound_method_wrapper
        return wrapped_fn(*args, **kwargs)
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\func_graph.py:969 wrapper
        user_requested=True,
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py:3697 call  **
        return wrapped_fn(self.weakrefself_target__(), *args, **kwargs)
    .\examples\tacotron2\tacotron_dataset.py:130 _load_data
        mel_length = len(mel)
    C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py:853 __len__
        "shape information.".format(self.name))

    TypeError: len is not well defined for symbolic Tensors. (PyFunc:0) Please call `x.shape` rather than `len(x)` for shape information.

Training works perfectly fine with the other predefined datasets (e.g. kss, libritts). I wonder whether the problem is related to the size of my dataset ("koreto"), which is quite small (less than 1 hour in total), to my TensorFlow version (2.3), or to my CUDA version (11.3). Could it instead be an incompatibility with the existing tacotron2-100k.h5, or with the accented characters unique to Esperanto (e.g., ĉ, ĝ, ĥ, ĵ, ŝ, ŭ)?

What should I do to make it work?
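
The final TypeError points at `mel_length = len(mel)` in `_load_data` (tacotron_dataset.py:130). Since the log also says AutoGraph could not transform that method, the likely reading is that Python's built-in `len()` ends up being called directly on a symbolic tensor. As a minimal sketch of the alternative the error message itself suggests, the snippet below reads the length with `tf.shape` inside a `tf.data` map. `fake_load`, `load_item`, and the (n, 80) array shape are hypothetical stand-ins for illustration, not the repository's code.

```python
import numpy as np
import tensorflow as tf


def fake_load(n):
    # Hypothetical stand-in for loading a mel file: returns an (n, 80) float32 array.
    return np.zeros((n, 80), dtype=np.float32)


def load_item(n):
    # tf.numpy_function returns a symbolic tensor of unknown shape in graph mode,
    # so Python's len() cannot be used on it.
    mel = tf.numpy_function(fake_load, [n], tf.float32)
    # tf.shape is evaluated at run time and works on symbolic tensors.
    mel_length = tf.shape(mel)[0]
    return mel_length


frame_counts = np.array([3, 5, 7], dtype=np.int64)
dataset = tf.data.Dataset.from_tensor_slices(frame_counts).map(load_item)
lengths = [int(x) for x in dataset]
print(lengths)
```

This form does not rely on AutoGraph rewriting `len()`, so it behaves the same whether or not the conversion succeeds.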

riproskaie commented 3 years ago

I solved the problem by running it on TF 2.4, so I'll close this issue.
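
The AutoGraph warning in the log (`module 'gast' has no attribute 'Index'`) usually indicates that the installed gast package does not match the version this TensorFlow build was written against, which may be why switching TensorFlow versions helped. The check below is offered as a sketch rather than a confirmed diagnosis; `gast_has_index` is a hypothetical helper name, not part of TensorFlow or TensorFlowTTS.

```python
def gast_has_index():
    """Return True/False when gast is importable, or None when it is not installed."""
    try:
        import gast
    except ImportError:
        return None
    # The warning in the log is raised when gast.Index is looked up and is missing.
    return hasattr(gast, "Index")


status = gast_has_index()
if status is None:
    print("gast is not installed")
elif status:
    print("gast.Index is available")
else:
    print("gast.Index is missing; the AutoGraph warning is expected")
```

If the attribute is reported missing, installing the gast version that your TensorFlow release declares in its requirements is one thing to try before changing the dataset code.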