TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported including English, French, Korean, Chinese, German and Easy to adapt for other languages)
I constructed a dataset of Esperanto speech under the name "koreto." I'd like to make it work on TensorFlowTTS. My forked repository is available here: https://github.com/riproskaie/TensorFlowTTS
I created/modified the following files:
tensorflow_tts/bin/preprocess.py
tensorflow_tts/processor/koreto.py
tensorflow_tts/processor/__init__.py
tensorflow_tts/configs/tacotron2.py
tensorflow_tts/inference/auto_processor.py
examples/tacotron2/conf/tacotron2.koreto.v1.yaml
preprocess/koreto_preprocess.yaml
setup.py
Afterwards, I tried training Tacotron2 by running:
!python ./examples/tacotron2/train_tacotron2.py --train-dir ./dump_koreto/train/ --dev-dir ./dump_koreto/valid/ --outdir ./examples/tacotron2/exp/train.tacotron2.koreto.v1/ --config ./examples/tacotron2/conf/tacotron2.koreto.v1.yaml --use-norm 1 --mixed_precision 0 --resume ""
I then got this error:
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: hop_size = 256
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: format = npy
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: model_type = tacotron2
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: tacotron2_params = {'dataset': 'koreto', 'embedding_hidden_size': 512, 'initializer_range': 0.02, 'embedding_dropout_prob': 0.1, 'n_speakers': 1, 'n_conv_encoder': 5, 'encoder_conv_filters': 512, 'encoder_conv_kernel_sizes': 5, 'encoder_conv_activation': 'relu', 'encoder_conv_dropout_rate': 0.5, 'encoder_lstm_units': 256, 'n_prenet_layers': 2, 'prenet_units': 256, 'prenet_activation': 'relu', 'prenet_dropout_rate': 0.5, 'n_lstm_decoder': 1, 'reduction_factor': 1, 'decoder_lstm_units': 1024, 'attention_dim': 128, 'attention_filters': 32, 'attention_kernel': 31, 'n_mels': 80, 'n_conv_postnet': 5, 'postnet_conv_filters': 512, 'postnet_conv_kernel_sizes': 5, 'postnet_dropout_rate': 0.1, 'attention_type': 'lsa'}
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: batch_size = 8
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: remove_short_samples = True
2021-10-24 01:01:26,665 (train_tacotron2:421) INFO: allow_cache = True
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: mel_length_threshold = 32
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: is_shuffle = True
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: use_fixed_shapes = True
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: optimizer_params = {'initial_learning_rate': 0.001, 'end_learning_rate': 1e-05, 'decay_steps': 150000, 'warmup_proportion': 0.02, 'weight_decay': 0.001}
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: gradient_accumulation_steps = 1
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: var_train_expr = None
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: train_max_steps = 200000
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: save_interval_steps = 2000
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: eval_interval_steps = 500
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: log_interval_steps = 200
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: start_schedule_teacher_forcing = 200001
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: start_ratio_value = 0.5
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: schedule_decay_steps = 50000
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: end_ratio_value = 0.0
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: num_save_intermediate_results = 1
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: train_dir = ./dump_koreto/train/
2021-10-24 01:01:26,666 (train_tacotron2:421) INFO: dev_dir = ./dump_koreto/valid/
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: use_norm = True
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: outdir = ./examples/tacotron2/exp/train.tacotron2.koreto.v1/
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: config = ./examples/tacotron2/conf/tacotron2.koreto.v1.yaml
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: resume =
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: verbose = 1
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: mixed_precision = False
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: pretrained =
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: version = 0.0
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: max_mel_length = 2173
2021-10-24 01:01:26,667 (train_tacotron2:421) INFO: max_char_length = 191
2021-10-24 01:01:26,798 (ag_logging:146) WARNING: AutoGraph could not transform <bound method CharactorMelDataset._load_data of <tensorflow.python.eager.function.TfMethodTarget object at 0x000001EC039FC240>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module 'gast' has no attribute 'Index'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
2021-10-24 01:01:07.028137: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2021-10-24 01:01:13.952536: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2021-10-24 01:01:14.550550: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce GTX 1050 computeCapability: 6.1
coreClock: 1.493GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 104.43GiB/s
2021-10-24 01:01:14.550612: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2021-10-24 01:01:14.713712: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2021-10-24 01:01:14.758645: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2021-10-24 01:01:14.762322: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2021-10-24 01:01:14.799377: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2021-10-24 01:01:14.803781: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2021-10-24 01:01:14.865158: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2021-10-24 01:01:15.023366: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2021-10-24 01:01:21.583264: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-10-24 01:01:21.941158: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1ec73a88f30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-10-24 01:01:21.941212: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-10-24 01:01:22.420068: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:01:00.0 name: NVIDIA GeForce GTX 1050 computeCapability: 6.1
coreClock: 1.493GHz coreCount: 5 deviceMemorySize: 2.00GiB deviceMemoryBandwidth: 104.43GiB/s
2021-10-24 01:01:22.420149: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2021-10-24 01:01:22.420182: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2021-10-24 01:01:22.420213: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2021-10-24 01:01:22.420243: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2021-10-24 01:01:22.420274: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2021-10-24 01:01:22.420305: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2021-10-24 01:01:22.420338: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2021-10-24 01:01:22.420458: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2021-10-24 01:01:23.495539: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-10-24 01:01:23.495569: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2021-10-24 01:01:23.495586: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2021-10-24 01:01:23.495996: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1329 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
2021-10-24 01:01:23.591156: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1ec10931780 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-10-24 01:01:23.591193: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA GeForce GTX 1050, Compute Capability 6.1
WARNING:tensorflow:AutoGraph could not transform <bound method CharactorMelDataset._load_data of <tensorflow.python.eager.function.TfMethodTarget object at 0x000001EC039FC240>> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module 'gast' has no attribute 'Index'
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
Traceback (most recent call last):
File "./examples/tacotron2/train_tacotron2.py", line 513, in <module>
main()
File "./examples/tacotron2/train_tacotron2.py", line 428, in main
* config["gradient_accumulation_steps"],
File ".\examples\tacotron2\tacotron_dataset.py", line 186, in create
tf.data.experimental.AUTOTUNE
File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 1702, in map
preserve_cardinality=True)
File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 4084, in __init__
use_legacy_function=use_legacy_function)
File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 3371, in __init__
self._function = wrapper_fn.get_concrete_function()
File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py", line 2939, in get_concrete_function
*args, **kwargs)
File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py", line 2906, in _get_concrete_function_garbage_collected
graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py", line 3213, in _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py", line 3075, in _create_graph_function
capture_by_value=self._capture_by_value),
File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\func_graph.py", line 986, in func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 3364, in wrapper_fn
ret = _wrapper_helper(*args)
File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 3299, in _wrapper_helper
ret = autograph.tf_convert(func, ag_ctx)(*nested_args)
File "C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\autograph\impl\api.py", line 258, in wrapper
raise e.ag_error_metadata.to_exception(e)
TypeError: in user code:
.\examples\tacotron2\tacotron_dataset.py:185 None *
lambda items: self._load_data(items),
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\def_function.py:780 __call__ **
result = self._call(*args, **kwds)
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\def_function.py:823 _call
self._initialize(args, kwds, add_initializers_to=initializers)
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\def_function.py:697 _initialize
*args, **kwds))
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py:2855 _get_concrete_function_internal_garbage_collected
graph_function, _, _ = self._maybe_define_function(args, kwargs)
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py:3213 _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py:3075 _create_graph_function
capture_by_value=self._capture_by_value),
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\func_graph.py:986 func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\def_function.py:600 wrapped_fn
return weak_wrapped_fn().__wrapped__(*args, **kwds)
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py:3735 bound_method_wrapper
return wrapped_fn(*args, **kwargs)
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\func_graph.py:969 wrapper
user_requested=True,
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\eager\function.py:3697 call **
return wrapped_fn(self.weakrefself_target__(), *args, **kwargs)
.\examples\tacotron2\tacotron_dataset.py:130 _load_data
mel_length = len(mel)
C:\Users\User\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py:853 __len__
"shape information.".format(self.name))
TypeError: len is not well defined for symbolic Tensors. (PyFunc:0) Please call `x.shape` rather than `len(x)` for shape information.
Training works perfectly fine with the predefined datasets (e.g., kss, libritts). I wonder whether the cause is the size of my dataset ("koreto"), which is small (less than 1 hour in total), my TensorFlow version (2.3), or my CUDA version (11.3). Could it instead be an incompatibility with the existing tacotron2-100k.h5, or the accented characters unique to Esperanto (ĉ, ĝ, ĥ, ĵ, ŝ, ŭ)?
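One way to rule the accented characters in or out is to check that every character in the transcripts is covered by the processor's symbol list. The sketch below is only an illustration: `ESPERANTO_LETTERS`, `SYMBOLS`, and `missing_symbols` are hypothetical names and are not taken from `koreto.py` or from TensorFlowTTS. It also normalizes text to NFC, on the assumption that some transcripts might store "ĉ" as "c" plus a combining circumflex.

```python
import unicodedata

# Hypothetical symbol inventory for illustration only; the real list
# lives in the processor module and may differ.
ESPERANTO_LETTERS = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
SYMBOLS = ["pad"] + list(" !',.?-") + list(ESPERANTO_LETTERS) + ["eos"]


def missing_symbols(text, symbols):
    """Return the sorted characters of `text` that are not in `symbols`.

    The text is lowercased and NFC-normalized first, so a letter written
    as base character + combining accent is compared in composed form.
    """
    normalized = unicodedata.normalize("NFC", text.lower())
    known = set(symbols)
    return sorted({ch for ch in normalized if ch not in known})


if __name__ == "__main__":
    # An empty list means every character is covered by the inventory.
    print(missing_symbols("Ĉu vi parolas Esperanton?", SYMBOLS))
    print(missing_symbols("ŝi qx", SYMBOLS))
```

If this kind of check reports nothing missing for the whole metadata file, the accented characters are less likely to be the cause, although it says nothing about the `len(mel)` failure itself.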
What should I do to make it work?
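The log contains two messages about the same function: AutoGraph reports that it could not transform `CharactorMelDataset._load_data` because `module 'gast' has no attribute 'Index'`, and the final `TypeError` is raised at `len(mel)` inside `_load_data`. One possible reading, which is an assumption rather than something the log confirms, is that the two are linked: when AutoGraph skips conversion, `len()` on a symbolic tensor runs as plain Python and fails. As far as I can tell, TensorFlow 2.3 expects gast 0.3.3, and newer gast releases no longer provide `gast.Index`. The sketch below only inspects the environment; the helper names `installed_version` and `gast_has_index` are made up for this example.

```python
try:
    from importlib import metadata  # available on Python 3.8+
except ImportError:  # older interpreters, e.g. the Python 3.6 in the log
    metadata = None


def installed_version(package):
    """Return the installed version of `package` as a string.

    Returns None if the package is not installed or if
    importlib.metadata is unavailable on this interpreter.
    """
    if metadata is None:
        return None
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def gast_has_index():
    """Return True/False for whether `gast.Index` exists, or None if gast cannot be imported."""
    try:
        import gast
    except ImportError:
        return None
    return hasattr(gast, "Index")


if __name__ == "__main__":
    for name in ("tensorflow", "tensorflow-gpu", "gast"):
        print(name, "->", installed_version(name))
    print("gast.Index present:", gast_has_index())
```

If `gast.Index` turns out to be missing, comparing the installed gast version with the one the installed TensorFlow release declares as a dependency would be a reasonable next step before looking at the dataset size or the pretrained checkpoint.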