PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
https://paddlespeech.readthedocs.io
Apache License 2.0
11.12k stars 1.85k forks

TTS few-shot finetune / voice cloning issue [This dataset has no examples] #3248

Open NLPerxue opened 1 year ago

NLPerxue commented 1 year ago

When fine-tuning for the voice-cloning task, the official test sample runs end to end and produces the final result. When I upload data I recorded myself, however, it fails with: This dataset has no examples.

(paddlespeech-gpu) [root@int-gpu-001 tts3]$ ./run_mix.sh --stage 0 --stop-stage 3
check oov
get mfa result
align.py:60: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
Setting up corpus information...
Number of speakers in corpus: 1, average number of utterances per speaker: 11.0
/opt/projects/psgpu/PaddleSpeech/examples/other/tts_finetune/tts3/tools/montreal-forced-aligner/lib/aligner/models.py:87: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
Creating dictionary information...
Setting up training data...
Calculating MFCCs...
Some utterances were ignored due to lack of features, please see /root/Documents/MFA/newdir/logging/corpus.log for more information.
Calculating CMVN...
Number of speakers in corpus: 1, average number of utterances per speaker: 0.0
Done with setup.
100%|##########| 2/2 [00:01<00:00, 1.46it/s]
Done! Everything took 3.8460099697113037 seconds
generate durations.txt
extract feature
/opt/servers/anaconda3/envs/paddlespeech-gpu/lib/python3.10/site-packages/pkg_resources/__init__.py:121: DeprecationWarning: pkg_resources is deprecated as an API
  warnings.warn("pkg_resources is deprecated as an API", DeprecationWarning)
/opt/servers/anaconda3/envs/paddlespeech-gpu/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('mpl_toolkits'). Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)
/opt/servers/anaconda3/envs/paddlespeech-gpu/lib/python3.10/site-packages/pkg_resources/__init__.py:2870: DeprecationWarning: Deprecated call to pkg_resources.declare_namespace('google'). Implementing implicit namespace packages (as specified in PEP 420) is preferred to pkg_resources.declare_namespace. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
  declare_namespace(pkg)
/opt/servers/anaconda3/envs/paddlespeech-gpu/lib/python3.10/site-packages/librosa/core/constantq.py:1059: DeprecationWarning: np.complex is a deprecated alias for the builtin complex. To silence this warning, use complex by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use np.complex128 here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
  dtype=np.complex,
9 1
100%|██████████| 9/9 [00:00<00:00, 12714.29it/s]
Done
Traceback (most recent call last):
  File "/opt/projects/psgpu/PaddleSpeech/examples/other/tts_finetune/tts3/local/extract_feature.py", line 346, in <module>
    extract_feature(
  File "/opt/projects/psgpu/PaddleSpeech/examples/other/tts_finetune/tts3/local/extract_feature.py", line 266, in extract_feature
    normalize(speech_scaler, pitch_scaler, energy_scaler, vocab_phones,
  File "/opt/projects/psgpu/PaddleSpeech/examples/other/tts_finetune/tts3/local/extract_feature.py", line 155, in normalize
    dataset = DataTable(
  File "/opt/projects/psgpu/PaddleSpeech/paddlespeech/t2s/datasets/data_table.py", line 47, in __init__
    assert len(data) > 0, "This dataset has no examples"
AssertionError: This dataset has no examples

NLPerxue commented 1 year ago

[image] This is the information about the audio files.
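
The screenshot itself is not preserved here. For readers who want to collect comparable information for their own recordings, the following is a minimal sketch that reads WAV headers with Python's standard `wave` module. It assumes the recordings are PCM WAV files; the helper names `describe_wav` and `describe_dir` are made up for illustration and are not part of PaddleSpeech. It only reports header fields and does not decide which values the fine-tuning recipe requires.

```python
import wave
from pathlib import Path


def describe_wav(path):
    """Return basic header fields of a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "file": Path(path).name,
            "channels": wav.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wav.getsampwidth(),
            # Duration in seconds; guard against a zero sample rate.
            "duration_s": frames / rate if rate else 0.0,
        }


def describe_dir(wav_dir):
    """Describe every *.wav file directly inside wav_dir, sorted by name."""
    return [describe_wav(p) for p in sorted(Path(wav_dir).glob("*.wav"))]
```

Printing `describe_dir(...)` once for the official sample directory and once for your own recordings makes header differences (channels, sample rate, sample width, very short clips) easy to spot side by side.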

NLPerxue commented 1 year ago

Checking the run output: the generated durations.txt file is empty.
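
The log above already hints at this: after "Some utterances were ignored due to lack of features", MFA reports an average of 0.0 utterances per speaker, so nothing is aligned and durations.txt ends up empty. A small sketch for confirming this before running feature extraction is shown below. The function name `alignment_report` and all three paths passed to it are hypothetical; adjust them to wherever your run writes its WAV files, TextGrid files, and durations.txt.

```python
from pathlib import Path


def alignment_report(wav_dir, mfa_dir, durations_file):
    """Compare input WAV files against aligner output and durations.txt."""
    wav_ids = {p.stem for p in Path(wav_dir).glob("*.wav")}
    # Search recursively, since TextGrid files may sit in a subdirectory.
    aligned_ids = {p.stem for p in Path(mfa_dir).rglob("*.TextGrid")}
    durations = Path(durations_file)
    lines = []
    if durations.is_file():
        text = durations.read_text(encoding="utf-8")
        lines = [ln for ln in text.splitlines() if ln.strip()]
    return {
        "wav_count": len(wav_ids),
        "textgrid_count": len(aligned_ids),
        "missing_alignment": sorted(wav_ids - aligned_ids),
        "durations_lines": len(lines),
    }
```

If `textgrid_count` or `durations_lines` is 0, the "This dataset has no examples" assertion is the expected downstream symptom, and the place to look is the aligner's corpus.log rather than the training configuration.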

yaleimeng commented 1 year ago

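You have too few samples, quite possibly not even enough to fill one batch_size; the message says that some dataset has no samples. I suggest following the example dataset and providing at least 200 voice samples. You can provide more; the more the better.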

NLPerxue commented 1 year ago

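Thanks! I'll try changing batch_size, but I have a question: why do the 12 test samples work without any problem? The data I uploaded is the same amount as the official test sample.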

NLPerxue commented 1 year ago

I found the cause: it was a problem with the samples. The format shown below works. [image]

joisonwk commented 1 year ago

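I get the same error with my 690 samples. The message said nltk_data has to be downloaded to the home directory, so I downloaded and extracted it, but the error is still reported. Does anyone know how to fix this?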

joisonwk commented 1 year ago

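There is a pitfall in run.sh; am I really the only one who hit it? The TextGrid files generated under mfa_result have to be copied into an additional, newly created directory before the generate durations and extract feature steps can produce their output files; otherwise the outputs are empty. Now I have run into another problem. Does anyone know how to solve it?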

./run.sh --stage 5 --stop-stage 5
finetune...
rank: 0, pid: 13698, parent_pid: 13686
multiple speaker fastspeech2!
spk_num: 174
samplers done!
dataloaders done!
vocab_size: 306
W0626 22:32:50.563964 13698 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 12.1, Runtime API Version: 11.7
W0626 22:32:50.564774 13698 gpu_resources.cc:149] device: 0, cuDNN Version: 8.9.
I0626 22:32:53.255448 13698 eager_method.cc:143] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
I0626 22:32:53.255864 13698 eager_method.cc:143] Warning:: 0D Tensor cannot be used as 'Tensor.numpy()[0]' . In order to avoid this problem, 0D Tensor will be changed to 1D numpy currently, but it's not correct and will be removed in release 2.6. For Tensor contain only one element, Please modify 'Tensor.numpy()[0]' to 'float(Tensor)' as soon as possible, otherwise 'Tensor.numpy()[0]' will raise error in release 2.6.
model done!
optimizer done!
/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/nn/layer/layers.py:1896: UserWarning: Skip loading for encoder.embed.1.alpha. encoder.embed.1.alpha receives a shape [1], but the expected shape is [].
  warnings.warn(f"Skip loading for {key}. " + str(err))
/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/nn/layer/layers.py:1896: UserWarning: Skip loading for decoder.embed.0.alpha. decoder.embed.0.alpha receives a shape [1], but the expected shape is [].
  warnings.warn(f"Skip loading for {key}. " + str(err))
/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/nn/layer/norm.py:776: UserWarning: When training, we now always track global mean and variance.
  warnings.warn(
Exception in main training loop: Variable Shape not match, Variable [ create_parameter_3.w_0_moment1_0 ] need tensor with shape [] but load set tensor with shape [1]
Traceback (most recent call last):
  File "/home/ant/voice/PaddleSpeech/paddlespeech/t2s/training/trainer.py", line 149, in run
    update()
  File "/home/ant/voice/PaddleSpeech/paddlespeech/t2s/training/updaters/standard_updater.py", line 110, in update
    self.update_core(batch)
  File "/home/ant/voice/PaddleSpeech/paddlespeech/t2s/models/fastspeech2/fastspeech2_updater.py", line 118, in update_core
    optimizer.step()
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/dygraph/base.py", line 334, in __impl__
    return func(*args, **kwargs)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/framework.py", line 462, in __impl__
    return func(*args, **kwargs)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/adam.py", line 446, in step
    optimize_ops = self._apply_optimize(
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/optimizer.py", line 1242, in _apply_optimize
    optimize_ops = self._create_optimization_pass(
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/optimizer.py", line 994, in _create_optimization_pass
    self._create_accumulators(
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/adam.py", line 278, in _create_accumulators
    self._add_moments_pows(p)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/adam.py", line 231, in _add_moments_pows
    self._add_accumulator(self._moment1_acc_str, p, dtype=acc_dtype)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/optimizer.py", line 799, in _add_accumulator
    var.set_value(self._accumulators_holder.pop(var_name))
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/framework.py", line 449, in __impl__
    return func(*args, **kwargs)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/dygraph/tensor_patch_methods.py", line 196, in set_value
    assert self.shape == list(
Trainer extensions will try to handle the extension. Then all extensions will finalize.
Traceback (most recent call last):
  File "local/finetune.py", line 269, in <module>
    train_sp(train_args, config)
  File "local/finetune.py", line 202, in train_sp
    trainer.run()
  File "/home/ant/voice/PaddleSpeech/paddlespeech/t2s/training/trainer.py", line 198, in run
    six.reraise(*exc_info)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/six.py", line 719, in reraise
    raise value
  File "/home/ant/voice/PaddleSpeech/paddlespeech/t2s/training/trainer.py", line 149, in run
    update()
  File "/home/ant/voice/PaddleSpeech/paddlespeech/t2s/training/updaters/standard_updater.py", line 110, in update
    self.update_core(batch)
  File "/home/ant/voice/PaddleSpeech/paddlespeech/t2s/models/fastspeech2/fastspeech2_updater.py", line 118, in update_core
    optimizer.step()
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/dygraph/base.py", line 334, in __impl__
    return func(*args, **kwargs)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/framework.py", line 462, in __impl__
    return func(*args, **kwargs)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/adam.py", line 446, in step
    optimize_ops = self._apply_optimize(
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/optimizer.py", line 1242, in _apply_optimize
    optimize_ops = self._create_optimization_pass(
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/optimizer.py", line 994, in _create_optimization_pass
    self._create_accumulators(
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/adam.py", line 278, in _create_accumulators
    self._add_moments_pows(p)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/adam.py", line 231, in _add_moments_pows
    self._add_accumulator(self._moment1_acc_str, p, dtype=acc_dtype)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/optimizer/optimizer.py", line 799, in _add_accumulator
    var.set_value(self._accumulators_holder.pop(var_name))
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/wrapped_decorator.py", line 25, in __impl__
    return wrapped_func(*args, **kwargs)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/framework.py", line 449, in __impl__
    return func(*args, **kwargs)
  File "/home/ant/voice/PaddleSpeech/tools/venv/lib/python3.8/site-packages/paddle/fluid/dygraph/tensor_patch_methods.py", line 196, in set_value
    assert self.shape == list(
AssertionError: Variable Shape not match, Variable [ create_parameter_3.w_0_moment1_0 ] need tensor with shape [] but load set tensor with shape [1]

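The run.sh pitfall joisonwk describes (TextGrid files must be copied into an extra directory before the later steps find them) can be sketched roughly as follows. This is only an illustration of that manual workaround, not the recipe's own code: the function name `copy_textgrids_into_subdir` and the subdirectory name `speaker` are hypothetical, since the comment does not say what the directory should be called, and the right name depends on how your run lays out mfa_result.

```python
import shutil
from pathlib import Path


def copy_textgrids_into_subdir(mfa_dir, subdir_name="speaker"):
    """Copy top-level *.TextGrid files of mfa_dir into mfa_dir/subdir_name.

    subdir_name is a placeholder; the thread does not specify the name.
    Originals are left in place so the step can be repeated safely.
    """
    mfa_dir = Path(mfa_dir)
    target = mfa_dir / subdir_name
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for textgrid in sorted(mfa_dir.glob("*.TextGrid")):
        shutil.copy2(textgrid, target / textgrid.name)
        copied.append(textgrid.name)
    return copied
```

Copying rather than moving keeps the aligner output untouched, so the durations and feature extraction steps can be re-run and compared with and without the extra directory.
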
yangpyoung commented 1 year ago

Hello, did you manage to solve the "AssertionError: Variable Shape not match, Variable [ create_parameter_3.w_0_moment1_0 ] need tensor with shape [] but load set tensor with shape [1]" problem? I have run into the same issue.

iamfoolberg commented 1 year ago

AssertionError: Variable Shape not match, Variable [ create_parameter_3.w_0_moment1_0 ] need tensor with shape [] but load set tensor with shape [1]

Same pitfall here...

ChengsongLu commented 1 year ago

Delete the exp folder, then run stage 5 once more on its own, and it works.
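
A minimal sketch of that fix is below. The traceback passes through the optimizer restoring saved accumulator state (`_accumulators_holder`), so one plausible reading, offered here as an assumption rather than a confirmed diagnosis, is that leftover state under exp from an earlier run no longer matches the shapes the current run expects. The helper name `clear_exp_dir` is hypothetical, and the location of the exp folder relative to the example directory is also an assumption.

```python
import shutil
from pathlib import Path


def clear_exp_dir(exp_dir):
    """Delete a previous run's output directory.

    Returns True if the directory existed and was removed, False otherwise.
    This permanently deletes earlier checkpoints, so back them up first
    if they are still needed.
    """
    exp_dir = Path(exp_dir)
    if not exp_dir.is_dir():
        return False
    shutil.rmtree(exp_dir)
    return True
```

After calling `clear_exp_dir("./exp")` from the example directory, re-run only the fine-tuning stage with the command already shown in this thread: `./run.sh --stage 5 --stop-stage 5`.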