Closed wangkewk closed 3 years ago
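This is an incompatibility caused by a recent fix of mine. In the file synthesizer/utils/symbols.py,
change line 11 to:
_characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz12340!\'(),-.:;? '
That should do it. Please don't close this issue for now. If too many people run into it, I'll add a compatibility fix.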
Don't worry.
Same problem!
Thanks, this works. After the change, what used to be pure noise became normal speech.
Thanks, problem solved.
Thanks, it's solved.
Same problem.
Thanks! The problem is solved.
Same here!
Works perfectly after the change, thanks~
+1
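It works after the change, thanks.
The problem is indeed solved, but the audio quality is not as good as in the Bilibili demo. I specifically used a recording of a novel being read aloud, so I'm not sure what is wrong. If I want the output to sound very close to a particular person's voice, how can I improve it?
Same problem.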
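If I use a model I trained myself, do I need to revert this change for it to work, or is it fine either way? I tried my newly trained model and got no sound (possibly a problem with my own training), but the provided model works normally.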
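Reverting it gives slightly better results, but it also works without reverting.
Finally got it working. I struggled with this for a long time and thought it was a problem with my local environment.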
+1
The output sounds robotic. Is that because different machines produce different results? Do I need to retrain the model myself?
No. It is probably caused by the vocoder or by differences in the input audio.
+1
Sigh, it's still not as good as in the video. It sounds like a foreigner who has just arrived in China speaking broken Chinese.
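I'm also using the model from the Bilibili uploader, but I don't get the quality shown on Bilibili. On my side it sounds like 伏拉夫's accent and not like Chinese at all.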
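If the recording is clear, timbre cloning works reasonably well for flat-toned speech. Could something not be running properly?
I have modified synthesizer/utils/symbols.py, but I still get an error: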
Synthesizer using device: cuda
Trainable Parameters: 32.735M
Traceback (most recent call last):
File "D:\AI\sv2tts_china\MockingBird\toolbox\__init__.py", line 123, in <lambda>
func = lambda: self.synthesize() or self.vocode()
File "D:\AI\sv2tts_china\MockingBird\toolbox\__init__.py", line 238, in synthesize
specs = self.synthesizer.synthesize_spectrograms(texts, embeds, style_idx=int(self.ui.style_slider.value()), min_stop_token=min_token)
File "D:\AI\sv2tts_china\MockingBird\synthesizer\inference.py", line 87, in synthesize_spectrograms
self.load()
File "D:\AI\sv2tts_china\MockingBird\synthesizer\inference.py", line 65, in load
self._model.load(self.model_fpath)
File "D:\AI\sv2tts_china\MockingBird\synthesizer\models\tacotron.py", line 525, in load
self.load_state_dict(checkpoint["model_state"], strict=False)
File "D:\ProgramData\Anaconda3\envs\Real-Time-Voice-Cloning\lib\site-packages\torch\nn\modules\module.py", line 1483, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Tacotron:
size mismatch for encoder_proj.weight: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([128, 1024]).
size mismatch for decoder.attn_rnn.weight_ih: copying a param with shape torch.Size([384, 768]) from checkpoint, the shape in current model is torch.Size([384, 1280]).
size mismatch for decoder.rnn_input.weight: copying a param with shape torch.Size([1024, 640]) from checkpoint, the shape in current model is torch.Size([1024, 1152]).
size mismatch for decoder.stop_proj.weight: copying a param with shape torch.Size([1, 1536]) from checkpoint, the shape in current model is torch.Size([1, 2048]).
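The four mismatches in this traceback follow a different pattern from the `encoder.embedding.weight` mismatch that the symbols.py edit addresses. The sketch below uses only the shapes printed above and shows that, in every case, the current model expects exactly 512 more input features than the checkpoint provides. This suggests (an inference, not something confirmed in this thread) that the current code feeds an extra 512-dimensional vector into these layers, so the checkpoint and the code come from different versions, and changing the character set alone would not be expected to help.

```python
# (checkpoint shape, current model shape), copied from the traceback above.
mismatches = {
    "encoder_proj.weight":        ((128, 512),  (128, 1024)),
    "decoder.attn_rnn.weight_ih": ((384, 768),  (384, 1280)),
    "decoder.rnn_input.weight":   ((1024, 640), (1024, 1152)),
    "decoder.stop_proj.weight":   ((1, 1536),   (1, 2048)),
}

def input_dim_gaps(pairs):
    """Return {name: extra input features the current model expects}."""
    gaps = {}
    for name, (ckpt, model) in pairs.items():
        # The output dimension (first axis) agrees; only the input axis differs.
        assert ckpt[0] == model[0]
        gaps[name] = model[1] - ckpt[1]
    return gaps

gaps = input_dim_gaps(mismatches)
for name, gap in gaps.items():
    print(f"{name}: +{gap}")
```

With a real checkpoint, the same table could be built by comparing `tuple(v.shape)` for each tensor in `checkpoint["model_state"]` against the matching entry in `model.state_dict()`.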
Me too
Try this model: link: https://pan.baidu.com/s/1fMh9IlgKJlL2PIiRTYDUvw extraction code: om7f
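It runs now, but only the first half of the generated sentence is actually spoken; the second half is all noise. Generating several more times sometimes improves it and sometimes makes it worse again. Also, the generated voice does not resemble the source audio; it is quite far off, haha.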
This model is fine. Just change _characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz12340!\'(),-.:;? ' back to the original.
This fixes it, but when I test with the demo audio, the generated result is much worse, emmm.
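It was trained for very few steps; you can continue training it to 100k+.
For ceshi's model, you need to switch the code to a commit from around October 20 and then apply the change from issue #37; after that it can be used. For the author's model, you need to switch the code to a commit from around October 20 and use it from there.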
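The version switch described above can be scripted. The sketch below is only an illustration under stated assumptions: the thread says "around October 20" without giving a year and does not name a branch, so both the date and the branch are parameters you must supply, and all helper names are hypothetical. It relies on two standard git commands, `git rev-list -n 1 --before=<date> <branch>` to find the last commit before a date and `git checkout <ref>` to switch to it. The same checkout call also accepts a tag name.

```python
import subprocess

def last_commit_before_cmd(date, branch):
    """Build the git command that prints the newest commit on `branch` before `date`."""
    return ["git", "rev-list", "-n", "1", f"--before={date}", branch]

def checkout_cmd(ref):
    """Build the git command that switches the working tree to a commit hash or tag."""
    return ["git", "checkout", ref]

def run(cmd, repo_dir):
    """Run a git command inside repo_dir and return its trimmed stdout."""
    result = subprocess.run(cmd, cwd=repo_dir, check=True,
                            capture_output=True, text=True)
    return result.stdout.strip()

def switch_to_date(repo_dir, date, branch):
    """Check out the last commit on `branch` made before `date` ('YYYY-MM-DD')."""
    commit = run(last_commit_before_cmd(date, branch), repo_dir)
    if not commit:
        raise ValueError(f"no commit on {branch} before {date}")
    run(checkout_cmd(commit), repo_dir)
    return commit
```

Calling `switch_to_date(path_to_clone, date, branch)` leaves the repository in a detached-HEAD state at that commit, after which the symbols.py change from this issue can be applied where needed.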
So if I keep clicking "Synthesize only", will the output keep getting better?
I made the change and it still doesn't work... I hope you can take another look.
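I suspect you copied the model into your program directory at the very beginning. Re-extract the program archive and start over, and it should work.
Why is my source audio section blacked out? Does anyone know?
The Dataset, Speaker, and other source audio fields are all black and cannot be selected?
That means no dataset was detected. If you are not training, you can ignore it.
Is it the case that to clone my own voice I have to train on my own audio, rather than directly using the models provided by the community? Yesterday I used the provided models (including synthesizer and vector) to clone my own recording, and the resulting mel spectrogram was a mess, with nothing but electrical hum and noise. Please point out what I did wrong.
Same for me. The program logic has an error related to sound-wave frequency.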
------------------ Original message ------------------ From: @.>; Sent: Wednesday, January 5, 2022, 3:22 PM; To: @.>; Cc: @.>; @.>; Subject: Re: [babysor/MockingBird] Running the model from here gives this RuntimeError: Error(s) in loading state_dict for Tacotron: size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([70, 512]) from checkpoint, the shape in current model is torch.Size([75, 512]). (#37)
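You can switch to this model directly by following the quickstart (https://github.com/babysor/MockingBird/wiki/Quick-Start-(Newbie)); no code changes are needed. Environment: 3.7.11
I have read the entire thread. raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(. Everything generated is noise. I changed the code as instructed and it still doesn't work. Switching models doesn't work either...
Same error, copying a param with shape torch.Size([128, 512]), and the output is entirely noise.
Complete beginner here. How do I switch to tag0.01? I don't understand it at all. I trained the 75k model on my target voice for a while, but the imitation still doesn't sound like the target, so I'd like to switch to this model and try training again.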
Same error. Did you get yours working?
Not yet xd
It only outputs noise; I made the changes suggested in the comments and it is still the same.
Switch the version first, then apply #37.
Can anyone solve this?