netease-youdao / EmotiVoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine
Apache License 2.0
6.63k stars 555 forks

Help! MFA fails with an error at step 6, and my boss is waiting for a demo. Please help! #149

Open topsoft-support opened 2 months ago

topsoft-support commented 2 months ago

MFA Step6

This command has already been run:

mfa validate \
--overwrite \
--clean \
--single_speaker \
data/DataBaker/mfa/lab \
data/DataBaker/mfa/mfa_pronounciation_dict.txt

This command fails with an error:

mfa train \
--overwrite \
--clean \
--single_speaker \
data/DataBaker/mfa/lab \
data/DataBaker/mfa/mfa_pronounciation_dict.txt \
data/DataBaker/mfa/mfa/mfa_model.zip \
data/DataBaker/mfa/TextGrid

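I downloaded the material from freeDatasets and put it under data/DataBaker/raw, keeping only 10 utterances, to try cloning a voice from the audio files and their text labels. I also deleted the contents of PhoneLabeling.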

000001-000010.txt

000001  卡尔普#2陪外孙#1玩滑梯#4。
ka2 er2 pu3 pei2 wai4 sun1 wan2 hua2 ti1
000002  假语村言#2别再#1拥抱我#4。
jia2 yu3 cun1 yan2 bie2 zai4 yong1 bao4 wo3
000003  宝马#1配挂#1跛骡鞍#3,貂蝉#1怨枕#2董翁榻#4。
bao2 ma3 pei4 gua4 bo3 luo2 an1 diao1 chan2 yuan4 zhen3 dong3 weng1 ta4
000004  邓小平#2与#1撒切尔#2会晤#4。
deng4 xiao3 ping2 yu3 sa4 qie4 er3 hui4 wu4
000005  老虎#1幼崽#2与#1宠物犬#1玩耍#4。
lao2 hu3 you4 zai3 yu2 chong3 wu4 quan3 wan2 shua3
000006  身长#2约#1五尺#1二寸#1五分#2或#1以上#4。
shen1 chang2 yue1 wu2 chi3 er4 cun4 wu3 fen1 huo4 yi3 shang4
000007  赵荻#2约#1曹云腾#2去#1鬼屋#4。
zhao4 di2 yue1 cao2 yun2 teng2 qu4 gui3 wu1
000008  展品#1虽有#2,展员#1却颓#4。
zhan2 pin3 sui1 you3 zhan3 yuan2 que4 tui2
000009  以#1散居#1儿童#2和#1幼托#1儿童#1为主#4。
yi2 san3 ju1 er2 tong2 he2 you4 tuo1 er2 tong2 wei2 zhu3
000010  柯特妮#2身穿#2豹纹#1大衣#4。
ke1 te4 ni1 shen1 chuan1 bao4 wen2 da4 yi1
data/
├── DataBaker
│   ├── README.md
│   ├── raw
│   │   └── BZNSYP
│   │       ├── PhoneLabeling
│   │       ├── ProsodyLabeling
│   │       │   └── 000001-000010.txt
│   │       └── Wave
│   │           ├── 000001.wav
│   │           ├── 000002.wav
│   │           ├── 000003.wav
│   │           ├── 000004.wav
│   │           ├── 000005.wav
│   │           ├── 000006.wav
│   │           ├── 000007.wav
│   │           ├── 000008.wav
│   │           ├── 000009.wav
│   │           └── 000010.wav
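
Before running MFA on a trimmed subset like this one, it may be worth confirming that every utterance ID in the ProsodyLabeling file has a matching wav file, and vice versa. The sketch below is a hypothetical helper: the name `check_subset` is not part of EmotiVoice or MFA, and it assumes that each utterance line starts with a six-digit ID, as in the listing above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of EmotiVoice or MFA.
# Assumption: each utterance line in the label file starts with a six-digit ID.
check_subset() {
  local label_file="$1" wave_dir="$2"
  local label_ids wave_ids f

  # Six-digit IDs at the start of a line, e.g. 000001.
  label_ids=$(grep -oE '^[0-9]{6}' "$label_file" | sort)

  # Basenames of the wav files, without the .wav suffix.
  wave_ids=$(
    for f in "$wave_dir"/*.wav; do
      [ -e "$f" ] || continue
      basename "$f" .wav
    done | sort
  )

  if [ "$label_ids" = "$wave_ids" ]; then
    echo "OK: $(printf '%s\n' "$label_ids" | grep -c .) utterances match"
  else
    echo "MISMATCH between label IDs and wav files"
    return 1
  fi
}

# Usage, with the paths from the tree above:
# check_subset data/DataBaker/raw/BZNSYP/ProsodyLabeling/000001-000010.txt \
#              data/DataBaker/raw/BZNSYP/Wave
```

This only checks that the two ID sets agree. It does not validate the transcripts or the audio content.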

The error occurs when I run the second command of step 6, and I can't work out what it means.

The error output is as follows:

/home/ifebs/miniconda3/envs/EmotiVoiceTrain/bin/gmm-boost-silence --boost=1.0 1 /home/ifebs/Documents/MFA/lab/lda/10.mdl -
LOG (gmm-boost-silence[5.5.1112]:main():gmmbin/gmm-boost-silence.cc:93) Boosted weights for 1 pdfs, by factor of 1
/home/ifebs/miniconda3/envs/EmotiVoiceTrain/bin/gmm-align-compiled --transition-scale=1.0 --acoustic-scale=0.1 --self-loop-scale=0.1 --beam=10 --retry-beam=40 --careful=false --write-per-frame-acoustic-loglikes=ark:/home/ifebs/Documents/MFA/lab/lda/like.1.3.ark - ark,s,cs:/home/ifebs/Documents/MFA/lab/lda/fsts.1.3.ark 'ark,s,cs:splice-feats --left-context=3 --right-context=3 scp,s,cs:"/home/ifebs/Documents/MFA/lab/lab/split3/feats.1.3.scp" ark:- | transform-feats "/home/ifebs/Documents/MFA/lab/lda/lda.mat" ark:- ark:- |' ark:/home/ifebs/Documents/MFA/lab/lda/ali.1.3.ark ark,t:-
LOG (gmm-boost-silence[5.5.1112]:main():gmmbin/gmm-boost-silence.cc:103) Wrote model to -
splice-feats --left-context=3 --right-context=3 scp,s,cs:/home/ifebs/Documents/MFA/lab/lab/split3/feats.1.3.scp ark:-
transform-feats /home/ifebs/Documents/MFA/lab/lda/lda.mat ark:- ark:-
LOG (gmm-align-compiled[5.5.1112]:main():gmmbin/gmm-align-compiled.cc:127) 1-7
WARNING (gmm-align-compiled[5.5.1112]:AlignUtteranceWrapper():decoder/decoder-wrappers.cc:617) Retrying utterance 1-7 with beam 40
WARNING (gmm-align-compiled[5.5.1112]:AlignUtteranceWrapper():decoder/decoder-wrappers.cc:626) Did not successfully decode file 1-7, len = 289
LOG (gmm-align-compiled[5.5.1112]:main():gmmbin/gmm-align-compiled.cc:127) 1-8
LOG (transform-feats[5.5.1112]:main():featbin/transform-feats.cc:158) Overall average [pseudo-]logdet is -3.40272 over 880 frames.
LOG (transform-feats[5.5.1112]:main():featbin/transform-feats.cc:161) Applied transform to 3 utterances; 0 had errors.
WARNING (gmm-align-compiled[5.5.1112]:AlignUtteranceWrapper():decoder/decoder-wrappers.cc:617) Retrying utterance 1-8 with beam 40
WARNING (gmm-align-compiled[5.5.1112]:AlignUtteranceWrapper():decoder/decoder-wrappers.cc:626) Did not successfully decode file 1-8, len = 307
LOG (gmm-align-compiled[5.5.1112]:main():gmmbin/gmm-align-compiled.cc:127) 1-9
WARNING (gmm-align-compiled[5.5.1112]:AlignUtteranceWrapper():decoder/decoder-wrappers.cc:617) Retrying utterance 1-9 with beam 40
WARNING (gmm-align-compiled[5.5.1112]:AlignUtteranceWrapper():decoder/decoder-wrappers.cc:626) Did not successfully decode file 1-9, len = 284
LOG (gmm-align-compiled[5.5.1112]:main():gmmbin/gmm-align-compiled.cc:135) Overall log-likelihood per frame is -nan over 0 frames.
LOG (gmm-align-compiled[5.5.1112]:main():gmmbin/gmm-align-compiled.cc:137) Retried 3 out of 3 utterances.
LOG (gmm-align-compiled[5.5.1112]:main():gmmbin/gmm-align-compiled.cc:139) Done 0, errors on 3
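
Because the Kaldi output is long, it can help to save it to a file and pull out only the alignment failures. The following is a minimal sketch using standard `grep` and `awk`. The function name `summarize_align_log` and the log file name are hypothetical, and the search patterns are copied from the WARNING lines in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of MFA or Kaldi.
# Summarizes alignment failures in a saved gmm-align-compiled log.
summarize_align_log() {
  local log="$1"
  local retried failed

  # Utterances that were retried with the wider beam (--retry-beam).
  retried=$(grep -c 'Retrying utterance' "$log")

  # Utterances that still could not be aligned after the retry.
  failed=$(grep -c 'Did not successfully decode file' "$log")

  echo "retried=$retried failed=$failed"

  # Print the ID of each failed utterance, one per line.
  grep -o 'Did not successfully decode file [^,]*' "$log" | awk '{print $NF}'
}

# Usage (train.log is an assumed file holding the output shown above):
# summarize_align_log train.log
```

Run on the output above, this would report three retried and three failed utterances (1-7, 1-8 and 1-9). That agrees with the final lines of the log, "Retried 3 out of 3 utterances" and "Done 0, errors on 3": none of the utterances in this job could be aligned, even with the retry beam of 40.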
topsoft-support commented 2 months ago

@syq163 could you please help with this?