openvpi / DiffSinger

An advanced singing voice synthesis system with high fidelity, expressiveness, controllability and flexibility based on DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Apache License 2.0

[MFA] Transform matrix for utterance 1-1 has bad dimension 40x112 versus feat dim 105 #139

Closed idootop closed 1 year ago

idootop commented 1 year ago

Following the tutorial in the Feishu document, I downloaded mfa-opencpop-extension.zip and mfa-opencpop-extension.txt, then tried to align vocals with MFA and got the error below:

MFA version: 2.2.15

Chinese vocal sample: test.zip

montreal_forced_aligner.exceptions.KaldiProcessingError: KaldiProcessingError:

There were 1 job(s) with errors when running Kaldi binaries.
See the log files below for more information.
/Users/del/Documents/MFA/mfa/alignment/log/align.1.log
/opt/homebrew/Caskroom/miniforge/base/envs/songmass/bin/gmm-boost-silence --boost=1.0 1 /Users/del/Documents/MFA/mfa/alignment/final.alimdl -
/opt/homebrew/Caskroom/miniforge/base/envs/songmass/bin/gmm-align-compiled --transition-scale=1.0 --acoustic-scale=0.083333 --self-loop-scale=0.1 --beam=10 --retry-beam=40 --careful=false --write-per-frame-acoustic-loglikes=ark:/Users/del/Documents/MFA/mfa/alignment/like.1.1.ark - ark,s,cs:/Users/del/Documents/MFA/mfa/alignment/fsts.1.1.ark 'ark,s,cs:splice-feats --left-context=3 --right-context=3 scp,s,cs:"/Users/del/Documents/MFA/mfa/mfa/split3/feats.1.1.scp" ark:- | transform-feats "/Users/del/Documents/MFA/mfa/alignment/lda.mat" ark:- ark:- |' ark:/Users/del/Documents/MFA/mfa/alignment/ali.1.1.ark ark,t:-
WARNING (gmm-boost-silence[5.5.1068]:main():gmmbin/gmm-boost-silence.cc:82) The pdfs for the silence phones may be shared by other phones (note: this probably does not matter.)
LOG (gmm-boost-silence[5.5.1068]:main():gmmbin/gmm-boost-silence.cc:93) Boosted weights for 5 pdfs, by factor of 1
LOG (gmm-boost-silence[5.5.1068]:main():gmmbin/gmm-boost-silence.cc:103) Wrote model to -
transform-feats /Users/del/Documents/MFA/mfa/alignment/lda.mat ark:- ark:-
splice-feats --left-context=3 --right-context=3 scp,s,cs:/Users/del/Documents/MFA/mfa/mfa/split3/feats.1.1.scp ark:-
WARNING (transform-feats[5.5.1068]:main():featbin/transform-feats.cc:110) Transform matrix for utterance 1-1 has bad dimension 40x112 versus feat dim 105
WARNING (transform-feats[5.5.1068]:main():featbin/transform-feats.cc:110) Transform matrix for utterance 1-2 has bad dimension 40x112 versus feat dim 105
LOG (transform-feats[5.5.1068]:main():featbin/transform-feats.cc:161) Applied transform to 0 utterances; 2 had errors.
WARNING (gmm-align-compiled[5.5.1068]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 1-1
WARNING (gmm-align-compiled[5.5.1068]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 1-2
LOG (gmm-align-compiled[5.5.1068]:main():gmmbin/gmm-align-compiled.cc:135) Overall log-likelihood per frame is nan over 0 frames.
LOG (gmm-align-compiled[5.5.1068]:main():gmmbin/gmm-align-compiled.cc:137) Retried 0 out of 2 utterances.
LOG (gmm-align-compiled[5.5.1068]:main():gmmbin/gmm-align-compiled.cc:139) Done 0, errors on 2
WARNING (gmm-align-compiled[5.5.1068]:Close():util/kaldi-io.cc:515) Pipe splice-feats --left-context=3 --right-context=3 scp,s,cs:"/Users/del/Documents/MFA/mfa/mfa/split3/feats.1.1.scp" ark:- | transform-feats "/Users/del/Documents/MFA/mfa/alignment/lda.mat" ark:- ark:- | had nonzero return status 256
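
The numbers in the log can be sanity-checked with some arithmetic. The sketch below is not part of MFA or Kaldi. The helper names `per_frame_dim` and `transform_accepts` are hypothetical, and it assumes that `splice-feats` stacks `left-context + 1 + right-context` frames and that `transform-feats` accepts a matrix whose column count equals the feature dimension or the feature dimension plus one (for an offset column). Under those assumptions, the corpus features look like 15 dimensions per frame, while `lda.mat` looks like it was built for 16 dimensions per frame. This only narrows down where the mismatch is; it does not say which feature setting causes it.

```python
def per_frame_dim(spliced_dim, left_context=3, right_context=3):
    """Return the per-frame dimension implied by a spliced feature dimension.

    Assumes splicing stacks (left_context + 1 + right_context) frames.
    Returns None when spliced_dim is not a multiple of the window size.
    """
    window = left_context + 1 + right_context
    if spliced_dim % window != 0:
        return None
    return spliced_dim // window


def transform_accepts(matrix_cols, feat_dim):
    """Check the assumed transform-feats rule: cols == dim or cols == dim + 1."""
    return matrix_cols in (feat_dim, feat_dim + 1)


# Values copied from the warning in the log above.
corpus_spliced_dim = 105       # "feat dim 105"
lda_rows, lda_cols = 40, 112   # "bad dimension 40x112"

# Per-frame dimension of the features extracted from the corpus.
print("corpus per-frame dim:", per_frame_dim(corpus_spliced_dim))

# Per-frame dimension the model's LDA matrix seems to expect,
# with and without an offset column.
print("model per-frame dim (no offset):", per_frame_dim(lda_cols))
print("model per-frame dim (with offset):", per_frame_dim(lda_cols - 1))

# Whether the matrix could be applied to the corpus features at all.
print("compatible:", transform_accepts(lda_cols, corpus_spliced_dim))
```

If the per-frame dimensions differ by one, as they appear to here, a reasonable next step is to compare the feature configuration of the acoustic model with what the installed MFA version extracts by default.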