Open pleasurepants opened 2 years ago
Hi, what is **mandarin.zip** in the last picture?
`mfa align ./mandarin_example_cp/ ./dict_new.txt ./mandarin.zip ./output/`
I don't have this step.
It's a pretrained model
First, use `mfa train ./data dict.txt ./out -o ./out_model` to train a model on your data.
Second, use the model from the first step to align: `mfa align ./data dict.txt out_model.zip ./out_model`
Oh hmm, it looks like you're running this command as root, but you're in a user's document directory. Unfortunately, by default, MFA uses an expanded `~/Documents/MFA` as the temporary directory, and it seems it got expanded to `/root/Documents/MFA`. You can try changing the temporary directory with the `--temp_directory` flag and see if it runs better. It might also be fixed by just not running it as the root user, if that's an option.
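To illustrate why the path ends up under `/root`: a leading `~` is resolved against the home directory of whoever runs the command. The sketch below uses Python's `os.path.expanduser` on a POSIX system as a stand-in; that MFA resolves its default path the same way is an assumption here, and `expand_for_home` and `/home/alice` are hypothetical names used only for the demonstration.

```python
import os


def expand_for_home(path: str, home: str) -> str:
    """Expand a leading ~ as if $HOME were `home` (POSIX behaviour).

    Hypothetical helper for illustration only; it temporarily overrides
    the HOME environment variable and restores it afterwards.
    """
    old_home = os.environ.get("HOME")
    os.environ["HOME"] = home
    try:
        return os.path.expanduser(path)
    finally:
        if old_home is None:
            del os.environ["HOME"]
        else:
            os.environ["HOME"] = old_home


# Same path, different user: the result follows the user's home directory.
print(expand_for_home("~/Documents/MFA", "/root"))
print(expand_for_home("~/Documents/MFA", "/home/alice"))
```

So a run as root and a run as a regular user would look in different temporary directories unless the directory is set explicitly.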
Hello there. I'm now facing a problem when trying to use a dictionary to align the Mandarin example. The dictionary I'm using was produced by the command `mfa g2p`, like this:
![image](https://user-images.githubusercontent.com/80297871/141734592-3277a45a-b684-4c19-890e-ff16bb19c6b4.png)
But when I use that dictionary to align the same dataset, I run into a problem.
Here is the align log:
2021-11-14 20:05:09,747 - align - DEBUG - ALIGN CONFIG:
2021-11-14 20:05:09,752 - align - DEBUG - !!python/object:montreal_forced_aligner.config.align_config.AlignConfig
acoustic_scale: 0.1
beam: 100
boost_silence: 1.0
cleanup_textgrids: true
clitic_markers: "'\u2019"
compound_markers: -/
data_directory: null
debug: false
digraphs:
"[dt][sz\u0292\u0283\u0290\u0291\u0282\u0255\xE7]"
"[ao\u0254e][\u028A\u026A]"
disable_sat: false
feature_config: !!python/object:montreal_forced_aligner.config.feature_config.FeatureConfig
allow_downsample: true
allow_upsample: true
cleanup_textgrids: true
debug: false
deltas: true
fmllr: false
frame_shift: 10
high_frequency: 7800
lda: false
low_frequency: 20
overwrite: false
pitch: false
sample_frequency: 16000
snip_edges: true
splice_left_context: 3
splice_right_context: 3
type: mfcc
use_energy: false
use_mp: true
fmllr_update_type: full
initial_fmllr: true
iteration: null
multilingual_ipa: false
overwrite: false
punctuation: "\u3001\u3002\u0964\uFF0C@<>\"(),.:;\xBF?\xA1!\\&%#*~\u3010\u3011\uFF0C\
\u2026\u2025\u300C\u300D\u300E\u300F\u301D\u301F\u2033\u27E8\u27E9\u266A\u30FB\u2039\
\u203A\xAB\xBB\uFF5E\u2032$+=\u2018"
retry_beam: 400
self_loop_scale: 0.1
strip_diacritics:
"\u02D0"
"\u02D1"
"\u0329"
"\u0306"
"\u0311"
"\u032F"
"\u0361"
"\u203F"
"\u035C"
transition_scale: 1.0
use_fmllr_mp: false
use_mp: true
2021-11-14 20:05:09,753 - align - WARNING - WARNING: Using old temp directory, this might not be ideal for you, use the --clean flag to ensure no weird behavior for previous versions of the temporary directory.
2021-11-14 20:05:09,753 - align - DEBUG - Previous run ended in an error (maybe ctrl-c?)
2021-11-14 20:05:09,753 - align - DEBUG - Previous run used source directory path ../mandarin/mandarin_example (new run: ./mandarin_example)
2021-11-14 20:05:09,754 - align - DEBUG - Previous run was on 1.0.0 version (new run: 2.0.0b4.dev8+g2403bd5.d20211108)
2021-11-14 20:05:09,754 - align - DEBUG - Previous run used dictionary path ../mandarin/mandarin_mtts.lexicon.txt (new run: ./dict_train.txt)
2021-11-14 20:05:09,755 - align - DEBUG - Previous run used acoustic model path None (new run: ./mandarin.zip)
2021-11-14 20:05:09,755 - align - DEBUG -
2021-11-14 20:05:09,755 - align - DEBUG - ====ACOUSTIC MODEL INFO====
2021-11-14 20:05:09,755 - align - DEBUG - Acoustic model root directory: /root/Documents/MFA/mandarin_example/acoustic_models
2021-11-14 20:05:09,755 - align - DEBUG - Acoustic model dirname: /root/Documents/MFA/mandarin_example/acoustic_models/mandarin
2021-11-14 20:05:09,755 - align - DEBUG - Acoustic model meta path: /root/Documents/MFA/mandarin_example/acoustic_models/mandarin/meta.yaml
2021-11-14 20:05:09,755 - align - DEBUG - Acoustic model meta information:
2021-11-14 20:05:09,779 - align - DEBUG - architecture: gmm-hmm
features:
fmllr: true
frame_shift: 10
pitch: false
type: mfcc
use_energy: false
has_speaker_independent_model: false
multilingual_ipa: false
phone_type: triphone
phones: !!set
a1: null
a2: null
a3: null
a4: null
a5: null
ai1: null
ai2: null
ai3: null
ai4: null
ai5: null
ao1: null
ao2: null
ao3: null
ao4: null
ao5: null
b: null
c: null
ch: null
d: null
e1: null
e2: null
e3: null
e4: null
e5: null
ei1: null
ei2: null
ei3: null
ei4: null
f: null
g: null
h: null
i1: null
i2: null
i3: null
i4: null
i5: null
ia1: null
ia2: null
ia3: null
ia4: null
ia5: null
iao1: null
iao2: null
iao3: null
iao4: null
ie1: null
ie2: null
ie3: null
ie4: null
ie5: null
ii1: null
ii2: null
ii3: null
ii4: null
ii5: null
io1: null
io2: null
io3: null
io4: null
iou1: null
iou2: null
iou3: null
iou4: null
iu1: null
iu2: null
iu3: null
iu4: null
iu5: null
j: null
k: null
l: null
m: null
n: null
ng: null
o1: null
o2: null
o3: null
o4: null
o5: null
ou1: null
ou2: null
ou3: null
ou4: null
ou5: null
p: null
q: null
r: null
s: null
sh: null
t: null
u1: null
u2: null
u3: null
u4: null
u5: null
ua1: null
ua2: null
ua3: null
ua4: null
ua5: null
uai1: null
uai2: null
uai3: null
uai4: null
ue1: null
ue2: null
ue3: null
ue4: null
ue5: null
uei1: null
uei2: null
uei3: null
uei4: null
uei5: null
uo1: null
uo2: null
uo3: null
uo4: null
uo5: null
v1: null
v2: null
v3: null
v4: null
v5: null
va1: null
va2: null
va3: null
va4: null
ve1: null
ve2: null
ve3: null
ve4: null
x: null
z: null
zh: null
uses_lda: false
uses_sat: false
version: 1.0.0
2021-11-14 20:05:09,779 - align - DEBUG -
2021-11-14 20:05:09,779 - align - INFO - Setting up corpus information...
2021-11-14 20:05:09,784 - align - INFO - Found old run with 1 rather than the current 3, setting to 1. If you would like to use 3, re-run the command with --clean.
2021-11-14 20:05:09,784 - align - DEBUG - Loading from temporary files...
2021-11-14 20:05:09,823 - align - DEBUG - Loaded from corpus_data temp directory in 0.039043426513671875 seconds
2021-11-14 20:05:09,823 - align - DEBUG - Successfully loaded from temporary files
2021-11-14 20:05:09,830 - align - INFO - Number of speakers in corpus: 1, average number of utterances per speaker: 6.0
2021-11-14 20:05:09,834 - align - INFO - Parsing dictionary "dict_train" without pronunciation probabilities without silence probabilities
2021-11-14 20:05:09,838 - align - DEBUG - "dict_train" DICTIONARY INFORMATION
2021-11-14 20:05:09,838 - align - DEBUG - Has NO pronunciation probabilities
2021-11-14 20:05:09,838 - align - DEBUG - Has NO silence probabilities
2021-11-14 20:05:09,838 - align - DEBUG - Grapheme set: 1, 2, 3, 4, 5, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
2021-11-14 20:05:09,838 - align - DEBUG - Phone set: a1, a2, a3, a4, ai1, ai2, ai3, ai4, ao1, ao3, ao4, b, c, ch, d, e1, e2, e3, e4, e5, ei2, ei3, ei4, f, g, h, i1, i2, i3, i4, ia1, ia2, ia3, ia4, iao1, iao3, ie3, ie4, ii1, ii2, ii3, ii4, io3, io4, iou2, iou3, iou4, iu2, j, k, l, m, n, ng, o1, o2, o3, o4, ou1, ou2, ou3, ou4, p, q, r, s, sh, t, u1, u2, u3, u4, ua1, ua2, ua3, ua4, uai4, ue1, ue2, uei2, uei4, uo1, uo2, uo4, v1, v2, v3, v4, va2, va4, ve2, ve4, x, z, zh
2021-11-14 20:05:09,838 - align - DEBUG - Punctuation: 、。।,@<>"(),.:;¿?¡!\&%#*~【】,…‥「」『』〝〟″⟨⟩♪・‹›«»~′$+=‘
2021-11-14 20:05:09,838 - align - DEBUG - Clitic markers: '’
2021-11-14 20:05:09,838 - align - DEBUG - Clitic set:
2021-11-14 20:05:09,839 - align - INFO - Creating dictionary information...
2021-11-14 20:05:09,864 - align - INFO - Setting up training data...
2021-11-14 20:05:10,034 - align - INFO - Done with setup!
2021-11-14 20:05:10,035 - align - DEBUG - Setup pretrained aligner in 0.19635796546936035 seconds
2021-11-14 20:05:10,035 - align - DEBUG - Compiling training graphs...
2021-11-14 20:05:10,183 - align - DEBUG - There were 1 kaldi processing files that had errors:
2021-11-14 20:05:10,184 - align - DEBUG -
2021-11-14 20:05:10,184 - align - DEBUG - /root/Documents/MFA/mandarin_example/align/log/align.0.log
2021-11-14 20:05:10,184 - align - DEBUG - /root/miniconda3/envs/aligner/bin/gmm-boost-silence --boost=1.0 6 /root/Documents/MFA/mandarin_example/align/final.mdl -
2021-11-14 20:05:10,184 - align - DEBUG - /root/miniconda3/envs/aligner/bin/gmm-align-compiled --transition-scale=1.0 --acoustic-scale=0.1 --self-loop-scale=0.1 --beam=100 --retry-beam=400 --careful=false - scp:/root/Documents/MFA/mandarin_example/align/fsts.dict_train.0.scp 'ark,s,cs:add-deltas scp:/root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp ark:- | apply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 ark:- ark:- |' ark:/root/Documents/MFA/mandarin_example/align/ali.dict_train.0.ark
2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-boost-silence[5.5.985]:main():gmmbin/gmm-boost-silence.cc:82) The pdfs for the silence phones may be shared by other phones (note: this probably does not matter.)
2021-11-14 20:05:10,184 - align - DEBUG - LOG (gmm-boost-silence[5.5.985]:main():gmmbin/gmm-boost-silence.cc:93) Boosted weights for 5 pdfs, by factor of 1
2021-11-14 20:05:10,184 - align - DEBUG - LOG (gmm-boost-silence[5.5.985]:main():gmmbin/gmm-boost-silence.cc:103) Wrote model to -
2021-11-14 20:05:10,184 - align - DEBUG - add-deltas scp:/root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp ark:-
2021-11-14 20:05:10,184 - align - DEBUG - WARNING (add-deltas[5.5.985]:Open():util/kaldi-table-inl.h:106) Failed to open script file /root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp
2021-11-14 20:05:10,184 - align - DEBUG - ERROR (add-deltas[5.5.985]:SequentialTableReader():util/kaldi-table-inl.h:860) Error constructing TableReader: rspecifier is scp:/root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp
2021-11-14 20:05:10,184 - align - DEBUG - kaldi::KaldiFatalErrorapply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 ark:- ark:-
2021-11-14 20:05:10,184 - align - DEBUG - LOG (apply-cmvn-sliding[5.5.985]:main():featbin/apply-cmvn-sliding.cc:75) Applied sliding-window cepstral mean normalization to 0 utterances, 0 had errors.
2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 774-mandarin-example
2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 814-mandarin-example
2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 854-mandarin-example
2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 894-mandarin-example
2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 934-mandarin-example
2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 974-mandarin-example
2021-11-14 20:05:10,184 - align - DEBUG - LOG (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:135) Overall log-likelihood per frame is -nan over 0 frames.
2021-11-14 20:05:10,184 - align - DEBUG - LOG (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:137) Retried 0 out of 6 utterances.
2021-11-14 20:05:10,184 - align - DEBUG - LOG (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:139) Done 0, errors on 6
2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:Close():util/kaldi-io.cc:515) Pipe add-deltas scp:/root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp ark:- | apply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 ark:- ark:- | had nonzero return status 256
What does this problem mean? Thanks a lot.
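A log this long is easier to read if only the Kaldi `WARNING` and `ERROR` messages are pulled out. The following is a minimal sketch, assuming the messages keep the `LEVEL (tool[version]:function():file:line) message` shape seen in the log above; `KALDI_MSG` and `kaldi_problems` are hypothetical names, not part of MFA or Kaldi. The sample lines are copied from the log.

```python
import re

# Matches Kaldi-style messages such as
#   WARNING (add-deltas[5.5.985]:Open():util/kaldi-table-inl.h:106) Failed to open ...
KALDI_MSG = re.compile(
    r"(?P<level>WARNING|ERROR) "
    r"\((?P<tool>[\w-]+)\[[^\]]+\]:(?P<func>\w+)\(\):(?P<src>[^)]+)\) "
    r"(?P<msg>.*)"
)


def kaldi_problems(lines):
    """Return (level, tool, message) for every Kaldi WARNING/ERROR line."""
    found = []
    for line in lines:
        match = KALDI_MSG.search(line)
        if match:
            found.append(
                (match.group("level"), match.group("tool"), match.group("msg").strip())
            )
    return found


SAMPLE = [
    "2021-11-14 20:05:10,184 - align - DEBUG - LOG (gmm-boost-silence[5.5.985]:main():gmmbin/gmm-boost-silence.cc:93) Boosted weights for 5 pdfs, by factor of 1",
    "2021-11-14 20:05:10,184 - align - DEBUG - WARNING (add-deltas[5.5.985]:Open():util/kaldi-table-inl.h:106) Failed to open script file /root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp",
    "2021-11-14 20:05:10,184 - align - DEBUG - ERROR (add-deltas[5.5.985]:SequentialTableReader():util/kaldi-table-inl.h:860) Error constructing TableReader: rspecifier is scp:/root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp",
]

for level, tool, msg in kaldi_problems(SAMPLE):
    print(f"{level:7} {tool}: {msg}")
```

Read this way, the earliest failure in the log is `add-deltas` being unable to open `feats.dict_train.0.scp`. Every "No features for utterance" warning comes after it in the log. The log also notes that an old temporary directory from a 1.0.0 run is being reused and suggests the `--clean` flag, so that may be worth trying first.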