MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

Questions about Kaldi Binaries #353

Open pleasurepants opened 2 years ago

pleasurepants commented 2 years ago

Hello there. I'm running into a problem when I try to use a dictionary to align the Mandarin example. The dictionary I'm using was produced with the mfa g2p command, like this: [image] [image]

But when I use that dictionary to align the same dataset, I get this problem: [image]

Here is the align log:

2021-11-14 20:05:09,747 - align - DEBUG - ALIGN CONFIG:

2021-11-14 20:05:09,752 - align - DEBUG - !!python/object:montreal_forced_aligner.config.align_config.AlignConfig

acoustic_scale: 0.1

beam: 100

boost_silence: 1.0

cleanup_textgrids: true

clitic_markers: "'\u2019"

compound_markers: -/

data_directory: null

debug: false

digraphs:

disable_sat: false

feature_config: !!python/object:montreal_forced_aligner.config.feature_config.FeatureConfig

allow_downsample: true

allow_upsample: true

cleanup_textgrids: true

debug: false

deltas: true

fmllr: false

frame_shift: 10

high_frequency: 7800

lda: false

low_frequency: 20

overwrite: false

pitch: false

sample_frequency: 16000

snip_edges: true

splice_left_context: 3

splice_right_context: 3

type: mfcc

use_energy: false

use_mp: true

fmllr_update_type: full

initial_fmllr: true

iteration: null

multilingual_ipa: false

overwrite: false

punctuation: "\u3001\u3002\u0964\uFF0C@<>\"(),.:;\xBF?\xA1!\\&%#*~\u3010\u3011\uFF0C\

\u2026\u2025\u300C\u300D\u300E\u300F\u301D\u301F\u2033\u27E8\u27E9\u266A\u30FB\u2039\

\u203A\xAB\xBB\uFF5E\u2032$+=\u2018"

retry_beam: 400

self_loop_scale: 0.1

strip_diacritics:

transition_scale: 1.0

use_fmllr_mp: false

use_mp: true

2021-11-14 20:05:09,753 - align - WARNING - WARNING: Using old temp directory, this might not be ideal for you, use the --clean flag to ensure no weird behavior for previous versions of the temporary directory.

2021-11-14 20:05:09,753 - align - DEBUG - Previous run ended in an error (maybe ctrl-c?)

2021-11-14 20:05:09,753 - align - DEBUG - Previous run used source directory path ../mandarin/mandarin_example (new run: ./mandarin_example)

2021-11-14 20:05:09,754 - align - DEBUG - Previous run was on 1.0.0 version (new run: 2.0.0b4.dev8+g2403bd5.d20211108)

2021-11-14 20:05:09,754 - align - DEBUG - Previous run used dictionary path ../mandarin/mandarin_mtts.lexicon.txt (new run: ./dict_train.txt)

2021-11-14 20:05:09,755 - align - DEBUG - Previous run used acoustic model path None (new run: ./mandarin.zip)

2021-11-14 20:05:09,755 - align - DEBUG -

2021-11-14 20:05:09,755 - align - DEBUG - ====ACOUSTIC MODEL INFO====

2021-11-14 20:05:09,755 - align - DEBUG - Acoustic model root directory: /root/Documents/MFA/mandarin_example/acoustic_models

2021-11-14 20:05:09,755 - align - DEBUG - Acoustic model dirname: /root/Documents/MFA/mandarin_example/acoustic_models/mandarin

2021-11-14 20:05:09,755 - align - DEBUG - Acoustic model meta path: /root/Documents/MFA/mandarin_example/acoustic_models/mandarin/meta.yaml

2021-11-14 20:05:09,755 - align - DEBUG - Acoustic model meta information:

2021-11-14 20:05:09,779 - align - DEBUG - architecture: gmm-hmm

features:

fmllr: true

frame_shift: 10

pitch: false

type: mfcc

use_energy: false

has_speaker_independent_model: false

multilingual_ipa: false

phone_type: triphone

phones: !!set

a1: null

a2: null

a3: null

a4: null

a5: null

ai1: null

ai2: null

ai3: null

ai4: null

ai5: null

ao1: null

ao2: null

ao3: null

ao4: null

ao5: null

b: null

c: null

ch: null

d: null

e1: null

e2: null

e3: null

e4: null

e5: null

ei1: null

ei2: null

ei3: null

ei4: null

f: null

g: null

h: null

i1: null

i2: null

i3: null

i4: null

i5: null

ia1: null

ia2: null

ia3: null

ia4: null

ia5: null

iao1: null

iao2: null

iao3: null

iao4: null

ie1: null

ie2: null

ie3: null

ie4: null

ie5: null

ii1: null

ii2: null

ii3: null

ii4: null

ii5: null

io1: null

io2: null

io3: null

io4: null

iou1: null

iou2: null

iou3: null

iou4: null

iu1: null

iu2: null

iu3: null

iu4: null

iu5: null

j: null

k: null

l: null

m: null

n: null

ng: null

o1: null

o2: null

o3: null

o4: null

o5: null

ou1: null

ou2: null

ou3: null

ou4: null

ou5: null

p: null

q: null

r: null

s: null

sh: null

t: null

u1: null

u2: null

u3: null

u4: null

u5: null

ua1: null

ua2: null

ua3: null

ua4: null

ua5: null

uai1: null

uai2: null

uai3: null

uai4: null

ue1: null

ue2: null

ue3: null

ue4: null

ue5: null

uei1: null

uei2: null

uei3: null

uei4: null

uei5: null

uo1: null

uo2: null

uo3: null

uo4: null

uo5: null

v1: null

v2: null

v3: null

v4: null

v5: null

va1: null

va2: null

va3: null

va4: null

ve1: null

ve2: null

ve3: null

ve4: null

x: null

z: null

zh: null

uses_lda: false

uses_sat: false

version: 1.0.0

2021-11-14 20:05:09,779 - align - DEBUG -

2021-11-14 20:05:09,779 - align - INFO - Setting up corpus information...

2021-11-14 20:05:09,784 - align - INFO - Found old run with 1 rather than the current 3, setting to 1. If you would like to use 3, re-run the command with --clean.

2021-11-14 20:05:09,784 - align - DEBUG - Loading from temporary files...

2021-11-14 20:05:09,823 - align - DEBUG - Loaded from corpus_data temp directory in 0.039043426513671875 seconds

2021-11-14 20:05:09,823 - align - DEBUG - Successfully loaded from temporary files

2021-11-14 20:05:09,830 - align - INFO - Number of speakers in corpus: 1, average number of utterances per speaker: 6.0

2021-11-14 20:05:09,834 - align - INFO - Parsing dictionary "dict_train" without pronunciation probabilities without silence probabilities

2021-11-14 20:05:09,838 - align - DEBUG - "dict_train" DICTIONARY INFORMATION

2021-11-14 20:05:09,838 - align - DEBUG - Has NO pronunciation probabilities

2021-11-14 20:05:09,838 - align - DEBUG - Has NO silence probabilities

2021-11-14 20:05:09,838 - align - DEBUG - Grapheme set: 1, 2, 3, 4, 5, a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z

2021-11-14 20:05:09,838 - align - DEBUG - Phone set: a1, a2, a3, a4, ai1, ai2, ai3, ai4, ao1, ao3, ao4, b, c, ch, d, e1, e2, e3, e4, e5, ei2, ei3, ei4, f, g, h, i1, i2, i3, i4, ia1, ia2, ia3, ia4, iao1, iao3, ie3, ie4, ii1, ii2, ii3, ii4, io3, io4, iou2, iou3, iou4, iu2, j, k, l, m, n, ng, o1, o2, o3, o4, ou1, ou2, ou3, ou4, p, q, r, s, sh, t, u1, u2, u3, u4, ua1, ua2, ua3, ua4, uai4, ue1, ue2, uei2, uei4, uo1, uo2, uo4, v1, v2, v3, v4, va2, va4, ve2, ve4, x, z, zh

2021-11-14 20:05:09,838 - align - DEBUG - Punctuation: 、。।,@<>"(),.:;¿?¡!\&%#*~【】,…‥「」『』〝〟″⟨⟩♪・‹›«»~′$+=‘

2021-11-14 20:05:09,838 - align - DEBUG - Clitic markers: '’

2021-11-14 20:05:09,838 - align - DEBUG - Clitic set:

2021-11-14 20:05:09,839 - align - INFO - Creating dictionary information...

2021-11-14 20:05:09,864 - align - INFO - Setting up training data...

2021-11-14 20:05:10,034 - align - INFO - Done with setup!

2021-11-14 20:05:10,035 - align - DEBUG - Setup pretrained aligner in 0.19635796546936035 seconds

2021-11-14 20:05:10,035 - align - DEBUG - Compiling training graphs...

2021-11-14 20:05:10,183 - align - DEBUG - There were 1 kaldi processing files that had errors:

2021-11-14 20:05:10,184 - align - DEBUG -

2021-11-14 20:05:10,184 - align - DEBUG - /root/Documents/MFA/mandarin_example/align/log/align.0.log

2021-11-14 20:05:10,184 - align - DEBUG - /root/miniconda3/envs/aligner/bin/gmm-boost-silence --boost=1.0 6 /root/Documents/MFA/mandarin_example/align/final.mdl -

2021-11-14 20:05:10,184 - align - DEBUG - /root/miniconda3/envs/aligner/bin/gmm-align-compiled --transition-scale=1.0 --acoustic-scale=0.1 --self-loop-scale=0.1 --beam=100 --retry-beam=400 --careful=false - scp:/root/Documents/MFA/mandarin_example/align/fsts.dict_train.0.scp 'ark,s,cs:add-deltas scp:/root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp ark:- | apply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 ark:- ark:- |' ark:/root/Documents/MFA/mandarin_example/align/ali.dict_train.0.ark

2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-boost-silence[5.5.985]:main():gmmbin/gmm-boost-silence.cc:82) The pdfs for the silence phones may be shared by other phones (note: this probably does not matter.)

2021-11-14 20:05:10,184 - align - DEBUG - LOG (gmm-boost-silence[5.5.985]:main():gmmbin/gmm-boost-silence.cc:93) Boosted weights for 5 pdfs, by factor of 1

2021-11-14 20:05:10,184 - align - DEBUG - LOG (gmm-boost-silence[5.5.985]:main():gmmbin/gmm-boost-silence.cc:103) Wrote model to -

2021-11-14 20:05:10,184 - align - DEBUG - add-deltas scp:/root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp ark:-

2021-11-14 20:05:10,184 - align - DEBUG - WARNING (add-deltas[5.5.985]:Open():util/kaldi-table-inl.h:106) Failed to open script file /root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp

2021-11-14 20:05:10,184 - align - DEBUG - ERROR (add-deltas[5.5.985]:SequentialTableReader():util/kaldi-table-inl.h:860) Error constructing TableReader: rspecifier is scp:/root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp

2021-11-14 20:05:10,184 - align - DEBUG - kaldi::KaldiFatalErrorapply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 ark:- ark:-

2021-11-14 20:05:10,184 - align - DEBUG - LOG (apply-cmvn-sliding[5.5.985]:main():featbin/apply-cmvn-sliding.cc:75) Applied sliding-window cepstral mean normalization to 0 utterances, 0 had errors.

2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 774-mandarin-example

2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 814-mandarin-example

2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 854-mandarin-example

2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 894-mandarin-example

2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 934-mandarin-example

2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 974-mandarin-example

2021-11-14 20:05:10,184 - align - DEBUG - LOG (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:135) Overall log-likelihood per frame is -nan over 0 frames.

2021-11-14 20:05:10,184 - align - DEBUG - LOG (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:137) Retried 0 out of 6 utterances.

2021-11-14 20:05:10,184 - align - DEBUG - LOG (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:139) Done 0, errors on 6

2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:Close():util/kaldi-io.cc:515) Pipe add-deltas scp:/root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp ark:- | apply-cmvn-sliding --norm-vars=false --center=true --cmn-window=300 ark:- ark:- | had nonzero return status 256

What does this error mean? Thanks a lot.
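In the log above, the first Kaldi-level failure is add-deltas being unable to open feats.dict_train.0.scp; the "No features for utterance" warnings that follow appear to be consequences of that. As a rough aid for reading logs like this one, here is a minimal sketch that pulls out the Kaldi ERROR/WARNING messages and reports the first ERROR. The function names and the regular expression are illustrative assumptions, not part of MFA or Kaldi.

```python
import re

# Matches Kaldi-style messages as they appear in the log above, e.g.
# "ERROR (add-deltas[5.5.985]:SequentialTableReader():util/kaldi-table-inl.h:860) ..."
# The pattern is an assumption based on this log, not an official format spec.
KALDI_MSG = re.compile(r"(ERROR|WARNING) \(([\w-]+)\[[^\]]*\]:.*?\) (.*)")


def kaldi_problems(log_text):
    """Return (level, binary, message) tuples for Kaldi ERROR/WARNING lines."""
    problems = []
    for line in log_text.splitlines():
        match = KALDI_MSG.search(line)
        if match:
            problems.append(match.groups())
    return problems


def first_error(log_text):
    """Return (binary, message) for the first Kaldi ERROR, or None."""
    for level, binary, message in kaldi_problems(log_text):
        if level == "ERROR":
            return binary, message
    return None


# Three lines copied from the log in this thread.
SAMPLE_LOG = """\
2021-11-14 20:05:10,184 - align - DEBUG - WARNING (add-deltas[5.5.985]:Open():util/kaldi-table-inl.h:106) Failed to open script file /root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp
2021-11-14 20:05:10,184 - align - DEBUG - ERROR (add-deltas[5.5.985]:SequentialTableReader():util/kaldi-table-inl.h:860) Error constructing TableReader: rspecifier is scp:/root/Documents/MFA/mandarin_example/corpus_data/split1/feats.dict_train.0.scp
2021-11-14 20:05:10,184 - align - DEBUG - WARNING (gmm-align-compiled[5.5.985]:main():gmmbin/gmm-align-compiled.cc:103) No features for utterance 774-mandarin-example
"""

if __name__ == "__main__":
    binary, message = first_error(SAMPLE_LOG)
    print(f"first failing binary: {binary}")
    print(f"message: {message}")
```

Looking at the earliest ERROR rather than the last warning usually points closer to the root cause, since later pipeline stages only report that they received no input.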

Tian14267 commented 2 years ago

Hi, what is **mandarin.zip** in the last picture?

mfa align ./mandarin_example_cp/ ./dict_new.txt ./mandarin.zip ./output/

I don't have this step.

pleasurepants commented 2 years ago

It's a pretrained model

Tian14267 commented 2 years ago

First, use mfa train ./data dict.txt ./out -o ./out_model to train a model on your data. Second, use the model from the first step to align: mfa align ./data dict.txt out_model.zip ./out_model
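The two steps can be sketched as a small helper that only builds the command lines, without running them. The argument order is copied from the comment above and may differ between MFA versions, so it is worth checking mfa train --help and mfa align --help first. The helper name is hypothetical, and the ".zip" suffix on the model path is an assumption taken from the comment.

```python
def train_then_align_commands(corpus_dir, dictionary, train_out_dir,
                              model_path, align_out_dir):
    """Build the train and align command lines described in the comment.

    Assumption: training with "-o <model_path>" leaves a "<model_path>.zip"
    archive, as the comment above implies.
    """
    train = ["mfa", "train", corpus_dir, dictionary, train_out_dir,
             "-o", model_path]
    align = ["mfa", "align", corpus_dir, dictionary, model_path + ".zip",
             align_out_dir]
    return train, align


if __name__ == "__main__":
    # Paths follow the comment; "./aligned" is a hypothetical output directory.
    train, align = train_then_align_commands(
        "./data", "dict.txt", "./out", "./out_model", "./aligned"
    )
    print(" ".join(train))
    print(" ".join(align))
    # To actually run them, pass each list to subprocess.run(cmd, check=True).
```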

mmcauliffe commented 2 years ago

Oh hmm, it looks like you're running this command as root, but you're in a user's document directory. Unfortunately, by default, MFA uses an expanded ~/Documents/MFA as the temporary directory, and it seems it got expanded to /root/Documents/MFA. You can try changing the temporary directory with the --temp_directory flag and see if it runs better. It might also be fixed by just not running it as the root user, if that's an option.
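A minimal sketch of that suggestion, assuming a Unix-like system: it shows where ~/Documents/MFA ends up for a given home directory, checks whether the current process is root, and builds an align command with --temp_directory. It also adds --clean, which the log's own warning recommends when an old temporary directory is reused. The helper names and the ./mfa_temp path are hypothetical; the corpus, dictionary and model paths are the ones from the log, and ./output/ is the output directory from the command quoted earlier in the thread.

```python
import os


def default_mfa_temp_dir(home=None):
    """Where ~/Documents/MFA lands for the given (or current) home directory."""
    home = home or os.path.expanduser("~")
    return os.path.join(home, "Documents", "MFA")


def running_as_root():
    """True if the effective user is root (always False where geteuid is missing)."""
    return hasattr(os, "geteuid") and os.geteuid() == 0


def align_command(corpus, dictionary, model, output, temp_dir=None, clean=False):
    """Build an mfa align command line, optionally with the two flags from this thread."""
    cmd = ["mfa", "align", corpus, dictionary, model, output]
    if temp_dir:
        cmd += ["--temp_directory", temp_dir]
    if clean:
        cmd.append("--clean")
    return cmd


if __name__ == "__main__":
    print("default temp directory:", default_mfa_temp_dir())
    if running_as_root():
        print("running as root: the temp directory is under root's home")
    cmd = align_command("./mandarin_example", "./dict_train.txt",
                        "./mandarin.zip", "./output/",
                        temp_dir="./mfa_temp", clean=True)
    print(" ".join(cmd))
```

Run as root, the first function reproduces the /root/Documents/MFA path seen throughout the log, which is a quick way to confirm where MFA is looking for its temporary files.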