MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

aligning using adapted acoustic model #658

Open echodroff opened 1 year ago

echodroff commented 1 year ago
Hm, I'm running into a similar issue with another Evenki-adapted English model, and adding --position_dependent_phones false no longer seems to be doing the trick. :/

mfa version 2.2.11

The model is here (base model is english_mfa 2.0): https://drive.google.com/file/d/1o55zvIn3kbUQK4jKPFFSOsIZcs44YfLK/view?usp=sharing

Dictionary is the same as above

The code: mfa align --clean /Users/eleanor/evenki/mfainput/short/tst pronDictIntermediateEvenkiEnglish evenki_adapted_english_short /Users/eleanor/evenki/train_test/short/train_full_adapted/tstoutput --position_dependent_phones false

It stops at the alignment stage.

The log file: compile_train_graphs.3.log

Originally posted by @echodroff in https://github.com/MontrealCorpusTools/Montreal-Forced-Aligner/issues/641#issuecomment-1603200094

mmcauliffe commented 1 year ago

Can you try updating to the latest version? I should have fixed this in 2.2.12

echodroff commented 1 year ago

I retried with 2.2.12 and now 2.2.14 (both re-adapting and re-aligning), and I'm still getting an error. Version 2.2.14 is giving me this log file:

align.3.log

Model trained with 2.2.14 is here: https://drive.google.com/file/d/1Jl6HBIsOydb03zRkxJ-DQ6gBMMOIZiMh/view?usp=sharing

Code to adapt (this seems to work): mfa adapt --clean ~/evenki/orig/mfainput/short/train pronDictIntermediateEvenkiEnglish english_mfa ~/Documents/MFA/pretrained_models/acoustic/evenki_adapted_english_short.zip

Code to align (this does not): mfa align --clean ~/evenki/orig/mfainput/short/tst pronDictIntermediateEvenkiEnglish evenki_adapted_english_short ~/evenki/train_test/short/train_full_adapted/tstoutput --position_dependent_phones false

mmcauliffe commented 1 year ago

Hmm, ok, I'm having trouble replicating this on 2.2.15 (which mostly just fixes some issues with dependencies breaking, so it shouldn't affect this). Can you try deleting ~/Documents/MFA/pretrained_models/acoustic/evenki_adapted_english_short.zip in case it isn't overwriting properly, and rerunning the commands? You can drop the --position_dependent_phones false since that workaround is no longer necessary. Very confused since it's properly adapting and aligning on my machine...

echodroff commented 1 year ago

Yeah, this is very strange. I'm now on v2.2.15, and have tried re-adapting on this version, and the align function with that model still gets the same error. I've tried removing and just moving the acoustic model, changing the test data, and also switching computers (from Apple M1 Ventura 13.4 to Apple M2 Monterey 12.5). I keep getting the same log file errors:

WARNING (gmm-align-compiled[5.5.1068]:Close():util/kaldi-io.cc:515) Pipe splice-feats --left-context=3 --right-context=3 scp,s,cs:"/Users/eleanor/Documents/MFA/tst/tst/split3/feats.1.1.scp" ark:- | transform-feats "/Users/eleanor/Documents/MFA/tst/alignment/lda.mat" ark:- ark:- | had nonzero return status 256

(with a bunch of "No features for utterance X" warnings before that, and "Transform matrix for utterance X has bad dimension 40x112 versus feat dim 105" warnings before those)
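
The numbers in that warning can be unpacked with a little arithmetic. The sketch below is only a reading of the logged values, not a diagnosis confirmed anywhere in this thread. It assumes the splice window from the pipe in the log (--left-context=3 --right-context=3, so 7 stacked frames); the function names explain_mismatch and per_frame are hypothetical helpers, not part of MFA or Kaldi.

```python
import re

# Message format as quoted in this thread; "X" stands in for an utterance ID.
PATTERN = re.compile(
    r"Transform matrix for utterance (\S+) has bad dimension "
    r"(\d+)x(\d+) versus feat dim (\d+)"
)


def per_frame(total, window):
    """Return total / window if it divides evenly, else None."""
    quotient, remainder = divmod(total, window)
    return quotient if remainder == 0 else None


def explain_mismatch(line, left_context=3, right_context=3):
    """Hypothetical helper: per-frame dimensions implied by the warning."""
    match = PATTERN.search(line)
    if match is None:
        return None
    rows, cols, feat_dim = (int(g) for g in match.groups()[1:])
    # splice-feats stacks the current frame with its left and right context.
    window = left_context + 1 + right_context
    return {
        "utterance": match.group(1),
        "window": window,
        "transform_rows": rows,
        # Per-frame size of the features actually produced at align time.
        "per_frame_dim_computed": per_frame(feat_dim, window),
        # Per-frame size the transform matrix appears to have been built for.
        "per_frame_dim_expected": per_frame(cols, window),
    }


line = ("Transform matrix for utterance X has bad dimension "
        "40x112 versus feat dim 105")
info = explain_mismatch(line)
print(info)
```

Under those assumptions, 105 = 7 x 15 and 112 = 7 x 16, so the features computed during alignment look one dimension per frame short of what lda.mat was estimated on. That would point to a difference in feature configuration between adaptation and alignment, but which setting differs is not established here.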

align.3.log

It seems like something might be messed up with the adapted model, but you were able to get the model to align data on your computer?

fish0510 commented 5 months ago

I also encountered the same problem on version 2.2.17 while adapting mandarin_mfa v2.0 with the m4singer dataset. Both aligning with the pretrained mandarin_mfa and adapting the mandarin_mfa model worked successfully. Is there a solution?

align.1.log