[BUG] MFA 2.2.4 - KaldiProcessingError

jadestorm commented 1 year ago

Debugging checklist

[x] Have you updated to latest MFA version? yes, 2.2.4
[x] Have you tried rerunning the command with the --clean flag? yes

Describe the issue

I'm running a validation by following a set of steps one of our researchers uses for his class. He (who is far more familiar with all of this) and I (IT staff) both run into the same error, and neither of us is sure where to go from here.

For Reproducing your issue

TBH I have no idea what the answers to these questions are, so I'm skipping them for now. I CAN link you the lesson: https://phon.wordpress.ncsu.edu/workshops/eng-523-tutorial/part-1/ I ran the mfa server init part before this, and the failure happens at the very first mfa validate attempt.

Please fill out the following:

Corpus structure
- What language is the corpus in?
- How many files/speakers?
- Are you using lab files or TextGrid files for input?
Dictionary
- Are you using a dictionary from MFA? If so, which one?
- If it's a custom dictionary, what is the phoneset?
Acoustic model
- If you're using an acoustic model, is it one download through MFA? If so, which one?
- If it's a model you've trained, what data was it trained on?

Log file

(aligner) daniel@phon-0:/phon/ENG523/daniel$ mfa validate ../files/jeff_vowelplot english_slaap english_us_arpa --clean
 INFO     Setting up corpus information...
 INFO     Loading corpus from source files...
   0% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/100  [ 0:00:01 < -:--:-- , ? it/s ]
 INFO     Found 1 speaker across 1 file, average number of utterances per
          speaker: 1.0
 INFO     Initializing multiprocessing jobs...
 WARNING  Number of jobs was specified as 3, but due to only having 1 speakers,
          MFA will only use 1 jobs. Use the --single_speaker flag if you would
          like to split utterances across jobs regardless of their speaker.
 INFO     Normalizing text...
 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/1  [ 0:00:02 < 0:00:00 , ? it/s ]
 INFO     Creating corpus split for feature generation...
   0% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/2  [ 0:00:01 < -:--:-- , ? it/s ]
 INFO     Generating MFCCs...
 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/1  [ 0:00:00 < 0:00:00 , ? it/s ]
 INFO     Calculating CMVN...
 INFO     Generating final features...
   0% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/1  [ 0:00:01 < -:--:-- , ? it/s ]
 INFO     Creating corpus split with features...
 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/1  [ 0:00:01 < 0:00:00 , ? it/s ]
 INFO     Corpus
 INFO     1 sound files
 INFO     1 text files
 INFO     1 speakers
 INFO     1 utterances
 INFO     167.874 seconds total duration
 INFO     Sound file read errors
 INFO     There were no issues reading sound files.
 INFO     Feature generation
 INFO     There were no utterances missing features.
 INFO     Files without transcriptions
 INFO     There were no sound files missing transcriptions.
 INFO     Transcriptions without sound files
 INFO     There were no transcription files missing sound files.
 INFO     Dictionary
 INFO     Out of vocabulary words
 INFO     There were no missing words from the dictionary. If you plan on using
          the a model trained on this dataset to align other datasets in the
          future, it is recommended that there be at least some missing words.
 INFO     Training
 INFO     Initializing training for monophone...
 INFO     Compiling training graphs...
 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/1  [ 0:00:00 < 0:00:00 , ? it/s ]
 INFO     Generating initial alignments...
 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/1  [ 0:00:00 < 0:00:00 , ? it/s ]
 INFO     Initialization complete!
 INFO     monophone - Iteration 1 of 40
 INFO     Generating alignments...
   0% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/1  [ 0:00:14 < -:--:-- , ? it/s ]
 ERROR    There was an error in the run, please see the log.
Exception ignored in atexit callback: <bound method ExitHooks.history_save_handler of <montreal_forced_aligner.command_line.mfa.ExitHooks object at 0x7f48b068efe0>>
Traceback (most recent call last):
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/command_line/mfa.py", line 97, in history_save_handler
    raise self.exception
  File "/home/daniel/.conda/envs/aligner/bin/mfa", line 11, in <module>
    sys.exit(mfa_cli())
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/rich_click/rich_group.py", line 21, in main
    rv = super().main(*args, standalone_mode=False, **kwargs)
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/command_line/validate.py", line 113, in validate_corpus_cli
    validator.validate()
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/validation/corpus_validator.py", line 602, in validate
    self.train()
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/acoustic_modeling/trainer.py", line 529, in train
    trainer.train()
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/acoustic_modeling/base.py", line 494, in train
    self.train_iteration()
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/acoustic_modeling/base.py", line 466, in train_iteration
    self.align_iteration()
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/acoustic_modeling/base.py", line 439, in align_iteration
    self.align_utterances(training=True)
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/alignment/mixins.py", line 436, in align_utterances
    for utterance, log_likelihood in run_kaldi_function(
  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/utils.py", line 753, in run_kaldi_function
    raise v
montreal_forced_aligner.exceptions.MultiprocessingError: MultiprocessingError:

Job 1 encountered an error:
Traceback (most recent call last):

  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/abc.py", line 85, in run
    yield from self._run()

  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/alignment/multiprocessing.py", line 955, in _run
    self.check_call(align_proc)

  File "/home/daniel/.conda/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/abc.py", line 112, in check_call
    raise KaldiProcessingError([self.log_path])

montreal_forced_aligner.exceptions.KaldiProcessingError: KaldiProcessingError:

There were 1 job(s) with errors when running Kaldi binaries.
See the log files below for more information.
/home/daniel/Documents/MFA/jeff_vowelplot/monophone/log/align.1.1.log
(aligner) daniel@phon-0:/phon/ENG523/daniel$ cat /home/daniel/Documents/MFA/jeff_vowelplot/monophone/log/align.1.1.log
/home/daniel/.conda/envs/aligner/bin/gmm-boost-silence --boost=1.25 1 /home/daniel/Documents/MFA/jeff_vowelplot/monophone/1.mdl -
LOG (gmm-boost-silence[5.5.1016]:main():gmmbin/gmm-boost-silence.cc:93) Boosted weights for 5 pdfs, by factor of 1.25
/home/daniel/.conda/envs/aligner/bin/gmm-align-compiled --transition-scale=1.0 --acoustic-scale=0.1 --self-loop-scale=0.1 --beam=6 --retry-beam=40 --careful=false --write-per-frame-acoustic-loglikes=ark:/home/daniel/Documents/MFA/jeff_vowelplot/monophone/like.1.1.ark - ark,s,cs:/home/daniel/Documents/MFA/jeff_vowelplot/monophone/fsts.1.1.ark 'ark,s,cs:add-deltas scp,s,cs:"/home/daniel/Documents/MFA/jeff_vowelplot/jeff_vowelplot/split3/feats.1.1.scp" ark:- |' ark:/home/daniel/Documents/MFA/jeff_vowelplot/monophone/ali.1.1.ark ark,t:-
LOG (gmm-boost-silence[5.5.1016]:main():gmmbin/gmm-boost-silence.cc:103) Wrote model to -
add-deltas scp,s,cs:/home/daniel/Documents/MFA/jeff_vowelplot/jeff_vowelplot/split3/feats.1.1.scp ark:-
LOG (gmm-align-compiled[5.5.1016]:main():gmmbin/gmm-align-compiled.cc:127) 1-1
WARNING (gmm-align-compiled[5.5.1016]:AlignUtteranceWrapper():decoder/decoder-wrappers.cc:617) Retrying utterance 1-1 with beam 40
WARNING (gmm-align-compiled[5.5.1016]:AlignUtteranceWrapper():decoder/decoder-wrappers.cc:626) Did not successfully decode file 1-1, len = 16785
LOG (gmm-align-compiled[5.5.1016]:main():gmmbin/gmm-align-compiled.cc:135) Overall log-likelihood per frame is -nan over 0 frames.
LOG (gmm-align-compiled[5.5.1016]:main():gmmbin/gmm-align-compiled.cc:137) Retried 1 out of 1 utterances.
LOG (gmm-align-compiled[5.5.1016]:main():gmmbin/gmm-align-compiled.cc:139) Done 0, errors on 1
(aligner) daniel@phon-0:/phon/ENG523/daniel$
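
For anyone triaging a similar align log, the telling lines are the "Did not successfully decode file" warning and the final "Done 0, errors on 1" summary. Below is a minimal Python sketch for pulling those out of a log. It assumes only the message formats visible in the log above; the function name summarize_align_log is hypothetical and not part of MFA or Kaldi.

```python
import re

# Patterns copied from the gmm-align-compiled messages in the log above.
FAILED = re.compile(r"Did not successfully decode file ([^\s,]+), len = (\d+)")
SUMMARY = re.compile(r"Done (\d+), errors on (\d+)")


def summarize_align_log(text):
    """Collect failed utterances and the last done/error counts from an align log.

    Hypothetical helper for illustration; not an MFA or Kaldi API.
    """
    failed = [(m.group(1), int(m.group(2))) for m in FAILED.finditer(text)]
    done = errors = None
    for m in SUMMARY.finditer(text):
        # Keep the last summary line if the log contains more than one.
        done, errors = int(m.group(1)), int(m.group(2))
    return {"failed": failed, "done": done, "errors": errors}


sample = """\
WARNING (gmm-align-compiled[5.5.1016]:AlignUtteranceWrapper():decoder/decoder-wrappers.cc:617) Retrying utterance 1-1 with beam 40
WARNING (gmm-align-compiled[5.5.1016]:AlignUtteranceWrapper():decoder/decoder-wrappers.cc:626) Did not successfully decode file 1-1, len = 16785
LOG (gmm-align-compiled[5.5.1016]:main():gmmbin/gmm-align-compiled.cc:139) Done 0, errors on 1
"""

result = summarize_align_log(sample)
print(result)
```

Here the single utterance 1-1 failed even after the retry with beam 40, so zero utterances were aligned. The reported len of 16785 would match the 167.874 second duration if the frame shift is 10 ms, which is an assumption since the log does not state it.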

Desktop (please complete the following information):

OS: Linux
Version: Ubuntu 20.04
Any other details about the setup: nothing special

Additional context

Note: we also had problems with the PostgreSQL db setup that we discussed in a different issue. After upgrading to 2.2.4, and after deleting ~/Documents/MFA, I was able to launch the server with mfa server init. I doubt that's relevant but I thought I would mention it. I'm not entirely sure why the auto-start/stop is not working.

jadestorm commented 1 year ago

Just FYI I am out tomorrow so I won't be able to reply to anything until Monday. =) But if there's more debugging you'd like me to do I can certainly do that.

jeffmielke commented 1 year ago

I can fill in these details:

Corpus structure
    What language is the corpus in? English
    How many files/speakers? 1/1
    Are you using lab files or TextGrid files for input? TextGrid
Dictionary
    Are you using a dictionary from MFA? If so, which one? no
    If it's a custom dictionary, what is the phoneset? arpabet, based on CMU dictionary
Acoustic model
    If you're using an acoustic model, is it one download through MFA? If so, which one? english_us_arpa
    If it's a model you've trained, what data was it trained on? n/a

mmcauliffe commented 1 year ago

Oh, deleting my previous comment, thought the issue was something else.

Right, so validate by default attempts to train a model on the corpus specified, so the acoustic model path is an optional argument specified via: mfa validate ... --acoustic_model_path english_us_arpa

jeffmielke commented 1 year ago

The file is 167 seconds long. I've confirmed that in 2.2.6 I can align a larger set of recordings but this individual file intermittently gives the same error. I have aligned this file with previous MFA versions. It's used in a tutorial for teaching our students how to use P2FA and MFA. With --beam 1000 it took 2039 seconds to validate but ultimately produced the same error on alignment. Since then it aligned successfully without a larger beam and I don't think anything was different from the previous times it gave the error.

It would be great to be able to align single recordings. I have a web interface for students to plot their own vowel spaces based on a reading passage, and I would like to switch the forced alignment part of it from P2FA to MFA. When I have previously aligned these short wav files with both aligners the MFA alignments have been a lot better than the P2FA ones.

thanks Jeff

Jeff Mielke Professor Linguistics program Department of English North Carolina State University

On Sat, Mar 18, 2023 at 8:56 PM Michael McAuliffe @.***> wrote:

How long is the file? You can try bumping the beam size higher to see if it aligns mfa align ... --beam 1000, but that's usually the solution for longer files as long as the transcripts are accurate.

If it's possible, doing a larger batch than a single file is usually much more accurate (as feature transforms like CMVN and speaker adaptation benefit immensely from them). It does seem like more people are using it for single file alignment, so I'll try to figure out a better solution for the one-off files soon.


mmcauliffe commented 1 year ago

Hi Jeff, I realized after posting that the issue in this particular case is not (necessarily) related to the length of the audio file, but rather to a change in the argument specification for the mfa validate command. When using argparse previously, I had acoustic_model_path as an optional argument, but the new CLI uses click (https://click.palletsprojects.com/en/8.1.x/), which takes the opinionated route that all arguments should be non-optional and that optional values should be explicitly flagged via options. So the new command would be something like mfa validate ... --acoustic_model_path english_us_arpa.

It's still possible that a 167-second file might not align, though it should with a beam of 1000. The error here, however, is caused by MFA attempting a test monophone training run, which crashes when no alignments are generated on the second iteration.
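
To make the argument change described above concrete, here is a rough sketch using only the standard-library argparse. These are hypothetical toy parsers, not MFA's actual CLI code, and the positional name corpus_directory is an assumption. The first parser mimics the old behaviour (an optional trailing positional), the second mimics the click convention (required positionals, optional values behind a flag).

```python
import argparse

# Hypothetical toy parsers for illustration only; NOT MFA's real CLI code.


def old_style_parser():
    """Old convention: the acoustic model is an optional trailing positional."""
    p = argparse.ArgumentParser(prog="validate")
    p.add_argument("corpus_directory")
    p.add_argument("dictionary_path")
    p.add_argument("acoustic_model_path", nargs="?", default=None)
    return p


def new_style_parser():
    """click-like convention: positionals are required, optional values are flagged."""
    p = argparse.ArgumentParser(prog="validate")
    p.add_argument("corpus_directory")
    p.add_argument("dictionary_path")
    p.add_argument("--acoustic_model_path", default=None)
    return p


old = old_style_parser().parse_args(["corpus", "english_us_arpa", "english_us_arpa"])
new = new_style_parser().parse_args(
    ["corpus", "english_us_arpa", "--acoustic_model_path", "english_us_arpa"]
)
unset = new_style_parser().parse_args(["corpus", "english_us_arpa"])

print(old.acoustic_model_path)    # model given positionally
print(new.acoustic_model_path)    # model given via the flag
print(unset.acoustic_model_path)  # no model given
```

In the last case no acoustic model is set, which corresponds to the situation where validate falls back to its test monophone training.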

jeffmielke commented 1 year ago

Hi Michael.

Thanks. I'm confused about this. mfa validate works for me when I add --acoustic_model_path. But it seems like the dictionary and the input directory don't need to be explicitly flagged, and when I add --dictionary_path it doesn't work. And when I try to align the same way I validated (explicitly flagging the acoustic model):

mfa align ../files/jeff_vowelplot english_us_arpa --acoustic_model_path english_us_arpa jeff_vowelplot_output/

I get this error message:

╭─ Error ─────────────────────────────────────────────────────────────────────╮
│ Invalid value for 'ACOUSTIC_MODEL_PATH': PretrainedModelNotFoundError:      │
│                                                                             │
│ Could not find a model named "--acoustic_model_path" for acoustic.          │
│ Available: english_us_arpa.                                                 │
╰─────────────────────────────────────────────────────────────────────────────╯

When I take --acoustic_model_path back out, it aligns fine. This is 2.2.6.

Jeff
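
The asymmetry reported above (in 2.2.6, validate takes the acoustic model behind a flag while align takes it as a positional argument) can be captured in a small Python sketch that builds the two command lines. The helper names validate_cmd and align_cmd are hypothetical; the flags and argument order are only those that appear in this thread.

```python
import shlex


def validate_cmd(corpus, dictionary, acoustic_model=None, clean=False):
    """Build an mfa validate command; the acoustic model goes behind a flag."""
    cmd = ["mfa", "validate", corpus, dictionary]
    if acoustic_model is not None:
        cmd += ["--acoustic_model_path", acoustic_model]
    if clean:
        cmd.append("--clean")
    return cmd


def align_cmd(corpus, dictionary, acoustic_model, output, beam=None):
    """Build an mfa align command; the acoustic model is a required positional."""
    cmd = ["mfa", "align", corpus, dictionary, acoustic_model, output]
    if beam is not None:
        cmd += ["--beam", str(beam)]
    return cmd


validate_line = shlex.join(
    validate_cmd("../files/jeff_vowelplot", "english_us_arpa", "english_us_arpa")
)
align_line = shlex.join(
    align_cmd(
        "../files/jeff_vowelplot",
        "english_us_arpa",
        "english_us_arpa",
        "jeff_vowelplot_output/",
    )
)
print(validate_line)
print(align_line)
```

The resulting lists can be handed to subprocess.run as-is; shlex.join is used here only to show the command as it would be typed in a shell.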


MontrealCorpusTools / Montreal-Forced-Aligner

[BUG] MFA 2.2.4 - KaldiProcessingError #587