MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

[BUG] mfa align with --fine_tune #693

Closed styx0r closed 10 months ago

styx0r commented 11 months ago

Thanks for this awesome project. I encountered an error when trying to use align in combination with --fine_tune. I hope you can provide a fix. Thank you.

Debugging checklist

[x] Have you updated to latest MFA version?
[x] Have you tried rerunning the command with the --clean flag?

Describe the issue

A std::bad_alloc error occurs when I run mfa align with the --fine_tune option.

For reproducing your issue

Please fill out the following:

  1. Corpus structure
    • What language is the corpus in? german
    • How many files/speakers? 2
    • Are you using lab files or TextGrid files for input? TextGrid
  2. Dictionary
    • Are you using a dictionary from MFA? If so, which one? german_mfa
    • If it's a custom dictionary, what is the phoneset? /
  3. Acoustic model
    • If you're using an acoustic model, is it one downloaded through MFA? If so, which one? german_mfa
    • If it's a model you've trained, what data was it trained on? /

Log file

INFO Setting up corpus information...
INFO Loading corpus from source files...
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 164/100 [ 0:00:00 < 0:00:00 , ? it/s ]
INFO Found 2 speakers across 316 files, average number of utterances per speaker: 158.0
INFO Initializing multiprocessing jobs...
WARNING Number of jobs was specified as 3, but due to only having 2 speakers, MFA will only use 2 jobs. Use the --single_speaker flag if you would like to split utterances across jobs regardless of their speaker.
INFO Normalizing text...
0% 1/316 [ 0:00:02 < -:--:-- , ? it/s ]
INFO Generating MFCCs...
99% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 312/316 [ 0:00:03 < 0:00:01 , 276 it/s ]
INFO Calculating CMVN...
INFO Generating final features...
0% 0/316 [ 0:00:01 < -:--:-- , ? it/s ]
INFO Creating corpus split...
0% 0/316 [ 0:00:01 < -:--:-- , ? it/s ]
INFO Compiling training graphs...
50% ━━━━━━━━━━━━━━━━━ 158/316 [ 0:00:01 < -:--:-- , ? it/s ]
INFO Performing first-pass alignment...
INFO Generating alignments...
97% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╸ 306/316 [ 0:00:01 < 0:00:01 , 999 it/s ]
INFO Calculating fMLLR for speaker adaptation...
50% ━━━━━━━━━━━━━━━━━━━ 1/2 [ 0:00:01 < -:--:-- , ? it/s ]
INFO Performing second-pass alignment...
INFO Generating alignments...
98% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 310/316 [ 0:00:01 < 0:00:01 , 1,021 it/s ]
INFO Collecting phone and word alignments from alignment lattices...
0% 1/316 [ 0:00:03 < -:--:-- , ? it/s ]
INFO Fine tuning alignments...
0% 0/316 [ 0:00:01 < -:--:-- , ? it/s ]
ERROR There was an error in the run, please see the log.
Exception ignored in atexit callback: <bound method ExitHooks.history_save_handler of <montreal_forced_aligner.command_line.mfa.ExitHooks object at 0x7fffffa93490>>
Traceback (most recent call last):
  File "/env/lib/python3.11/site-packages/montreal_forced_aligner/command_line/mfa.py", line 99, in history_save_handler
    raise self.exception
  File "/env/bin/mfa", line 8, in <module>
    sys.exit(mfa_cli())
             ^^^^^^^^^
  File "/env/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/env/lib/python3.11/site-packages/rich_click/rich_group.py", line 21, in main
    rv = super().main(*args, standalone_mode=False, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/env/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/env/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/env/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/env/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/env/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/env/lib/python3.11/site-packages/montreal_forced_aligner/command_line/align.py", line 111, in align_corpus_cli
    aligner.align()
  File "/env/lib/python3.11/site-packages/montreal_forced_aligner/alignment/pretrained.py", line 451, in align
    super().align()
  File "/env/lib/python3.11/site-packages/montreal_forced_aligner/alignment/base.py", line 379, in align
    self.fine_tune_alignments()
  File "/env/lib/python3.11/site-packages/montreal_forced_aligner/alignment/base.py", line 999, in fine_tune_alignments
    for result in run_kaldi_function(FineTuneFunction, arguments, pbar.update):
  File "/env/lib/python3.11/site-packages/montreal_forced_aligner/utils.py", line 648, in run_kaldi_function
    raise v
  File "/env/lib/python3.11/site-packages/montreal_forced_aligner/utils.py", line 538, in run
    self.function.run()
  File "/env/lib/python3.11/site-packages/montreal_forced_aligner/abc.py", line 86, in run
    raise MultiprocessingError(self.job_name, error_text)
montreal_forced_aligner.exceptions.MultiprocessingError: MultiprocessingError:

Job 2 encountered an error:
Traceback (most recent call last):
  File "/env/lib/python3.11/site-packages/montreal_forced_aligner/abc.py", line 82, in run
    self._run()
  File "/env/lib/python3.11/site-packages/montreal_forced_aligner/alignment/multiprocessing.py", line 886, in _run
    feats = FloatMatrix(sub_matrix)
            ^^^^^^^^^^^^^^^^^^^^^^^
MemoryError: std::bad_alloc

Desktop (please complete the following information):

tjmahr commented 11 months ago

I get the same error when using --fine_tune. My setup is MFA 3.0.0a5 on Windows 11, and the command I run is: mfa align [speaker dirs] [dict file] english_us_arpa [output dir] --fine_tune --clean --include_original_text --beam 1000
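
For anyone trying to reproduce this, the invocation above could be scripted roughly as follows. This is only a sketch: the function name and the corpus, dictionary, and output paths are hypothetical placeholders standing in for the bracketed arguments, and the flags are exactly the ones quoted in this comment.

```python
import shlex


def build_align_command(corpus_dir, dictionary, acoustic_model, output_dir):
    """Assemble the `mfa align` argument list quoted in this thread.

    All four positional values are supplied by the caller; the defaults used
    in the demo below are placeholders, not real paths.
    """
    return [
        "mfa", "align",
        corpus_dir, dictionary, acoustic_model, output_dir,
        "--fine_tune", "--clean", "--include_original_text",
        "--beam", "1000",
    ]


# Placeholder paths (assumptions); english_us_arpa is the model named above.
cmd = build_align_command("corpus", "dict.txt", "english_us_arpa", "aligned")
print(shlex.join(cmd))

# To actually run it, one could pass the list to subprocess:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when the paths contain spaces.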

starmoon-1134 commented 10 months ago

I trained an acoustic model using version 2.2.12 of Montreal Forced Aligner (MFA). Then I tried to use version 3.0.0a5 of MFA to do fine-tuning alignment, but got an error "MemoryError: std::bad_alloc". After switching to use version 2.2.14 of MFA instead, the fine-tuning alignment was able to run successfully without errors.
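
To see which side of these reports a given environment falls on, something like the following could be used. It is a sketch under assumptions: the two version sets only record what was reported in this thread (3.0.0a5 failing, 2.2.14 working), the helper names are made up for illustration, and the distribution name `montreal-forced-aligner` is assumed to match how the package was installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions mentioned in this thread only; not an exhaustive compatibility list.
REPORTED_FAILING = {"3.0.0a5"}
REPORTED_WORKING = {"2.2.14"}


def fine_tune_report(mfa_version):
    """Classify a version string by what this thread reported for --fine_tune."""
    if mfa_version in REPORTED_FAILING:
        return "reported failing"
    if mfa_version in REPORTED_WORKING:
        return "reported working"
    return "no report in this thread"


def installed_mfa_version():
    """Return the installed MFA version, or None if it is not installed."""
    try:
        # Distribution name is an assumption; adjust if installed under another name.
        return version("montreal-forced-aligner")
    except PackageNotFoundError:
        return None


installed = installed_mfa_version()
if installed is None:
    print("MFA does not appear to be installed in this environment")
else:
    print(installed, "->", fine_tune_report(installed))
```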