MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License
1.3k stars, 243 forks

[BUG] MFA 2.2.7 compile-train-graphs got error when trained a new acoustic model #609

Closed sp1007 closed 1 year ago

sp1007 commented 1 year ago

I was training a new acoustic model with MFA 2.2.7

[command] mfa train --clean -t ./temp --output_directory ./info_aligned --phone_set UNKNOWN ./infore_16k_denoised lexicon.txt infore_mfa.zip

Describe the issue

After 40 steps of training, I got the error "There were 1 job(s) with errors when running Kaldi binaries." Reading the log at temp/infore_16k_denoised/monophone_ali/log/compile_train_graphs.1.log, I found that compile-train-graphs seems to have been given wrong values for the <tree-in> and <model-in> parameters.

Parameters in the error log file:
<tree-in>: temp/infore_16k_denoised/monophone_ali/tree
<model-in>: temp/infore_16k_denoised/monophone_ali/final.mdl

I then found these files in a different folder: temp/infore_16k_denoised/monophone_ali/acoustic_model_acoustic/acoustic_model/
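To check where the `tree` and `final.mdl` files actually ended up, a recursive search of the working directory can help. The following is a minimal Python sketch and not part of MFA: the function name `locate` is hypothetical, and the directory passed to it is the one from this report, so adjust it to your own temporary directory.

```python
from pathlib import Path


def locate(root, names=("tree", "final.mdl")):
    """Return {name: [paths]} for every file under root whose name matches."""
    found = {name: [] for name in names}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name in found:
            found[path.name].append(path)
    return found


if __name__ == "__main__":
    # Directory taken from the log in this report (an assumption for other setups).
    results = locate("temp/infore_16k_denoised/monophone_ali")
    for name, paths in results.items():
        print(name, [str(p) for p in paths] or "not found")
```

If a file shows up only in a nested folder and not in the directory that compile-train-graphs was given, that matches the mismatch described above.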

For Reproducing your issue

Log file:
/Users/sp/opt/miniconda3/envs/mfa/bin/compile-train-graphs --read-disambig-syms=temp/infore_16k_denoised/dictionary/phones/disambiguation_symbols.int temp/infore_16k_denoised/monophone_ali/tree temp/infore_16k_denoised/monophone_ali/final.mdl temp/infore_16k_denoised/dictionary/1_lexicon/L.fst ark,s,cs:temp/infore_16k_denoised/infore_16k_denoised/split3/text.1.1.int.scp ark:temp/infore_16k_denoised/monophone_ali/fsts.1.1.ark
ERROR (compile-train-graphs[5.5.1016]:Input():util/kaldi-io.cc:756) Error opening input stream temp/infore_16k_denoised/monophone_ali/tree
kaldi::KaldiFatalError

Additional context: compile_train_graphs.1.log

mmcauliffe commented 1 year ago

Hmm, I haven't seen this before. Can you check the other logs in the mono directory to see if some stage of the monophone training was causing issues? The tree file exists and it's running into an error loading it, but that should have happened when initializing the monophone training since it's just copying over the same tree file from mono to mono_ali.
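One way to check the other logs is to scan every log file in a directory for error lines. This is a minimal Python sketch, not an MFA utility: the function name `find_log_errors` is hypothetical, and the `monophone/log` path in the example is an assumption about where the monophone training logs live; only `monophone_ali/log` is confirmed by this thread.

```python
from pathlib import Path


def find_log_errors(log_dir, marker="ERROR"):
    """Yield (log file, line number, line) for each line containing marker."""
    for log in sorted(Path(log_dir).glob("*.log")):
        with open(log, encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if marker in line:
                    yield log, number, line.rstrip()


if __name__ == "__main__":
    # Assumed location of the monophone training logs; adjust as needed.
    for log, number, line in find_log_errors("temp/infore_16k_denoised/monophone/log"):
        print(f"{log}:{number}: {line}")
```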

sp1007 commented 1 year ago

Here is the full trace from the console:

` ERROR There was an error in the run, please see the log.
ERROR There was no database found at temp/pg_mfa_global.
Exception ignored in atexit callback: <function stop_server at 0x15af33eb0>
Traceback (most recent call last):
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/command_line/utils.py", line 456, in stop_server
    sys.exit(1)
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/command_line/mfa.py", line 69, in exit
    self._orig_exit(code)
SystemExit: 1
Exception ignored in atexit callback: <bound method ExitHooks.history_save_handler of <montreal_forced_aligner.command_line.mfa.ExitHooks object at 0x10794cc10>>
Traceback (most recent call last):
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/command_line/mfa.py", line 102, in history_save_handler
    raise self.exception
  File "/Users/sp/opt/miniconda3/envs/mfa/bin/mfa", line 8, in <module>
    sys.exit(mfa_cli())
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/rich_click/rich_group.py", line 21, in main
    rv = super().main(*args, standalone_mode=False, **kwargs)
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/command_line/train_acoustic_model.py", line 111, in train_acoustic_model_cli
    trainer.train()
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/acoustic_modeling/trainer.py", line 523, in train
    self.align()
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/acoustic_modeling/trainer.py", line 761, in align
    self.compile_train_graphs()
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/alignment/mixins.py", line 340, in compile_train_graphs
    raise v
montreal_forced_aligner.exceptions.MultiprocessingError: MultiprocessingError:

Job 1 encountered an error:
Traceback (most recent call last):
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/abc.py", line 89, in run
    yield from self._run()
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/alignment/multiprocessing.py", line 729, in _run
    self.check_call(proc)
  File "/Users/sp/opt/miniconda3/envs/mfa/lib/python3.10/site-packages/montreal_forced_aligner/abc.py", line 116, in check_call
    raise KaldiProcessingError([self.log_path])
montreal_forced_aligner.exceptions.KaldiProcessingError: KaldiProcessingError:

There were 1 job(s) with errors when running Kaldi binaries. See the log files below for more information.
temp/info/monophone_ali/log/compile_train_graphs.1.log `

sp1007 commented 1 year ago

Hi @mmcauliffe, I found that if I remove the option "-t ./temp", everything works: `mfa train --clean --output_directory ./info_aligned --phone_set UNKNOWN ./infore_16k_denoised lexicon.txt infore_mfa.zip`

P.S. I get this error on Windows 11 too, but there, removing the option "-t ./temp" does not fix it.

walker-hyf commented 1 year ago

> Hi @mmcauliffe, I found that if I remove the option "-t ./temp", everything works: mfa train --clean --output_directory ./info_aligned --phone_set UNKNOWN ./infore_16k_denoised lexicon.txt infore_mfa.zip P.S. I get this error on Windows 11 too, but there, removing the option "-t ./temp" does not fix it.

How did you solve this problem in the end? Looking forward to your reply😊

sp1007 commented 1 year ago

In the end, I used MFA version 2.0.6, and it worked fine 😄

walker-hyf commented 1 year ago

Thank you very much 😁

starmoon-1134 commented 1 year ago

This may be a relative-path problem. I'm using MFA version 2.2.14 on CentOS 7, and the error no longer occurs when I use an absolute path:

--temporary_directory /home/deploy/dataset/cv-corpus-2.0-zh-CN/temp/mfa_temp
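The absolute-path workaround can be sketched in Python by resolving the temporary directory before building the command. This is only an illustration under assumptions: the helper `build_train_command` is hypothetical, it uses only the flags that appear in this thread, and whether an absolute path avoids the error may depend on the MFA version and platform.

```python
import shlex
from pathlib import Path


def build_train_command(corpus, dictionary, output_model, temp_dir):
    """Build an `mfa train` argument list with an absolute temporary directory."""
    temp_abs = Path(temp_dir).resolve()  # "./temp" becomes "/current/dir/temp"
    return [
        "mfa", "train", "--clean",
        "--temporary_directory", str(temp_abs),
        str(corpus), str(dictionary), str(output_model),
    ]


if __name__ == "__main__":
    # Corpus, lexicon, and model names are the ones from the original report.
    cmd = build_train_command(
        "./infore_16k_denoised", "lexicon.txt", "infore_mfa.zip", "./temp"
    )
    print(shlex.join(cmd))  # print the command for review instead of running it
```

The sketch prints the command rather than executing it, so the resolved path can be checked first.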