MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

[BUG] error encountered starting the global MFA database server #640

Closed emilyahn closed 1 year ago

emilyahn commented 1 year ago

Debugging checklist

[yes; 2.11] Have you updated to latest MFA version?
[yes] Have you tried rerunning the command with the --clean flag?

Describe the issue

Upon training an acoustic model (on 38 min of English TIMIT) from scratch, there is an error about the database server. This issue also occurs if I try to run mfa validate.

For Reproducing your issue

Please fill out the following:

  1. Corpus structure
    • What language is the corpus in? English
    • How many files/speakers? 770 files, 77 speakers (10 files each). It's 1 of the TIMIT TRAIN sub-folders, which contains a sub-folder per speaker.
    • Are you using lab files or TextGrid files for input? lab files
  2. Dictionary
    • Are you using a dictionary from MFA? If so, which one? No.
    • If it's a custom dictionary, what is the phoneset? TIMIT phoneset. See note below.
  3. Acoustic model
    • If you're using an acoustic model, is it one downloaded through MFA? If so, which one? No.
    • If it's a model you've trained, what data was it trained on? see answer to 1 above.

Log file

pg_log_global.txt

Terminal output ends like this:

INFO Completed training in 1824.9871790409088 seconds!
INFO Saved model to /Users/eahn/work/force_align/data/out/models/ldc_pilot.zip
INFO Done! Everything took 1856.602 seconds
INFO Stopping the global MFA database server...
ERROR pg_ctl stdout: waiting for server to shut down............................................................... failed

ERROR pg_ctl stderr: pg_ctl: server does not shut down
HINT: The "-m fast" option immediately disconnects sessions rather than
waiting for session-initiated disconnection.

Exception ignored in atexit callback: <function stop_server at 0x10b07f6a0>
Traceback (most recent call last):
  File "/Users/eahn/opt/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/command_line/utils.py", line 471, in stop_server
    raise DatabaseError(
montreal_forced_aligner.exceptions.DatabaseError: DatabaseError:

There was an error encountered starting the global MFA database server, please see /Users/eahn/Documents/MFA/pg_log_global.txt for more details and/or look at the logged errors above.

Desktop (please complete the following information):

Additional context

I'm also doing a thing where each word in the lexicon and text input is a phone, since TIMIT has manually aligned phone labels. I don't know if this is throwing things off. For example, here is a .lab file:

sil p iy t s er r iy ih z aa r k ax n v iy n ih t f er k w ih k l ax n ch sil

And this is a snippet of the short lexicon file:

aa aa
ae ae
ao ao
aw aw
ax ax
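For anyone reproducing this setup, a phone-as-word lexicon like the one above can be generated from the .lab transcripts. The following is a minimal Python sketch, not part of MFA: the function names (collect_phones, identity_lexicon, build_lexicon) are hypothetical, and it assumes every whitespace-separated token in a .lab file, including sil, should get its own entry that maps the token to itself.

```python
from pathlib import Path


def collect_phones(lab_texts):
    """Return the sorted set of whitespace-separated tokens across transcripts."""
    phones = set()
    for text in lab_texts:
        phones.update(text.split())
    return sorted(phones)


def identity_lexicon(phones):
    """Format one 'word pronunciation' line per phone, mapping each phone to itself."""
    return "\n".join(f"{p} {p}" for p in phones) + "\n"


def build_lexicon(corpus_dir, lexicon_path):
    """Scan corpus_dir recursively for .lab files and write the lexicon file.

    Assumption: transcripts are UTF-8 text and live in per-speaker sub-folders.
    """
    texts = [p.read_text(encoding="utf-8") for p in Path(corpus_dir).rglob("*.lab")]
    phones = collect_phones(texts)
    Path(lexicon_path).write_text(identity_lexicon(phones), encoding="utf-8")
    return phones
```

Whether silence tokens such as sil belong in the lexicon depends on how the aligner should treat them, so filter the phone list before writing if they should be excluded.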

Moon-sung-woo commented 1 year ago

same error

mmcauliffe commented 1 year ago

So everything should have worked and exported successfully. The DatabaseError at the end occurs because maintenance workers were still running when MFA shut down the PostgreSQL server. I thought I had swallowed that error for successful runs in 2.2.11, but I guess not, so I'll take another look. Again, the training looks like it completed fine, and ldc_pilot.zip should be exported correctly.
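To illustrate what swallowing a shutdown error in an exit handler could look like, here is a minimal sketch. It is not MFA's actual code: DatabaseError here is a local stand-in class, and safe_stop and register_safe_stop are hypothetical helpers that wrap whatever function really stops the server.

```python
import atexit
import logging

logger = logging.getLogger("shutdown_sketch")


class DatabaseError(Exception):
    """Local stand-in for an application-specific shutdown error (hypothetical)."""


def safe_stop(stop):
    """Call stop(); log a warning instead of raising if shutdown fails.

    Returns True when stop() finished cleanly and False when a
    DatabaseError was caught and suppressed.
    """
    try:
        stop()
    except DatabaseError as exc:
        logger.warning("Ignoring database shutdown error: %s", exc)
        return False
    return True


def register_safe_stop(stop):
    """Register safe_stop(stop) to run at interpreter exit."""
    atexit.register(safe_stop, stop)
```

Catching only the specific exception type keeps unrelated failures visible, while a run that already finished its real work exits without a traceback.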

emilyahn commented 1 year ago

Hi Michael, thanks so much for your input! I see that I am able to get the model file ldc_pilot.zip out correctly, and mfa align ... works perfectly. In the meantime, I will ignore the DatabaseErrors.

I'd like to point out, though, that the information on the train command on this page is misleading. When I followed this option:

mfa train ~/mfa_data/my_corpus ~/mfa_data/my_dictionary.txt ~/mfa_data/new_acoustic_model.zip ~/mfa_data/my_corpus_aligned # Export both trained model and alignments

the alignments did not get deposited into the output folder. It only worked when I used this command:

mfa train --output_directory <output_directory> CORPUS_DIRECTORY DICTIONARY_PATH OUTPUT_MODEL_PATH
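For scripted runs, the form of the command reported to work above can be assembled in Python. This is a minimal sketch under stated assumptions: train_command is a hypothetical helper, it uses only the --output_directory flag and argument order quoted in this thread, and it has not been checked against other MFA versions.

```python
def train_command(corpus_dir, dictionary_path, output_model_path, output_directory=None):
    """Build the argument list for `mfa train`.

    The --output_directory flag is placed before the positional arguments,
    matching the invocation quoted in this thread. It is omitted when
    output_directory is None.
    """
    cmd = ["mfa", "train"]
    if output_directory is not None:
        cmd += ["--output_directory", str(output_directory)]
    cmd += [str(corpus_dir), str(dictionary_path), str(output_model_path)]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True). Since no shell is involved in that case, a leading ~ in a path is not expanded, so expand it first with os.path.expanduser.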

Lastly, is it correct that running the mfa align command produces an alignment_analysis.csv in the output folder, while mfa train --output_directory <output_directory> ... does not produce an alignment analysis file?

mmcauliffe commented 1 year ago

Oh yes, the syntax for exporting the alignments changed with the updated CLI a while ago. I'll fix up those docs and add the alignment analysis export for train.

zqs01 commented 1 year ago

same error