MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

[BUG] mfa validate succeeds but mfa align gives ZeroDivisionError #707

Closed jnc-nj closed 5 months ago

jnc-nj commented 10 months ago

Debugging checklist

[x] Have you updated to latest MFA version?
[x] Have you tried rerunning the command with the --clean flag?

Describe the issue

mfa validate succeeds, but mfa align fails with a ZeroDivisionError:

File "/workspace/miniconda/envs/mfa/lib/python3.8/site-packages/montreal_forced_aligner/multiprocessing/alignment.py", line 634, in compile_information
    log_like = avg_like_sum / avg_like_frames
ZeroDivisionError: division by zero

in addition to a WARNING - No files were aligned, this likely indicates serious problems with the aligner.
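
The traceback shows the average log-likelihood being computed as a plain division, which fails when no frames were aligned. A minimal sketch of a guarded version follows. The function name `average_log_likelihood` is hypothetical and this is not MFA's actual code; it only illustrates why a zero frame count has to be handled before dividing.

```python
from typing import Optional


def average_log_likelihood(like_sum: float, like_frames: int) -> Optional[float]:
    """Return the per-frame average log-likelihood.

    Hypothetical helper, not part of MFA. Returns None when no frames
    were aligned, instead of raising ZeroDivisionError.
    """
    if like_frames == 0:
        # Nothing was aligned, so an average is undefined.
        return None
    return like_sum / like_frames


if __name__ == "__main__":
    print(average_log_likelihood(-50.0, 10))  # -5.0
    print(average_log_likelihood(0.0, 0))     # None
```

A guard like this would only turn the crash into a clearer failure; the underlying problem (no files aligned) would still need to be diagnosed.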

For reproducing your issue, please fill out the following:

  1. Corpus structure
    • What language is the corpus in? Chinese
    • How many files/speakers? 1 speaker, 276 files
    • Are you using lab files or TextGrid files for input? lab files
  2. Dictionary
    • Are you using a dictionary from MFA? If so, which one? no
    • If it's a custom dictionary, what is the phoneset? the dictionary file is as follows: opencpop-extension.txt
  3. Acoustic model

    • If you're using an acoustic model, is it one download through MFA? If so, which one? no
    • If it's a model you've trained, what data was it trained on? the opencpop dataset, a mandarin singing voice corpus

    Log file

    Please attach the log file for the run that encountered an error (by default these will be stored in ~/Documents/MFA).

    no file was generated in ~/Documents/MFA, I assume this is because the run ended before getting to the logging step? the cmd output is as follows:
      
      (mfa) root@aria2:/workspace/MakeDiffSinger/acoustic_forced_alignment# mfa validate assets dictionaries/opencpop-extension.txt data/mfa-opencpop-extension_r.zip 
      oov_phone spn
      optional_silence_phone sil
      silence_probability 0.31889686191066413
      multilingual_ipa False
      INFO - Parsing dictionary "opencpop-extension" without pronunciation probabilities without silence
              probabilities
      INFO - Setting up corpus information...
      INFO - Creating dictionary information...
      INFO - Setting up training data...
      INFO - Generating base features (mfcc)...
      INFO - Calculating CMVN...
      INFO - Setting up training data...
      INFO - Setting up training data...
      INFO -      =========================================Corpus=========================================     276
              sound files     276 sound files with .lab transcription files     0 sound files with TextGrids
              transcription files     0 additional sound files ignored (see below)     1 speakers     276 utterances
              2234.889 seconds total duration      DICTIONARY     ----------     There were no missing words from the
              dictionary. If you plan on using the a model trained on this dataset to align other datasets in the
              future, it is recommended that there be at least some missing words.      SOUND FILE READ ERRORS
              ----------------------     There were no sound files that could not be read.      FEATURE CALCULATION
              -------------------     There were no utterances missing features.      FILES WITHOUT TRANSCRIPTIONS
              ----------------------------     There were no sound files missing transcriptions.      TRANSCRIPTIONS
              WITHOUT FILES     --------------------     There were 1 transcription files missing sound files. Please
              see /root/Documents/MFA/assets/corpus_data/transcriptions_missing_sound_files.csv for a list.
              TEXTGRID READ ERRORS     --------------------     There were no issues reading TextGrids.
              UNREADABLE TEXT FILES     --------------------     There were no issues reading text files.
      INFO - Initializing training for mono...
      INFO - Initialization complete!
      100%|███████████████████████████████████████████████████████████| 40/40 [01:29<00:00,  2.23s/it]
      INFO - Training complete!
      INFO - Generating alignments using mono models for the whole corpus...

      =======================================Alignment========================================
      All 276 utterances were successfully aligned!

      INFO - All done!
      (mfa) root@aria2:/workspace/MakeDiffSinger/acoustic_forced_alignment# cat /root/Documents/MFA/assets/corpus_data/transcriptions_missing_sound_files.csv
      None
      (mfa) root@aria2:/workspace/MakeDiffSinger/acoustic_forced_alignment# mfa align assets dictionaries/opencpop-extension.txt data/mfa-opencpop-extension_r.zip textgrids --beam 100 --clean --overwrite
      Cleaning old directory!
      oov_phone spn
      optional_silence_phone sil
      silence_probability 0.31889686191066413
      multilingual_ipa False
      INFO - Setting up corpus information...
      INFO - Number of speakers in corpus: 1, average number of utterances per speaker: 276.0
      INFO - Parsing dictionary "opencpop-extension" without pronunciation probabilities without silence probabilities
      INFO - Creating dictionary information...
      INFO - Setting up training data...
      INFO - Generating base features (mfcc)...
      INFO - Calculating CMVN...
      INFO - Setting up training data...
      INFO - Setting up training data...
      INFO - Done with setup!
      INFO - Performing first-pass alignment...
      WARNING - No files were aligned, this likely indicates serious problems with the aligner.
      Traceback (most recent call last):
        File "/workspace/miniconda/envs/mfa/bin/mfa", line 11, in <module>
          sys.exit(main())
        File "/workspace/miniconda/envs/mfa/lib/python3.8/site-packages/montreal_forced_aligner/command_line/mfa.py", line 793, in main
          run_align_corpus(args, unknown)
        File "/workspace/miniconda/envs/mfa/lib/python3.8/site-packages/montreal_forced_aligner/command_line/align.py", line 236, in run_align_corpus
          align_corpus(args, unknown_args)
        File "/workspace/miniconda/envs/mfa/lib/python3.8/site-packages/montreal_forced_aligner/command_line/align.py", line 171, in align_corpus
          a.align()
        File "/workspace/miniconda/envs/mfa/lib/python3.8/site-packages/montreal_forced_aligner/aligner/base.py", line 253, in align
          _, average_log_like = compile_information(self)
        File "/workspace/miniconda/envs/mfa/lib/python3.8/site-packages/montreal_forced_aligner/multiprocessing/alignment.py", line 634, in compile_information
          log_like = avg_like_sum / avg_like_frames
      ZeroDivisionError: division by zero
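
The validation report mentions one transcription file without a matching sound file, while the referenced CSV only contains `None`. A rough way to check this independently is to list `.lab` files that have no audio file with the same stem. This is a sketch under assumptions: the function name and the audio extension list are hypothetical, and MFA's own file-matching logic may differ.

```python
from pathlib import Path


def unmatched_transcriptions(corpus_dir, audio_exts=(".wav", ".flac")):
    """List .lab files with no same-named audio file next to them.

    Hypothetical helper, not part of MFA; audio_exts is an assumed
    set of extensions and may need adjusting for a given corpus.
    """
    corpus = Path(corpus_dir)
    # Collect extension-less relative paths of every audio file.
    audio_stems = {
        p.relative_to(corpus).with_suffix("")
        for p in corpus.rglob("*")
        if p.is_file() and p.suffix.lower() in audio_exts
    }
    # Keep transcriptions whose extension-less path has no audio match.
    return sorted(
        p
        for p in corpus.rglob("*.lab")
        if p.relative_to(corpus).with_suffix("") not in audio_stems
    )


if __name__ == "__main__":
    # "assets" is the corpus directory used in the commands above.
    for lab in unmatched_transcriptions("assets"):
        print(lab)
```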



**Desktop (please complete the following information):**
 - OS: Ubuntu
 - Version: 22.04.2 LTS
 - Any other details about the setup (Cloud, Docker, etc): Dockerized container with miniconda setup on local server

mmcauliffe commented 9 months ago

What version of MFA is this (what's the output of mfa version)? It looks like an early-ish version of 2.0, I'd recommend updating to one of the stable versions here: https://montreal-forced-aligner.readthedocs.io/en/latest/installation.html#installing-older-versions-of-mfa.