MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

PronunciationAcousticMismatchError when validating LibriSpeech Example (v2.0.0a15) #280

Closed khiajohnson closed 3 years ago

khiajohnson commented 3 years ago

I ran into this issue with my own data set, and then reproduced it with the LibriSpeech example from the documentation. I'm using MFA v2.0.0a15, installed per the instructions with a conda environment on macOS Big Sur 11.2.3 (2020 M1 MacBook Air, if that's relevant). Here's what I ran:

mfa validate ~/Downloads/Librispeech ~/Downloads/librispeech-lexicon.txt english ~/Downloads/aligned_librispeech --ignore_acoustics

And here's the output:

/Users/khia/Documents/MFA/Librispeech/validate.log
INFO - Setting up corpus information...
INFO - Parsing dictionary without pronunciation probabilities without silence probabilities
dictionary phones: {'AA0', 'DH', 'IY2', 'UH2', 'IH0', 'EY1', 'S', 'IY1', 'AA1', 'AW2', 'Z', 'IH1', 'NG', 'Y', 'AO1', 'OY2', 'EH1', 'M', 'AA2', 'CH', 'EY0', 'F', 'EH2', 'R', 'AH0', 'AY2', 'IY0', 'K', 'N', 'W', 'OW0', 'AO0', 'V', 'P', 'OY0', 'OY1', 'UW2', 'AW0', 'L', 'UH1', 'UW0', 'AE0', 'AY1', 'HH', 'ER0', 'AH2', 'AE2', 'ER2', 'AO2', 'EH0', 'D', 'ER1', 'OW1', 'UH0', 'T', 'UW1', 'ZH', 'SH', 'EY2', 'AY0', 'B', 'AW1', 'AE1', 'OW2', 'TH', 'IH2', 'G', 'AH1', 'JH'}
model phones: set()
montreal_forced_aligner.exceptions.PronunciationAcousticMismatchError: There were phones in the dictionary that do not have acoustic models: AA0, AA1, AA2, AE0, AE1, AE2, AH0, AH1, AH2, AO0, AO1, AO2, AW0, AW1, AW2, AY0, AY1, AY2, B, CH, D, DH, EH0, EH1, EH2, ER0, ER1, ER2, EY0, EY1, EY2, F, G, HH, IH0, IH1, IH2, IY0, IY1, IY2, JH, K, L, M, N, NG, OW0, OW1, OW2, OY0, OY1, OY2, P, R, S, SH, T, TH, UH0, UH1, UH2, UW0, UW1, UW2, V, W, Y, Z, ZH

Apart from the quotes, these sets of phones are identical. Any idea what's going on here? Thanks!! 🙏

khiajohnson commented 3 years ago

For what it's worth, I also tried this with the stable 1.0.1 release on my machine, and it ran with no errors.

bin/mfa_align ~/Downloads/Librispeech ~/Downloads/librispeech-lexicon.txt pretrained_models/english.zip ~/Downloads/aligned_librispeech

mmcauliffe commented 3 years ago

I'll check it out! That's weird that the model phones set is empty...
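To illustrate why an empty model phone set makes the error list every dictionary phone, here is a minimal sketch of a set-difference check. It is not MFA's actual code: the function name `missing_phones` is hypothetical, and the phone set is a small subset of the one in the log above.

```python
def missing_phones(dictionary_phones, model_phones):
    """Return the dictionary phones that have no counterpart in the model."""
    # Hypothetical helper, not part of MFA: a plain set difference.
    return sorted(set(dictionary_phones) - set(model_phones))


# A small subset of the dictionary phones printed in the log.
dictionary_phones = {"AA0", "DH", "IY2", "S"}
# What the log reports for the model: an empty set.
model_phones = set()

missing = missing_phones(dictionary_phones, model_phones)
print(missing)
```

With an empty model set, the difference is the whole dictionary set, so the message is consistent with the model's phone list never being loaded rather than with a real mismatch between the dictionary and the model.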

mmcauliffe commented 3 years ago

Hmm, so I can't replicate it on any of my Windows, Linux, or Mac setups. One thought: can you try rerunning mfa download acoustic english, since the english model seems to be having some issues?

khiajohnson commented 3 years ago

Still getting the same error... weird! I can stick with using 1.0.1 for now. But lmk if you ever want me to test run things on my Mac for you!

mmcauliffe commented 3 years ago

Ok so a couple of follow up questions:

  1. Are you trying to align or validate?
  2. If you're just trying to validate with the dictionary (since you have the --ignore_acoustics flag there), you should be able to run the command without the acoustic model that's causing problems: mfa validate ~/Downloads/Librispeech ~/Downloads/librispeech-lexicon.txt --ignore_acoustics
  3. If you are trying to align and running into the same error, that's a little bit weirder. Could you attach your english.zip from ~/Documents/MFA/pretrained_models/acoustic/english.zip?

khiajohnson commented 3 years ago

  1. I was attempting to validate, but ultimately want to align, so both are relevant.
  2. Validating with just the dictionary (no acoustic model) worked perfectly.
  3. Attempting to run mfa align ~/Downloads/Librispeech ~/Downloads/librispeech-lexicon.txt ~/Documents/MFA/pretrained_models/acoustic/english.zip ~/Downloads/aligned_librispeech generates the same problem as before:
    All required kaldi binaries were found!
    /Users/khia/Documents/MFA/Librispeech/align.log
    INFO - Setting up corpus information...
    INFO - Number of speakers in corpus: 3, average number of utterances per speaker: 122.33333333333333
    INFO - Parsing dictionary without pronunciation probabilities without silence probabilities
    dictionary phones: {'V', 'OY0', 'EY2', 'UW1', 'TH', 'AE2', 'AH0', 'AO1', 'AH2', 'AW2', 'N', 'Z', 'D', 'CH', 'T', 'JH', 'K', 'OY1', 'IH0', 'AW1', 'AH1', 'EH2', 'AA0', 'UH1', 'EH1', 'EY1', 'AY1', 'ZH', 'AA1', 'UW2', 'R', 'L', 'DH', 'UH0', 'B', 'M', 'IY0', 'NG', 'AY2', 'OW2', 'UH2', 'EH0', 'F', 'AA2', 'IY1', 'OW0', 'ER2', 'IH1', 'AY0', 'AE0', 'AE1', 'AO0', 'ER1', 'P', 'IH2', 'AO2', 'HH', 'S', 'EY0', 'SH', 'UW0', 'W', 'Y', 'OW1', 'ER0', 'IY2', 'G'}
    model phones: set()
    montreal_forced_aligner.exceptions.PronunciationAcousticMismatchError: There were phones in the dictionary that do not have acoustic models: AA0, AA1, AA2, AE0, AE1, AE2, AH0, AH1, AH2, AO0, AO1, AO2, AW1, AW2, AY0, AY1, AY2, B, CH, D, DH, EH0, EH1, EH2, ER0, ER1, ER2, EY0, EY1, EY2, F, G, HH, IH0, IH1, IH2, IY0, IY1, IY2, JH, K, L, M, N, NG, OW0, OW1, OW2, OY0, OY1, P, R, S, SH, T, TH, UH0, UH1, UH2, UW0, UW1, UW2, V, W, Y, Z, ZH

Here's the english.zip

mmcauliffe commented 3 years ago

Can you try maybe deleting ~/Documents/MFA and try running the align command for 2.0.0a15 again, if you don't mind?

khiajohnson commented 3 years ago

Ok I tried that! If I'm in the MFA directory, it gets the same error. If not, then it gives a "model path does not exist" error.

mmcauliffe commented 3 years ago

Ok, I released 2.0.0a16, which has some additional logging that might help me track this down. If you could upgrade, rerun the command, and then send me the align.log, that should reveal what's going wrong with the metadata in the model.

khiajohnson commented 3 years ago

Ok, here's the log from this run: mfa align ~/Downloads/Librispeech ~/Downloads/librispeech-lexicon.txt ~/Downloads/english.zip ~/Downloads/aligned_librispeech

align.log

mmcauliffe commented 3 years ago

Ok, so I changed the locations for where models get unpacked to be specific to the run, so it shouldn't be getting any interference from the directory that doesn't have the yaml somehow. Could you update to 2.0.0a17 and try it again if you get a chance?
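For anyone who wants to check a model archive by hand, here is a rough sketch of the two ideas mentioned above: looking for a YAML metadata file inside the zip, and unpacking into a directory unique to one run so a stale extraction cannot interfere. It uses only the Python standard library and is not how MFA itself does it. The function names are hypothetical, and the only assumption about the archive is that its metadata file ends in .yaml or .yml.

```python
import tempfile
import zipfile
from pathlib import Path


def find_yaml_members(model_zip):
    """List archive members that look like YAML metadata files."""
    with zipfile.ZipFile(model_zip) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith((".yaml", ".yml"))
        ]


def unpack_for_run(model_zip, run_root):
    """Extract the archive into a fresh directory created for this run only."""
    # mkdtemp creates a new, uniquely named directory on every call, so an
    # older, incomplete extraction is never reused.
    run_dir = Path(tempfile.mkdtemp(prefix="model_", dir=run_root))
    with zipfile.ZipFile(model_zip) as archive:
        archive.extractall(run_dir)
    return run_dir
```

If find_yaml_members returns an empty list for a given english.zip, the archive itself lacks the metadata. If it returns a file but the error persists, a stale unpacked copy is the more likely culprit.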

khiajohnson commented 3 years ago

It works now! Thanks!

MalcolmMashig commented 2 years ago

I'm getting the same message whether validating or aligning. Was there a clear solution for this? Please let me know. Thanks! (on Mac Air M1)

I am running on a very small set of audio files -- not sure if that could be an issue.