MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

[BUG] --speaker_characters results in KeyError #669

Closed thealk closed 1 year ago

thealk commented 1 year ago

Debugging checklist

[X] Have you updated to latest MFA version?
[X] Have you tried rerunning the command with the --clean flag?

Describe the issue

When running mfa align with the --speaker_characters (-s) flag for speaker adaptation, alignment fails with a KeyError. The missing key is the ID of a single speaker ('oc01' in the output below). The same error occurs with mfa validate.

(aligner) thea MFA % mfa align -s 4 --clean /Users/thea/Documents/0_test english_us_arpa english_us_arpa /Users/thea/Documents/1_test_out

File name formats are, for example: oc01_[bla...].wav/.TextGrid, where the first 4 characters denote the speaker ID.
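As a quick sanity check on the naming scheme, the sketch below groups file names by the first N characters of their stem, which is how the speaker IDs are expected to be derived here. It is not MFA code: the function name `group_by_speaker_prefix` and the example file names are hypothetical, chosen only to mirror the oc01_ pattern.

```python
from collections import defaultdict
from pathlib import Path


def group_by_speaker_prefix(filenames, n_chars):
    """Group file names by the first n_chars characters of their stem.

    Hypothetical helper for checking a naming scheme; not part of MFA.
    """
    groups = defaultdict(list)
    for name in filenames:
        # Path(...).stem drops the directory and the final extension.
        speaker = Path(name).stem[:n_chars]
        groups[speaker].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Made-up file names following the oc01_... pattern.
    example = ["oc01_a.wav", "oc01_a.TextGrid", "oc02_a.wav"]
    for speaker, files in group_by_speaker_prefix(example, 4).items():
        print(speaker, files)
```

If the groups printed here match the intended speakers, the file names themselves are consistent with a 4-character speaker prefix.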

Command: mfa align --speaker_characters 4 --clean /Users/thea/Documents/0_test english_us_arpa english_us_arpa /Users/thea/Documents/1_test_out

Output:

 INFO     Setting up corpus information...                                      
 INFO     Loading corpus from source files...                                   
   1% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/100  [ 0:00:02 < -:--:-- , ? it/s ]
 INFO     Stopped parsing early (0.0847590000000018 seconds)                    
 ERROR    There was an error in the run, please see the log.                    
Exception ignored in atexit callback: <bound method ExitHooks.history_save_handler of <montreal_forced_aligner.command_line.mfa.ExitHooks object at 0x18bfbbbd0>>
Traceback (most recent call last):
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/command_line/mfa.py", line 98, in history_save_handler
    raise self.exception
  File "/Users/thea/miniconda3/envs/aligner/bin/mfa", line 10, in <module>
    sys.exit(mfa_cli())
             ^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/rich_click/rich_group.py", line 21, in main
    rv = super().main(*args, standalone_mode=False, **kwargs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/command_line/align.py", line 113, in align_corpus_cli
    aligner.align()
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/alignment/pretrained.py", line 412, in align
    self.setup()
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/alignment/pretrained.py", line 205, in setup
    self.load_corpus()
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/acoustic_corpus.py", line 1209, in load_corpus
    self._load_corpus()
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/base.py", line 1288, in _load_corpus
    self._load_corpus_from_source_mp()
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/acoustic_corpus.py", line 1023, in _load_corpus_from_source_mp
    import_data.add_objects(self.generate_import_objects(file))
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/thea/miniconda3/envs/aligner/lib/python3.11/site-packages/montreal_forced_aligner/corpus/base.py", line 1018, in generate_import_objects
    "speaker_id": self._speaker_ids[u.speaker_name],
                  ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^
KeyError: 'oc01'

For Reproducing your issue

Please fill out the following:

  1. Corpus structure
    • What language is the corpus in? English
    • How many files/speakers? Test case has 2 speakers and 5 files each (10 files in total)
    • Are you using lab files or TextGrid files for input? TextGrids
  2. Dictionary
    • Are you using a dictionary from MFA? If so, which one? Yes, english_us_arpa
    • If it's a custom dictionary, what is the phoneset?
  3. Acoustic model
    • If you're using an acoustic model, is it one download through MFA? If so, which one? Yes, english_us_arpa
    • If it's a model you've trained, what data was it trained on?

Log file

Please attach the log file for the run that encountered an error (by default these will be stored in ~/Documents/MFA).

0_test.log

Desktop (please complete the following information):

Additional context

thealk commented 1 year ago

Some additional things I've tried that result in the same errors now also include:

thealk commented 1 year ago

Another update: a further problem was that MFA was not detecting my distinct speakers correctly (or at least not in a way I understand). When I ran mfa align without specifying -s, it consistently detected 2 speakers, regardless of whether there were 10 different file names or whether the files were in distinct directories.

Revising my previous update, because the original solution I came up with (concatenate the files and assign one tier per speaker, named with the speaker ID) was more complicated than necessary. I erroneously thought the simpler solution below (rename the tiers in your original transcript TextGrids to match the speaker IDs) wasn't working, but I must have done something wrong in my initial tests, because it works now.

The ULTIMATE solution, which I am now happy with, is to make sure the transcription tier in EACH FILE has a name corresponding to the speaker. For a larger corpus with many files per speaker, this means every file for a given speaker has the same tier name, but no two speakers share a tier name. For a smaller corpus with one file per speaker, each file contains a tier whose name matches the file name.

For anyone working in Praat, I wrote a tiny script to rename tiers according to speaker_chars in the filenames: https://github.com/thealk/PraatScripts/blob/master/mfa_prep/rename_tiers_to_speaker.praat
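For anyone not working in Praat, the same tier-renaming workaround can be sketched in Python. This is a rough sketch under several assumptions, not a port of the Praat script: it assumes long-format TextGrid files in which each tier name appears on a line of the form name = "...", one transcription tier per file (every tier in a file receives the same name), and UTF-8 encoding. The function names `rename_tiers` and `rename_tiers_in_dir` are hypothetical.

```python
import re
from pathlib import Path

# Matches a tier-name line in a long-format TextGrid, e.g.  name = "words"
# Assumption: no interval text contains a line that starts with name = "...".
TIER_NAME = re.compile(r'^(\s*name\s*=\s*)"[^"]*"', re.MULTILINE)


def rename_tiers(textgrid_text, speaker):
    """Return the TextGrid text with every tier name replaced by speaker."""
    return TIER_NAME.sub(lambda m: f'{m.group(1)}"{speaker}"', textgrid_text)


def rename_tiers_in_dir(corpus_dir, n_chars, encoding="utf-8"):
    """Rewrite each *.TextGrid in corpus_dir in place.

    The new tier name is the first n_chars characters of the file stem.
    Work on a copy of the corpus: files are overwritten.
    """
    for path in Path(corpus_dir).glob("*.TextGrid"):
        speaker = path.stem[:n_chars]
        text = path.read_text(encoding=encoding)
        path.write_text(rename_tiers(text, speaker), encoding=encoding)
```

Calling `rename_tiers_in_dir("path/to/corpus_copy", 4)` would name the tier in oc01_[...].TextGrid "oc01". TextGrids saved by Praat in UTF-16 or in the short text format would need a different `encoding` argument or a proper TextGrid parser.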

I am leaving this issue open because the original problem, namely that the -s flag consistently results in an error, is still unresolved. However, the workaround is easy to implement.