MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License
1.33k stars 246 forks

[ERROR] Getting MultiprocessingError, KaldiProcessingError when attempting to align a 2-speaker audio file #590

Closed hailthedawn closed 1 year ago

hailthedawn commented 1 year ago

I tried to run the sample Colab notebook (https://gist.github.com/NTT123/12264d15afad861cb897f7a20a01762e) locally (as a .py file). It works fine when I use it with the provided ljspeech data. However, when I try it with my own data (a 16 kHz wav file and its .txt transcript), I get the following error:

(The audio file is around 4 minutes long, if that's relevant.)

$ mfa align -t ./temp -j 2 ./temp_mfa modified_librispeech-lexicon.txt ./english.zip ./ljs_alignedcapstone
 WARNING  The previous run had a different configuration than the current, which may cause issues. Please see the log for details or use --clean flag if issues 
          are encountered.
 WARNING  The previous run had a different configuration than the current, which may cause issues. Please see the log for details or use --clean flag if issues 
          are encountered.
 INFO     Setting up corpus information...
 INFO     Found 1 speaker across 1 file, average number of utterances per speaker: 1.0
 INFO     Initializing multiprocessing jobs...
 WARNING  Number of jobs was specified as 2, but due to only having 1 speakers, MFA will only use 1 jobs. Use the --single_speaker flag if you would like to    
          split utterances across jobs regardless of their speaker.
 INFO     Text already normalized.
 INFO     Creating corpus split with features...
 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/1  [ 0:00:04 < 0:00:00 , ? it/s ]
 INFO     Features already generated.
 INFO     Compiling training graphs...
 100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1/1  [ 0:00:04 < 0:00:00 , ? it/s ]
 INFO     Performing first-pass alignment...
 INFO     Generating alignments...
   0% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/1  [ 0:00:14 < -:--:-- , ? it/s ]
 ERROR    There was an error in the run, please see the log.
Exception ignored in atexit callback: <bound method ExitHooks.history_save_handler of <montreal_forced_aligner.command_line.mfa.ExitHooks object at 0x0000016DD15D2410>>
Traceback (most recent call last):
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\command_line\mfa.py", line 97, in history_save_handler
    raise self.exception
  File "C:\Users\Ketaki\anaconda3\envs\aligner\Scripts\mfa-script.py", line 10, in <module>
    sys.exit(mfa_cli())
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\rich_click\rich_group.py", line 21, in main
    rv = super().main(*args, standalone_mode=False, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\command_line\align.py", line 113, in align_corpus_cli
    aligner.align()
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\alignment\pretrained.py", line 412, in align
    super().align()
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\alignment\base.py", line 345, in align
    self.align_utterances()
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\alignment\mixins.py", line 436, in align_utterances
    for utterance, log_likelihood in run_kaldi_function(
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\utils.py", line 753, in run_kaldi_function
    raise v
montreal_forced_aligner.exceptions.MultiprocessingError: MultiprocessingError:

Job 1 encountered an error:
Traceback (most recent call last):

  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\abc.py", line 85, in run
    yield from self._run()

  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\alignment\multiprocessing.py", line 955, in _run
    self.check_call(align_proc)

  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\abc.py", line 112, in check_call
    raise KaldiProcessingError([self.log_path])

montreal_forced_aligner.exceptions.KaldiProcessingError: KaldiProcessingError:

There were 1 job(s) with errors when running Kaldi binaries.
See the log files below for more information.
temp\temp_mfa\alignment\log\align.1.log

hailthedawn commented 1 year ago

My align.1.log file:

'C:\Users\Ketaki\anaconda3\envs\aligner\Library\bin\gmm-boost-silence.EXE' --boost=1.0 1 'temp\temp_mfa\alignment\final.mdl' - 
'C:\Users\Ketaki\anaconda3\envs\aligner\Library\bin\gmm-align-compiled.EXE' --transition-scale=1.0 --acoustic-scale=0.083333 --self-loop-scale=0.1 --beam=10 --retry-beam=40 --careful=false '--write-per-frame-acoustic-loglikes=ark:temp\temp_mfa\alignment\like.1.1.ark' - 'ark,s,cs:temp\temp_mfa\alignment\fsts.1.1.ark' 'ark,s,cs:add-deltas scp,s,cs:"temp\temp_mfa\temp_mfa\split2\feats.1.1.scp" ark:- |' 'ark:temp\temp_mfa\alignment\ali.1.1.ark' ark,t:- 
WARNING (gmm-boost-silence.EXE[5.5.1016]:main():gmmbin\gmm-boost-silence.cc:82) The pdfs for the silence phones may be shared by other phones (note: this probably does not matter.)
LOG (gmm-boost-silence.EXE[5.5.1016]:main():gmmbin\gmm-boost-silence.cc:93) Boosted weights for 5 pdfs, by factor of 1
LOG (gmm-boost-silence.EXE[5.5.1016]:main():gmmbin\gmm-boost-silence.cc:103) Wrote model to -
add-deltas 'scp,s,cs:temp\temp_mfa\temp_mfa\split2\feats.1.1.scp' ark:- 
LOG (gmm-align-compiled.EXE[5.5.1016]:main():gmmbin\gmm-align-compiled.cc:127) 1-1
WARNING (gmm-align-compiled.EXE[5.5.1016]:kaldi::AlignUtteranceWrapper():decoder\decoder-wrappers.cc:617) Retrying utterance 1-1 with beam 40
WARNING (gmm-align-compiled.EXE[5.5.1016]:kaldi::AlignUtteranceWrapper():decoder\decoder-wrappers.cc:626) Did not successfully decode file 1-1, len = 25547
LOG (gmm-align-compiled.EXE[5.5.1016]:main():gmmbin\gmm-align-compiled.cc:135) Overall log-likelihood per frame is -nan(ind) over 0 frames.
LOG (gmm-align-compiled.EXE[5.5.1016]:main():gmmbin\gmm-align-compiled.cc:137) Retried 1 out of 1 utterances.
LOG (gmm-align-compiled.EXE[5.5.1016]:main():gmmbin\gmm-align-compiled.cc:139) Done 0, errors on 1

mmcauliffe commented 1 year ago

Ah, right, can you try rerunning with a higher beam width (mfa align ... --beam 100) and see if it succeeds? 4 minutes is a bit on the long side, but a beam width of 100 should be enough. You can also boost it even higher; the default is 10, which is pretty strict but faster.

hailthedawn commented 1 year ago

Thanks for the reply! No, I'm getting the same issue when I run: mfa align -t ./temp -j 2 ./temp_mfa modified_librispeech-lexicon.txt ./english.zip ./ljs_alignedcapstone --beam 100

I even tried 300 but no go.

mmcauliffe commented 1 year ago

Hmm, can you try running mfa download acoustic english_us_arpa and mfa download dictionary english_us_arpa, and then running mfa align -t ./temp -j 2 ./temp_mfa english_us_arpa english_us_arpa ./ljs_alignedcapstone --beam 100? The MFA 1.0 model that the notebook uses is well out of date, and there have been a number of improvements to the models and dictionaries since then.

hailthedawn commented 1 year ago

Downloaded the model and dictionary, and am now getting this error:

$ mfa align -t ./temp -j 2 ./temp_mfa english_us_arpa english_us_arpa ./ljs_alignedcapstone --beam 100
 WARNING  The previous run had a different configuration than the current, which may cause issues. Please see the log for details or use --clean flag if issues 
          are encountered.
 WARNING  The previous run had a different configuration than the current, which may cause issues. Please see the log for details or use --clean flag if issues 
          are encountered.
 ERROR    There was an error in the run, please see the log.
Exception ignored in atexit callback: <bound method ExitHooks.history_save_handler of <montreal_forced_aligner.command_line.mfa.ExitHooks object at 0x00000263365E6410>>
Traceback (most recent call last):
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\command_line\mfa.py", line 97, in history_save_handler
    raise self.exception
  File "C:\Users\Ketaki\anaconda3\envs\aligner\Scripts\mfa-script.py", line 10, in <module>
    sys.exit(mfa_cli())
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\rich_click\rich_group.py", line 21, in main
    rv = super().main(*args, standalone_mode=False, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\command_line\align.py", line 113, in align_corpus_cli
    aligner.align()
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\alignment\pretrained.py", line 411, in align
    self.setup()
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\alignment\pretrained.py", line 205, in setup
    self.load_corpus()
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\corpus\acoustic_corpus.py", line 1205, in load_corpus
    self.dictionary_setup()
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\dictionary\multispeaker.py", line 555, in dictionary_setup
    conn.execute(sqlalchemy.insert(Word.__table__), word_objs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\sqlalchemy\engine\base.py", line 1414, in execute
    return meth(
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\sqlalchemy\sql\elements.py", line 486, in _execute_on_connection
    return connection._execute_clauseelement(
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\sqlalchemy\engine\base.py", line 1638, in _execute_clauseelement
    ret = self._execute_context(
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\sqlalchemy\engine\base.py", line 1837, in _execute_context
    return self._exec_insertmany_context(
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\sqlalchemy\engine\base.py", line 2103, in _exec_insertmany_context
    self._handle_dbapi_exception(
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\sqlalchemy\engine\base.py", line 2326, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\sqlalchemy\engine\base.py", line 2100, in _exec_insertmany_context
    dialect.do_execute(cursor, sub_stmt, sub_params, context)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\sqlalchemy\engine\default.py", line 748, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "word_pkey"
DETAIL:  Key (id)=(1) already exists.

[SQL: INSERT INTO word (id, mapping_id, word, count, word_type, dictionary_id) VALUES (%(id__0)s, %(mapping_id__0)s, %(word__0)s, %(count__0)s, %(word_type__0)s, %(dictionary_id__0)s), (%(id__1)s, %(mapping_id__1)s, %(word__1)s, %(count__1)s, %(word_type__ ... 110068 characters truncated ... 9)s, %(mapping_id__999)s, %(word__999)s, %(count__999)s, %(word_type__999)s, %(dictionary_id__999)s)]
[parameters: {'word__0': '<eps>', 'word_type__0': 'silence', 'count__0': 0, 'id__0': 1, 'mapping_id__0': 0, 'dictionary_id__0': 2, 'word__1': "'d", 'word_type__1': 'clitic', 'count__1': 0, 'id__1': 2, 'mapping_id__1': 1, 'dictionary_id__1': 2, 'word__2': "'ll", 'word_type__2': 'clitic', 'count__2': 0, 'id__2': 3, 'mapping_id__2': 2, 'dictionary_id__2': 2, 'word__3': "'re", 'word_type__3': 'clitic', 'count__3': 0, 'id__3': 4, 'mapping_id__3': 3, 'dictionary_id__3': 2, 'word__4': "'s", 'word_type__4': 'clitic', 'count__4': 0, 'id__4': 5, 'mapping_id__4': 4, 'dictionary_id__4': 2, 'word__5': "'ve", 'word_type__5': 'clitic', 'count__5': 0, 'id__5': 6, 'mapping_id__5': 5, 'dictionary_id__5': 2, 'word__6': 'a', 'word_type__6': 'speech', 'count__6': 0, 'id__6': 7, 'mapping_id__6': 6, 'dictionary_id__6': 2, 'word__7': "a''s", 'word_type__7': 'speech', 'count__7': 0, 'id__7': 8, 'mapping_id__7': 7, 'dictionary_id__7': 2, 'word__8': "a'body", 'word_type__8': 'speech' ... 5900 parameters truncated ... 
'mapping_id__991': 991, 'dictionary_id__991': 2, 'word__992': 'achiever', 'word_type__992': 'speech', 'count__992': 0, 'id__992': 993, 'mapping_id__992': 992, 'dictionary_id__992': 2, 'word__993': 'achievers', 'word_type__993': 'speech', 'count__993': 0, 'id__993': 994, 'mapping_id__993': 993, 'dictionary_id__993': 2, 'word__994': 'achieves', 'word_type__994': 'speech', 'count__994': 0, 'id__994': 995, 'mapping_id__994': 994, 'dictionary_id__994': 2, 'word__995': 'achieving', 'word_type__995': 'speech', 'count__995': 0, 'id__995': 996, 'mapping_id__995': 995, 'dictionary_id__995': 2, 'word__996': 'achill', 'word_type__996': 'speech', 'count__996': 0, 'id__996': 997, 'mapping_id__996': 996, 'dictionary_id__996': 2, 'word__997': 'achillas', 'word_type__997': 'speech', 'count__997': 0, 'id__997': 998, 'mapping_id__997': 997, 'dictionary_id__997': 2, 'word__998': 'achille', 'word_type__998': 'speech', 'count__998': 0, 'id__998': 999, 'mapping_id__998': 998, 'dictionary_id__998': 2, 'word__999': "achille's", 'word_type__999': 'speech', 'count__999': 0, 'id__999': 1000, 'mapping_id__999': 999, 'dictionary_id__999': 2}]
(Background on this error at: https://sqlalche.me/e/20/gkpj)

mmcauliffe commented 1 year ago

Right, include the --clean flag via: mfa align -t ./temp -j 2 ./temp_mfa english_us_arpa english_us_arpa ./ljs_alignedcapstone --beam 100 --clean, since it's still expecting the same dictionary with that dataset.

hailthedawn commented 1 year ago

Getting this now:

$ mfa align -t ./temp -j 2 ./temp_mfa english_us_arpa english_us_arpa ./ljs_alignedcapstone --beam 100 --clean
 ERROR    There was an error connecting to the global MFA database server.
 ERROR    Please ensure the server is initialized (mfa server init) or running (mfa server start)
 ERROR    There was an error in the run, please see the log.
Exception ignored in atexit callback: <bound method ExitHooks.history_save_handler of <montreal_forced_aligner.command_line.mfa.ExitHooks object at 0x000001AACCA46410>>
Traceback (most recent call last):
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\command_line\mfa.py", line 97, in history_save_handler
    raise self.exception
  File "C:\Users\Ketaki\anaconda3\envs\aligner\Scripts\mfa-script.py", line 10, in <module>
    sys.exit(mfa_cli())
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\rich_click\rich_group.py", line 21, in main
    rv = super().main(*args, standalone_mode=False, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\click\decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\command_line\align.py", line 113, in align_corpus_cli
    aligner.align()
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\alignment\pretrained.py", line 405, in align
    self.initialize_database()
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\site-packages\montreal_forced_aligner\abc.py", line 241, in initialize_database
    subprocess.check_call(
  File "C:\Users\Ketaki\anaconda3\envs\aligner\lib\subprocess.py", line 369, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['createdb', '--host=C:/Users/Ketaki/Documents/MFA/pg_mfa_global_socket', 'temp_mfa']' returned non-zero exit status 1.

mmcauliffe commented 1 year ago

Hmm, is this on the same machine where you ran mfa configure --enable_auto_server? Is there anything in the pg logs of the temp directory?

hailthedawn commented 1 year ago

Ran the auto_server and it's still the same error. This is my align.1.log:

'C:\Users\Ketaki\anaconda3\envs\aligner\Library\bin\gmm-boost-silence.EXE' --boost=1.0 1 'temp\temp_mfa\alignment\final.alimdl' - 
'C:\Users\Ketaki\anaconda3\envs\aligner\Library\bin\gmm-align-compiled.EXE' --transition-scale=1.0 --acoustic-scale=0.083333 --self-loop-scale=0.1 --beam=100 --retry-beam=40 --careful=false '--write-per-frame-acoustic-loglikes=ark:temp\temp_mfa\alignment\like.1.1.ark' - 'ark,s,cs:temp\temp_mfa\alignment\fsts.1.1.ark' 'ark,s,cs:splice-feats --left-context=3 --right-context=3 scp,s,cs:"temp\temp_mfa\temp_mfa\split2\feats.1.1.scp" ark:- | transform-feats "temp\temp_mfa\alignment\lda.mat" ark:- ark:- |' 'ark:temp\temp_mfa\alignment\ali.1.1.ark' ark,t:- 
WARNING (gmm-boost-silence.EXE[5.5.1016]:main():gmmbin\gmm-boost-silence.cc:82) The pdfs for the silence phones may be shared by other phones (note: this probably does not matter.)
LOG (gmm-boost-silence.EXE[5.5.1016]:main():gmmbin\gmm-boost-silence.cc:93) Boosted weights for 5 pdfs, by factor of 1
LOG (gmm-boost-silence.EXE[5.5.1016]:main():gmmbin\gmm-boost-silence.cc:103) Wrote model to -
splice-feats --left-context=3 --right-context=3 'scp,s,cs:temp\temp_mfa\temp_mfa\split2\feats.1.1.scp' ark:- 
transform-feats 'temp\temp_mfa\alignment\lda.mat' ark:- ark:- 
LOG (transform-feats[5.5.1016]:main():featbin\transform-feats.cc:158) Overall average [pseudo-]logdet is -89.6349 over 25547 frames.
LOG (transform-feats[5.5.1016]:main():featbin\transform-feats.cc:161) Applied transform to 1 utterances; 0 had errors.
LOG (gmm-align-compiled.EXE[5.5.1016]:main():gmmbin\gmm-align-compiled.cc:127) 1-1
ERROR (gmm-align-compiled.EXE[5.5.1016]:kaldi::AlignUtteranceWrapper():decoder\decoder-wrappers.cc:594) Beams do not make sense: beam 100, retry-beam 40
kaldi::KaldiFatalError

This is my pg-log-global: https://gist.github.com/hailthedawn/886ffcdee7cd8593c7c48dcb48e2ac7f

pg_init_log_global reports no errors.
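
Note that the align.1.log above stops at an argument check rather than during decoding: the logged command carries --beam=100 together with --retry-beam=40, and the error reads "Beams do not make sense: beam 100, retry-beam 40". Below is a minimal sketch of a pre-flight check in Python. It assumes, based only on that message and on the earlier run where beam 10 with retry-beam 40 was accepted, that a nonzero retry beam has to be larger than the beam. `beams_make_sense` and `beams_from_command` are hypothetical helpers, not part of MFA or Kaldi, and the two command strings are shortened from the logs above.

```python
import re


def beams_make_sense(beam: float, retry_beam: float) -> bool:
    """Assumed rule: beam must be positive, and a nonzero retry beam must exceed it."""
    if beam <= 0:
        return False
    return retry_beam == 0 or retry_beam > beam


def beams_from_command(command: str) -> tuple[float, float]:
    """Pull --beam and --retry-beam out of a logged gmm-align-compiled command line."""
    beam = re.search(r"--beam=([0-9.]+)", command)
    retry = re.search(r"--retry-beam=([0-9.]+)", command)
    if beam is None or retry is None:
        raise ValueError("command line does not set both --beam and --retry-beam")
    return float(beam.group(1)), float(retry.group(1))


# Shortened versions of the two logged commands in this thread.
first_run = "gmm-align-compiled --beam=10 --retry-beam=40 --careful=false"
second_run = "gmm-align-compiled --beam=100 --retry-beam=40 --careful=false"

for cmd in (first_run, second_run):
    beam, retry = beams_from_command(cmd)
    print(beam, retry, beams_make_sense(beam, retry))
```

If that assumption holds, raising the beam would also require a correspondingly larger retry beam; how to set the retry beam from the mfa command line is not shown in this thread.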

hailthedawn commented 1 year ago

@mmcauliffe Hey! Any idea what might be happening here?

hailthedawn commented 1 year ago

I have figured out that when I align only a section of the audio, and that section doesn't contain any disfluencies (e.g., "mmhmm"), it does not crash and runs properly. As soon as I use a section of the audio that has a disfluency, it crashes, even if my transcript doesn't contain the disfluency. I'm still not sure how to make it work for the full audio (I don't want to take out all disfluencies). I can try adding all disfluencies in my data to the ARPA dictionary, but I'm not sure that would work if the phones aren't present in ARPA.

Note: The audio also contains "um" and "uh", and I haven't verified whether it crashes for those yet. It currently crashes for "mmhmm".
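
To see which tokens would need dictionary entries once the disfluencies are transcribed, one option is to list every transcript word that has no entry in the pronunciation dictionary. Below is a minimal sketch in Python. It assumes a plain-text dictionary with one entry per line whose first whitespace-separated field is the word, and it uses a simplified lowercase/strip-punctuation normalization that may not match what MFA does internally. `load_dictionary_words` and `find_missing_words` are hypothetical helpers, and the toy dictionary and transcript are invented for illustration.

```python
import string


def load_dictionary_words(lines):
    """Collect the first field of each non-empty dictionary line (assumed format)."""
    words = set()
    for line in lines:
        fields = line.split()
        if fields:
            words.add(fields[0].lower())
    return words


def find_missing_words(transcript, dictionary_words):
    """Return transcript tokens, in first-seen order, that the dictionary lacks."""
    # Keep apostrophes so clitics such as "'s" survive; strip other edge punctuation.
    strip_chars = string.punctuation.replace("'", "")
    missing = []
    for token in transcript.lower().split():
        token = token.strip(strip_chars)
        if token and token not in dictionary_words and token not in missing:
            missing.append(token)
    return missing


# Toy dictionary and transcript, invented for illustration.
toy_dictionary = ["yeah Y AE1", "i AY1", "know N OW1", "um AH1 M"]
toy_transcript = "Mmhmm, yeah, I know. Um, mmhmm."

print(find_missing_words(toy_transcript, load_dictionary_words(toy_dictionary)))
```

With a real dictionary, the lines could come from iterating over the opened dictionary file instead of the toy list.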

mmcauliffe commented 1 year ago

Right, so the alignment algorithm currently assumes that the transcripts contain all of the words it's looking for, including disfluencies and filled pauses. I have some code that I'm playing around with for designating some words or multiword sequences (like "you know" or "I mean" in English) as interjections that might not be transcribed, because the Japanese corpus I'm working on often doesn't include them, but there are still some implementation issues I need to figure out.

I didn't realize initially that the file had multiple speakers, which may also cause issues with speaker adaptation later, so I would recommend breaking up the file and using the TextGrid input format described here: https://montreal-forced-aligner.readthedocs.io/en/latest/user_guide/corpus_structure.html#textgrid-format so that you can assign intervals to each speaker. I have an early, early alpha build of the program that I use to create and fix corpora here: https://anchor-annotator.readthedocs.io/en/latest/, and you can quickly split up utterances in it.

Additionally, if you have more data for each speaker than just this file, you'll likely get better alignments overall.

hailthedawn commented 1 year ago

Hi, thank you! I stuck to doing text-based alignment, and was able to split the file per speaker turn, improve my transcriptions, and now have only ~20 alignments failing out of 1030. (By failing, I mean the TextGrid doesn't get generated).

However, for quite a few of the utterances that are short (1 word long), silence isn't detected properly near the end of the audio file. In some of them, a single laugh is reported as taking up a full 20 seconds (which was the length of the segmented file I passed in). When I play the audio manually, the laugh takes up all of a half second. I tried fiddling with the boost_silence parameter, even going up as high as 70, but did not see much improvement (maybe 0.5 seconds or so). Do you recommend I manually go through all of these, or try to use longer utterances?
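
Rather than checking every file by hand, one option is to flag only the aligned words whose duration looks implausible and review those. Below is a minimal sketch in Python. It assumes the word intervals have already been read from the output TextGrids into (word, start, end) tuples, and the 2-second threshold is an arbitrary illustrative value. `flag_long_words` is a hypothetical helper, and the sample values are invented to mirror the laugh case described above.

```python
def flag_long_words(word_intervals, max_seconds=2.0):
    """Return (word, start, end) tuples whose duration exceeds max_seconds.

    Empty labels (silence) are skipped, since long silences are expected.
    """
    return [
        (word, start, end)
        for word, start, end in word_intervals
        if word and (end - start) > max_seconds
    ]


# Invented values: a short silence, then one word stretched over the rest of a 20 s file.
aligned = [("", 0.0, 0.3), ("laugh", 0.3, 20.0)]
print(flag_long_words(aligned))
```

This only narrows down which files to inspect; it does not fix the alignment itself.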

299792459b commented 1 year ago

@hailthedawn

Hi there,

How are you able to get the notebook to work? It fails on the last step for me. See below:

The global MFA database server does not exist, initializing it first.
pg_ctl stdout:
pg_ctl stderr:
initdb: error: cannot be run as root
initdb: hint: Please log in (using, e.g., "su") as the (unprivileged) user that will own the server process.

Traceback (most recent call last):
  File "/tmp/mfa/miniconda3/envs/aligner/bin/mfa", line 10, in <module>
    sys.exit(mfa_cli())
  File "/tmp/mfa/miniconda3/envs/aligner/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/tmp/mfa/miniconda3/envs/aligner/lib/python3.10/site-packages/rich_click/rich_group.py", line 21, in main
    rv = super().main(*args, standalone_mode=False, **kwargs)
  File "/tmp/mfa/miniconda3/envs/aligner/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/tmp/mfa/miniconda3/envs/aligner/lib/python3.10/site-packages/click/core.py", line 1654, in invoke
    super().invoke(ctx)
  File "/tmp/mfa/miniconda3/envs/aligner/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/tmp/mfa/miniconda3/envs/aligner/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/tmp/mfa/miniconda3/envs/aligner/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/tmp/mfa/miniconda3/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/command_line/mfa.py", line 146, in mfa_cli
    start_server()
  File "/tmp/mfa/miniconda3/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/command_line/utils.py", line 408, in start_server
    initialize_server()
  File "/tmp/mfa/miniconda3/envs/aligner/lib/python3.10/site-packages/montreal_forced_aligner/command_line/utils.py", line 320, in initialize_server
    raise DatabaseError(
montreal_forced_aligner.exceptions.DatabaseError: DatabaseError:

There was an error encountered starting the global MFA database server, please see /root/Documents/MFA/pg_init_log_global.txt for more details and/or look at the logged errors above.
See output files at ./ljs_aligned

ChosenOne23 commented 1 year ago

I got the same problem (There was an error encountered starting the global MFA database server). How can I fix it?

frouhi commented 1 year ago

initdb: hint: Please log in (using, e.g., "su") as the (unprivileged) user that will own the server process.

Having the same issue! There has to be a way to run it without sudo, right?

mmcauliffe commented 1 year ago

@299792459b It looks like you're running on docker, in which case I would recommend using https://hub.docker.com/repository/docker/mmcauliffe/montreal-forced-aligner/general or looking at https://montreal-forced-aligner.readthedocs.io/en/latest/installation.html#installing-mfa-in-your-own-containers. MFA should not be running as root.
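
Since the initdb message above ("cannot be run as root") comes from running under the root account, a wrapper script can check for that before launching MFA. Below is a minimal sketch in Python, assuming a POSIX system where root has effective user id 0. `running_as_root` is a hypothetical helper and is not part of MFA.

```python
import os
import sys


def running_as_root() -> bool:
    """True when the effective user id is 0; False where geteuid is unavailable (e.g. Windows)."""
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0


if running_as_root():
    print(
        "Running as root: initdb will refuse to start; switch to an unprivileged user.",
        file=sys.stderr,
    )
else:
    print("Not running as root.")
```

In a container, this usually means creating and switching to a non-root user before invoking mfa, which is what the linked container instructions cover.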

Not sure if others hitting this are in the same situation, but I am going to close this, as the original issue should be solved, or at least improved, with the latest MFA model. The root cause is really that MFA relies heavily on accurate speaker labels. If you are encountering errors as a root user and the links above don't help, feel free to open a new issue.