MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License
1.26k stars 242 forks

[BUG] MFA 3.0.6 unable to open database file #806

Closed: my-yy closed this issue 3 days ago

my-yy commented 1 month ago

Describe the issue: MFA version 3.0.6. MFA crashed while training the acoustic model:

Traceback (most recent call last):
  File "/zhangpai21/envs/aligner3/bin/mfa", line 10, in <module>
    sys.exit(mfa_cli())
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/rich_click/rich_command.py", line 126, in main
    rv = self.invoke(ctx)
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/montreal_forced_aligner/command_line/train_acoustic_model.py", line 144, in train_acoustic_model_cli
    trainer.train()
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/montreal_forced_aligner/acoustic_modeling/trainer.py", line 529, in train
    self.align()
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/montreal_forced_aligner/acoustic_modeling/trainer.py", line 713, in align
    session.query(CorpusWorkflow).filter(CorpusWorkflow.id == wf.id).update(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3251, in update
    result: CursorResult[Any] = self.session.execute(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2306, in execute
    return self._execute_internal(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2191, in _execute_internal
    result: Result[Any] = compile_state_cls.orm_execute_statement(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/orm/bulk_persistence.py", line 1617, in orm_execute_statement
    return super().orm_execute_statement(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/orm/context.py", line 293, in orm_execute_statement
    result = conn.execute(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1422, in execute
    return meth(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 514, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1644, in _execute_clauseelement
    ret = self._execute_context(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1850, in _execute_context
    return self._exec_single_context(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1990, in _exec_single_context
    self._handle_dbapi_exception(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2357, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1971, in _exec_single_context
    self.dialect.do_execute(
  File "/zhangpai21/envs/aligner3/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 919, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) unable to open database file
[SQL: UPDATE corpus_workflow SET dirty=? WHERE corpus_workflow.id = ?]
[parameters: (1, 16)]
(Background on this error at: https://sqlalche.me/e/20/e3q8)

For Reproducing your issue

Training Command:

path_corpus=/zhangpai21/webdataset/audio/fma_train_data/select_wav3kh
path_dict=/zhangpai21/workspace/cgy/1_projects/1_valle/lft_rep/test_mfa/my4.dict
path_model=/zhangpai21/workspace/cgy/1_projects/1_valle/lft_rep/test_mfa/model_trained_3kh
mfa train $path_corpus $path_dict $path_model --clean --num_jobs 100  --single_speaker
1. Corpus structure: path_corpus contains the .wav and .lab files. The audio is in Mandarin, and each .lab file holds the phoneme sequence generated by a private G2P model. The folder contains 4,289,530 audio files, about 3k hours in total. Lab file example:
    m ei3 d ao4 q van2 vn4 h uei4 sp q ing1 vn4 h uei4 d eng3 q van2 g uo2 x ing4 sp d a4 x ing2 sp s an1 sh iii4 j v3 b an4 q i1 sp

Since G2P is not needed, I use a one-to-one mapping dictionary based on the solution in this issue. Dict example:

<pad>   <pad>
<unk>   <unk>
AA0 AA0
AA1 AA1
AA2 AA2
AE0 AE0
AE1 AE1
AE2 AE2
AH0 AH0
AH1 AH1
AH2 AH2
...
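
A one-to-one dictionary like the one above could be generated from the corpus itself. The following is only a sketch of one way to do it, not the script used in this issue: the function name build_identity_dict is hypothetical, it assumes every .lab file holds whitespace-separated phoneme tokens, and it writes each token followed by a tab and the same token, mirroring the layout of the example. It does not add special entries such as <pad> or <unk>.

```python
import tempfile
from pathlib import Path


def build_identity_dict(corpus_dir, dict_path):
    """Map every token found in the corpus .lab files to itself.

    Hypothetical helper: assumes .lab files contain whitespace-separated
    phoneme tokens, as in the lab file example in this issue.
    Returns the sorted list of tokens that were written.
    """
    tokens = set()
    for lab_file in Path(corpus_dir).rglob("*.lab"):
        tokens.update(lab_file.read_text(encoding="utf-8").split())

    ordered = sorted(tokens)
    with open(dict_path, "w", encoding="utf-8") as out:
        for token in ordered:
            # One entry per line: the "word" and its single-phone pronunciation.
            out.write(f"{token}\t{token}\n")
    return ordered


if __name__ == "__main__":
    # Small self-contained demo on a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "utt1.lab").write_text("m ei3 d ao4 sp\n", encoding="utf-8")
        dict_file = Path(tmp, "identity.dict")
        print(build_identity_dict(tmp, dict_file))
        print(dict_file.read_text(encoding="utf-8"))
```

Building the dictionary from the .lab files themselves means every token in the corpus has an entry. With several million files the scan is slow, so in practice it may be enough to write the known phoneme inventory directly.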

Platform:

fncokg commented 3 weeks ago

MFA uses a SQLite database to store its data, so make sure the SQLite database file actually exists, and maybe try sudo mfa train ... to check that you really have permission to read and write that file.
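
The existence and permission checks suggested above can also be done without sudo. This is a minimal sketch using only the Python standard library. The function name check_sqlite_file is hypothetical, and the thread does not say where MFA keeps its database file, so db_path is a placeholder you would have to fill in yourself. The sketch also checks the parent directory, because SQLite creates its journal files next to the database.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def check_sqlite_file(db_path):
    """Return a list of problems found with a SQLite file (empty if none).

    Hypothetical helper, not part of MFA. db_path is whatever database
    file you want to inspect.
    """
    problems = []
    path = Path(db_path).resolve()
    parent = path.parent

    if not parent.is_dir():
        return [f"parent directory does not exist: {parent}"]
    # SQLite writes journal files beside the database, so the directory
    # itself must be writable and searchable.
    if not os.access(parent, os.W_OK | os.X_OK):
        problems.append(f"no write access to directory: {parent}")
    if not path.is_file():
        problems.append(f"database file does not exist: {path}")
        return problems
    if not os.access(path, os.R_OK | os.W_OK):
        problems.append(f"no read/write access to file: {path}")

    try:
        # mode=rw opens an existing file read-write and never creates one.
        conn = sqlite3.connect(f"{path.as_uri()}?mode=rw", uri=True)
    except sqlite3.OperationalError as exc:
        problems.append(f"sqlite3 could not open the file: {exc}")
        return problems
    try:
        # Force SQLite to actually read the file header.
        conn.execute("PRAGMA schema_version").fetchone()
    except sqlite3.DatabaseError as exc:
        problems.append(f"sqlite3 could not read the file: {exc}")
    finally:
        conn.close()
    return problems


if __name__ == "__main__":
    # Demo on a throwaway database; replace with the real path to inspect.
    with tempfile.TemporaryDirectory() as tmp:
        demo_db = os.path.join(tmp, "example.db")
        conn = sqlite3.connect(demo_db)
        conn.execute("CREATE TABLE t (x)")
        conn.commit()
        conn.close()
        print(check_sqlite_file(demo_db) or "no problems found")
```

An empty result only shows that the file can be opened by the current user at that moment. It does not rule out failures that appear only while many jobs are running.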