MontrealCorpusTools / Montreal-Forced-Aligner

Command line utility for forced alignment using Kaldi
https://montrealcorpustools.github.io/Montreal-Forced-Aligner/
MIT License

[BUG] Tasks get stuck frequently #645

Closed: lifeiteng closed this issue 1 year ago

lifeiteng commented 1 year ago
2023-06-01 21:08:24,606 DEBUG [mixins.py:470] Alignment round took 557.842 seconds
 INFO     Calculating fMLLR for speaker adaptation...
2023-06-01 21:08:24,630 INFO [acoustic_corpus.py:739] Calculating fMLLR for speaker adaptation...
  56% _____________________________________________________________________________________________________________________________________________________ 199/355  [ 0:01:32 < 0:02:02 , 1 it/s ]
 100% _____________________________________________________________________________________________________________________________________________________ 355/355  [ 0:03:57 < 0:00:00 , 1 it/s ]
2023-06-01 21:12:23,580 DEBUG [acoustic_corpus.py:806] Fmllr calculation took 238.950 seconds
 INFO     Performing second-pass alignment...
2023-06-01 21:12:23,583 INFO [base.py:363] Performing second-pass alignment...
 INFO     Generating alignments...
2023-06-01 21:12:23,587 INFO [mixins.py:424] Generating alignments...
  26% ____________________________________________________________________________________________________________________________________________ 38,809/150,097  [ 0:21:39 < 0:14:11 , 131 it/s ]
  26% ____________________________________________________________________________________________________________________________________________ 38,809/150,097  [ 0:21:40 < 0:14:11 , 131 it/s ]
  26% ____________________________________________________________________________________________________________________________________________ 38,809/150,097  [ 0:23:43 < 0:14:11 , 131 it/s ]

pg_log_global.txt

2023-06-01 21:08:16.113 CST [23193] LOG:  duration: 6482.452 ms  statement: 
            UPDATE utterance
            SET
                 "alignment_log_likelihood" = b."alignment_log_likelihood"
            FROM temp_utterance AS b
            WHERE utterance.id=b.id;

2023-06-01 21:08:24.077 CST [22247] LOG:  duration: 7901.456 ms  statement: UPDATE utterance SET alignment_log_likelihood=(utterance.alignment_log_likelihood / CAST(utterance.num_frames AS NUMERIC)) WHERE utterance.alignment_log_likelihood IS NOT NULL RETURNING utterance.id
2023-06-01 21:12:29.149 CST [22247] LOG:  duration: 5556.851 ms  statement: UPDATE utterance SET alignment_log_likelihood=NULL
lifeiteng commented 1 year ago

Some of the Kaldi jobs failed:

Jun  1 23:10:36 kernel: [ 3284.095243] compile-train-g[10881]: segfault at 5567b3ba2000 ip 00007fce2a270ffc sp 00007fff078fc630 error 4 in libkaldi-decoder.so[7fce2a248000+13f000]
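
A note on reading the kernel line above: the `ip` value is the faulting instruction address, and the bracketed pair after the library name is its load base and mapped size, so subtracting the base from `ip` gives an offset inside `libkaldi-decoder.so`. The sketch below does only that arithmetic. `fault_offset` is a hypothetical helper written for illustration, not part of MFA or Kaldi.

```python
import re

# Kernel segfault line copied from the report above (timestamp prefix dropped).
LINE = (
    "compile-train-g[10881]: segfault at 5567b3ba2000 ip 00007fce2a270ffc "
    "sp 00007fff078fc630 error 4 in libkaldi-decoder.so[7fce2a248000+13f000]"
)

# Matches: ip <hex> ... in <library>[<base hex>+<size hex>]
PATTERN = re.compile(
    r"ip ([0-9a-f]+) .* in ([^\[\s]+)\[([0-9a-f]+)\+([0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (library name, offset of the faulting instruction inside it).

    Hypothetical helper for illustration only.
    """
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a kernel segfault line")
    ip = int(match.group(1), 16)
    library = match.group(2)
    base = int(match.group(3), 16)
    size = int(match.group(4), 16)
    offset = ip - base
    # Sanity check: the instruction pointer must fall inside the mapping.
    if not 0 <= offset < size:
        raise ValueError("instruction pointer is outside the mapped library")
    return library, offset


library, offset = fault_offset(LINE)
print(library, hex(offset))  # libkaldi-decoder.so 0x28ffc
```

If the library was built with debug symbols, an offset like this can usually be passed to `addr2line -e libkaldi-decoder.so 0x28ffc` to find the source location. Whether this particular build has symbols is not known from the thread.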
mmcauliffe commented 1 year ago

Which version of Kaldi is installed in the conda environment? We updated it to 5.5.1068 a few weeks ago. If you're running that version, could you try installing 5.5.1016 to see whether that fixes it? (Or, if you're still on 5.5.1016, try upgrading to the newest version.)

lifeiteng commented 1 year ago

@mmcauliffe The Kaldi version is 5.5.1074, commit id 71f38e62cad01c3078555bfe78d0f3a527422d75.

mmcauliffe commented 1 year ago

Oh, did you compile it from source? In that case, the error's from changes since 1068 or in how it was compiled, so I would focus on fixing that or install the conda version.

lifeiteng commented 1 year ago

> Oh, did you compile it from source? In that case, the error's from changes since 1068 or in how it was compiled, so I would focus on fixing that or install the conda version.

I tried the Docker version instead, but it produces new errors:

Job 8 encountered an error:
Traceback (most recent call last):
  File "/env/lib/python3.10/site-packages/montreal_forced_aligner/abc.py", line 89, in run
    yield from self._run()
  File "/env/lib/python3.10/site-packages/montreal_forced_aligner/corpus/features.py", line 705, in _run
    self.check_call(copy_proc)
  File "/env/lib/python3.10/site-packages/montreal_forced_aligner/abc.py", line 116, in check_call
    raise KaldiProcessingError([self.log_path])
montreal_forced_aligner.exceptions.KaldiProcessingError: KaldiProcessingError:

There were 1 job(s) with errors when running Kaldi binaries.
See the log files below for more information.
/home/feiteng/Documents/MFA/mls-english_train1_splits00/mls-english_train1_splits00/split8/log/generate_final_features.8.log

$ cat /home/feiteng/Documents/MFA/mls-english_train1_splits00/mls-english_train1_splits00/split8/log/generate_final_features.8.log
/env/bin/apply-cmvn --utt2spk=ark:/home/feiteng/Documents/MFA/mls-english_train1_splits00/mls-english_train1_splits00/split8/utt2spk.8.scp scp:/home/feiteng/Documents/MFA/mls-english_train1_splits00/mls-english_train1_splits00/split8/cmvn.8.scp scp:/home/feiteng/Documents/MFA/mls-english_train1_splits00/mls-english_train1_splits00/split8/feats.8.scp ark:-
/env/bin/copy-feats --compress=true ark:- ark,scp:/home/feiteng/Documents/MFA/mls-english_train1_splits00/mls-english_train1_splits00/split8/final_features.8.ark,/home/feiteng/Documents/MFA/mls-english_train1_splits00/mls-english_train1_splits00/split8/final_features.8.scp
ERROR (apply-cmvn[5.5.1016]:HasKey():util/kaldi-table-inl.h:2639) Attempting to read key 101-68298, which is not present in utt2spk map or similar map being read from ark:/home/feiteng/Documents/MFA/mls-english_train1_splits00/mls-english_train1_splits00/split8/utt2spk.8.scp
kaldi::KaldiFatalError
LOG (copy-feats[5.5.1016]:main():featbin/copy-feats.cc:143) Copied 0 feature matrices.

split8/utt2spk.8.scp is missing data: it has no entry for the key 101-68298 that apply-cmvn tries to read.
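
To see how many keys are affected, one option is to compare the utterance keys in `feats.8.scp` against those in `utt2spk.8.scp`. The sketch below assumes both files are plain text with the utterance key as the first whitespace-separated field on each line, which is the usual Kaldi layout. `read_keys` and `missing_from_utt2spk` are hypothetical helpers, not MFA functions, and the demo data is made up apart from the key taken from the log.

```python
import tempfile
from pathlib import Path


def read_keys(path):
    """Collect the first whitespace-separated field (the key) of each line."""
    keys = set()
    with open(path, encoding="utf8") as handle:
        for line in handle:
            fields = line.split(maxsplit=1)
            if fields:  # skip blank lines
                keys.add(fields[0])
    return keys


def missing_from_utt2spk(feats_scp, utt2spk):
    """Return the sorted keys that have features but no utt2spk entry."""
    return sorted(read_keys(feats_scp) - read_keys(utt2spk))


if __name__ == "__main__":
    # Toy files standing in for split8/feats.8.scp and split8/utt2spk.8.scp.
    with tempfile.TemporaryDirectory() as tmp:
        feats = Path(tmp) / "feats.8.scp"
        utt2spk = Path(tmp) / "utt2spk.8.scp"
        feats.write_text(
            "101-68298 feats.ark:10\n101-68299 feats.ark:900\n", encoding="utf8"
        )
        utt2spk.write_text("101-68299 101\n", encoding="utf8")
        print(missing_from_utt2spk(feats, utt2spk))  # ['101-68298']
```

An empty result would suggest the two files agree and the problem lies elsewhere. A long list would point at the split files having been written incompletely.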

lifeiteng commented 1 year ago

> Oh, did you compile it from source? In that case, the error's from changes since 1068 or in how it was compiled, so I would focus on fixing that or install the conda version.

I tried version 5.5.1068, and it still fails.

lifeiteng commented 1 year ago

It turned out that there is a problem with the machine itself.