alumae / kaldi-offline-transcriber

Offline transcription system for Estonian using Kaldi

failed to transcribe an example from readme #26

Closed: tigran10 closed this 2 years ago

tigran10 commented 2 years ago

Hi,

I'm trying to run a quick test of the README and am hitting an error with the example there:

mkdir -p ~/tmp/speechfiles

wget http://media.kuku.ee/intervjuu/intervjuu2018080910.mp3

docker run --name speech2text -v ~/tmp/speechfiles:/opt/speechfiles --rm -d -t alumae/kaldi-offline-transcriber-et

docker exec -it speech2text /opt/kaldi-offline-transcriber/speech2text.sh --trs /opt/speechfiles/intervjuu2018080910.trs /opt/speechfiles/intervjuu2018080910.mp3

This is the error I am getting:

steps/nnet3/decode.sh --num-threads 1 --acwt 1.0  --post-decode-acwt 10.0 \
    --skip-scoring true --cmd "$decode_cmd" --nj 1 \
    --online-ivector-dir build/trans/intervjuu2018080910/ivectors \
    --skip-diagnostics true \
      build/fst/cnn_tdnn_1d_online/graph_prunedlm_unk build/trans/intervjuu2018080910 `dirname build/trans/intervjuu2018080910/cnn_tdnn_1d_online_pruned_unk/decode/log` || exit 1;
steps/nnet3/decode.sh --num-threads 1 --acwt 1.0 --post-decode-acwt 10.0 --skip-scoring true --cmd run.pl --nj 1 --online-ivector-dir build/trans/intervjuu2018080910/ivectors --skip-diagnostics true build/fst/cnn_tdnn_1d_online/graph_prunedlm_unk build/trans/intervjuu2018080910 build/trans/intervjuu2018080910/cnn_tdnn_1d_online_pruned_unk/decode
steps/nnet2/check_ivectors_compatible.sh: WARNING: One of the directories do not contain iVector ID.
steps/nnet2/check_ivectors_compatible.sh: WARNING: That means it's you who's reponsible for keeping
steps/nnet2/check_ivectors_compatible.sh: WARNING: the directories compatible
steps/nnet3/decode.sh: feature type is raw
bash: line 1:  1696 Killed                  ( nnet3-latgen-faster --online-ivectors=scp:build/trans/intervjuu2018080910/ivectors/ivector_online.scp --online-ivector-period=10 --frame-subsampling-factor=3 --frames-per-chunk=50 --extra-left-context=0 --extra-right-context=0 --extra-left-context-initial=-1 --extra-right-context-final=-1 --minimize=false --max-active=7000 --min-active=200 --beam=15.0 --lattice-beam=8.0 --acoustic-scale=1.0 --allow-partial=true --word-symbol-table=build/fst/cnn_tdnn_1d_online/graph_prunedlm_unk/words.txt build/trans/intervjuu2018080910/cnn_tdnn_1d_online_pruned_unk/final.mdl build/fst/cnn_tdnn_1d_online/graph_prunedlm_unk/HCLG.fst "ark,s,cs:apply-cmvn --config=conf/online_cmvn.conf --utt2spk=ark:build/trans/intervjuu2018080910/split1/1/utt2spk scp:build/trans/intervjuu2018080910/split1/1/cmvn.scp scp:build/trans/intervjuu2018080910/split1/1/feats.scp ark:- |" "ark:|lattice-scale --acoustic-scale=10.0 ark:- ark:- | gzip -c >build/trans/intervjuu2018080910/cnn_tdnn_1d_online_pruned_unk/decode/lat.1.gz" ) 2>> build/trans/intervjuu2018080910/cnn_tdnn_1d_online_pruned_unk/decode/log/decode.1.log >> build/trans/intervjuu2018080910/cnn_tdnn_1d_online_pruned_unk/decode/log/decode.1.log
run.pl: job failed, log is in build/trans/intervjuu2018080910/cnn_tdnn_1d_online_pruned_unk/decode/log/decode.1.log
Makefile:304: recipe for target 'build/trans/intervjuu2018080910/cnn_tdnn_1d_online_pruned_unk/decode/log' failed
make: *** [build/trans/intervjuu2018080910/cnn_tdnn_1d_online_pruned_unk/decode/log] Error 1
Finished transcribing, result is in files /opt/kaldi-offline-transcriber/build/output/intervjuu2018080910.{txt,json,trs,ctm,srt,with-compounds.ctm}
cp: cannot stat '/opt/kaldi-offline-transcriber/build/output/intervjuu2018080910.trs': No such file or directory

I also ran the speech2text.sh script without arguments inside the container to make sure I am not missing any required args, but the usage looks fine as well:

Usage: speech2text [options] <audiofile>
Options:
  --nthreads <n>                   # Use <n> threads in parallel for decoding
  --txt <txt-file>                 # Put the result in a simple text file
  --json <json-file>               # Put the result in JSON file
  --trs <trs-file>                 # Put the result in trs file (XML file for Transcriber)
  --ctm <ctm-file>                 # Put the result in CTM file (one line pwer word with timing information)
  --srt <srt-file>                 # Put the result in SRT file (subtitles for e.g. VLC)
  --with-compounds-ctm <ctm-file>  # Put the result in CTM file (with compound break symbols)
  --clean (true|false)  # Delete intermediate files generated during decoding (true by default)

Running the command below inside the container hits the same error:

 ./speech2text.sh --txt intervjuu2018080910.trs  /opt/speechfiles/intervjuu2018080910.mp3

alumae commented 2 years ago

How much RAM do you have?

tigran10 commented 2 years ago

I think resources are fine: Docker is allocated 8 CPUs and 8 GB RAM with 1 GB swap. It must be something specific to Docker for Mac, which I know sounds unlikely; however, exactly the same test passed on Linux just now.
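
For anyone hitting the same `Killed` line in the decode log: a process that is killed without any other message has received SIGKILL, and one common (though not the only) cause is the kernel's out-of-memory killer. A quick sanity check is to see how much memory the container environment actually reports. The sketch below is only an illustration: `mem_total_gib` is a hypothetical helper, not part of kaldi-offline-transcriber, and it assumes a Linux environment where `/proc/meminfo` is readable.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kaldi-offline-transcriber).
# Prints the MemTotal value from a meminfo-style file, converted from kB to GiB.
# Assumption: the file follows the Linux /proc/meminfo layout ("MemTotal: <n> kB").
mem_total_gib() {
    local meminfo="${1:-/proc/meminfo}"
    # 1 GiB = 1048576 kB; LC_ALL=C keeps "." as the decimal separator.
    LC_ALL=C awk '/^MemTotal:/ { printf "%.1f\n", $2 / 1048576 }' "$meminfo"
}

# Report the value only where /proc/meminfo exists (i.e. on Linux).
# Note: inside a container this usually reflects the host or the Docker VM,
# not a per-container memory limit.
if [ -r /proc/meminfo ]; then
    echo "Total memory visible here: $(mem_total_gib) GiB"
fi
```

One way to use it is to open a shell in the running container with `docker exec -it speech2text bash`, paste the function, and compare the reported figure with what Docker is configured to allocate. If the number is much lower than expected, memory becomes a more likely suspect for the killed decoder.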