Closed sih4sing5hong5 closed 6 years ago
Got it, I'll pick this up right away.
sw02001-A_000000-005644 pe5: test-decoding the audio file 230542167L outputs only a single word. Audio: https://www.dropbox.com/s/oblr24g82n7qlel/230542167L.wav?dl=0
root@2ee32b63c9c8:/usr/local/kaldi/egs/formosa/s5/exp/chain/tdnn_1a_sp/decode_tshi/scoring# cat 7.0.0.txt
sw02001-A_000000-005644 pe5
root@2ee32b63c9c8:/usr/local/kaldi/egs/formosa/s5# bash -x decode_nnet3.sh exp/chain/tdnn_1a_sp/graph_test/ exp/chain/tdnn_1a_sp/lang-3grams/ tshi3/train_free exp/chain/tdnn_1a_sp/decode_tshi
+ . cmd.sh
++ export train_cmd=run.pl
++ train_cmd=run.pl
++ export 'decode_cmd=run.pl --mem 4G'
++ decode_cmd='run.pl --mem 4G'
++ export 'mkgraph_cmd=run.pl --mem 8G'
++ mkgraph_cmd='run.pl --mem 8G'
++ export 'cuda_cmd=run.pl --gpu 1'
++ cuda_cmd='run.pl --gpu 1'
+ . path.sh
+++ pwd
++ export KALDI_ROOT=/usr/local/kaldi/egs/formosa/s5/../../..
++ KALDI_ROOT=/usr/local/kaldi/egs/formosa/s5/../../..
++ '[' -f /usr/local/kaldi/egs/formosa/s5/../../../tools/env.sh ']'
++ export PATH=/usr/local/kaldi/egs/formosa/s5/utils/:/usr/local/kaldi/egs/formosa/s5/../../../tools/openfst/bin:/usr/local/kaldi/egs/formosa/s5:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
++ PATH=/usr/local/kaldi/egs/formosa/s5/utils/:/usr/local/kaldi/egs/formosa/s5/../../../tools/openfst/bin:/usr/local/kaldi/egs/formosa/s5:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
++ '[' '!' -f /usr/local/kaldi/egs/formosa/s5/../../../tools/config/common_path.sh ']'
++ . /usr/local/kaldi/egs/formosa/s5/../../../tools/config/common_path.sh
+++ '[' -z /usr/local/kaldi/egs/formosa/s5/../../.. ']'
+++ export PATH=/usr/local/kaldi/egs/formosa/s5/../../../src/bin:/usr/local/kaldi/egs/formosa/s5/../../../src/chainbin:/usr/local/kaldi/egs/formosa/s5/../../../src/featbin:/usr/local/kaldi/egs/formosa/s5/../../../src/fgmmbin:/usr/local/kaldi/egs/formosa/s5/../../../src/fstbin:/usr/local/kaldi/egs/formosa/s5/../../../src/gmmbin:/usr/local/kaldi/egs/formosa/s5/../../../src/ivectorbin:/usr/local/kaldi/egs/formosa/s5/../../../src/kwsbin:/usr/local/kaldi/egs/formosa/s5/../../../src/latbin:/usr/local/kaldi/egs/formosa/s5/../../../src/lmbin:/usr/local/kaldi/egs/formosa/s5/../../../src/nnet2bin:/usr/local/kaldi/egs/formosa/s5/../../../src/nnet3bin:/usr/local/kaldi/egs/formosa/s5/../../../src/nnetbin:/usr/local/kaldi/egs/formosa/s5/../../../src/online2bin:/usr/local/kaldi/egs/formosa/s5/../../../src/onlinebin:/usr/local/kaldi/egs/formosa/s5/../../../src/rnnlmbin:/usr/local/kaldi/egs/formosa/s5/../../../src/sgmm2bin:/usr/local/kaldi/egs/formosa/s5/../../../src/sgmmbin:/usr/local/kaldi/egs/formosa/s5/../../../src/tfrnnlmbin:/usr/local/kaldi/egs/formosa/s5/utils/:/usr/local/kaldi/egs/formosa/s5/../../../tools/openfst/bin:/usr/local/kaldi/egs/formosa/s5:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
+++ PATH=/usr/local/kaldi/egs/formosa/s5/../../../src/bin:/usr/local/kaldi/egs/formosa/s5/../../../src/chainbin:/usr/local/kaldi/egs/formosa/s5/../../../src/featbin:/usr/local/kaldi/egs/formosa/s5/../../../src/fgmmbin:/usr/local/kaldi/egs/formosa/s5/../../../src/fstbin:/usr/local/kaldi/egs/formosa/s5/../../../src/gmmbin:/usr/local/kaldi/egs/formosa/s5/../../../src/ivectorbin:/usr/local/kaldi/egs/formosa/s5/../../../src/kwsbin:/usr/local/kaldi/egs/formosa/s5/../../../src/latbin:/usr/local/kaldi/egs/formosa/s5/../../../src/lmbin:/usr/local/kaldi/egs/formosa/s5/../../../src/nnet2bin:/usr/local/kaldi/egs/formosa/s5/../../../src/nnet3bin:/usr/local/kaldi/egs/formosa/s5/../../../src/nnetbin:/usr/local/kaldi/egs/formosa/s5/../../../src/online2bin:/usr/local/kaldi/egs/formosa/s5/../../../src/onlinebin:/usr/local/kaldi/egs/formosa/s5/../../../src/rnnlmbin:/usr/local/kaldi/egs/formosa/s5/../../../src/sgmm2bin:/usr/local/kaldi/egs/formosa/s5/../../../src/sgmmbin:/usr/local/kaldi/egs/formosa/s5/../../../src/tfrnnlmbin:/usr/local/kaldi/egs/formosa/s5/utils/:/usr/local/kaldi/egs/formosa/s5/../../../tools/openfst/bin:/usr/local/kaldi/egs/formosa/s5:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
++ export LC_ALL=C
++ LC_ALL=C
+ set -e
+ nj=1
+ online_ivector_dir=exp/nnet3/ivectors_test
+ tshi3=tshi3/train_free
+ utils/utt2spk_to_spk2utt.pl tshi3/train_free/utt2spk
+ utils/fix_data_dir.sh tshi3/train_free
fix_data_dir.sh: kept all 1 utterances.
fix_data_dir.sh: old files are kept in tshi3/train_free/.backup
+ mfccdir=tshi3/train_free/mfcc
+ make_mfcc_log=tshi3/train_free/make_mfcc/
+ steps/make_mfcc_pitch.sh --nj 1 --mfcc-config conf/mfcc_hires.conf --cmd run.pl tshi3/train_free tshi3/train_free/make_mfcc/ tshi3/train_free/mfcc
steps/make_mfcc_pitch.sh --nj 1 --mfcc-config conf/mfcc_hires.conf --cmd run.pl tshi3/train_free tshi3/train_free/make_mfcc/ tshi3/train_free/mfcc
utils/validate_data_dir.sh: WARNING: you have only one speaker. This probably a bad idea.
Search for the word 'bold' in http://kaldi-asr.org/doc/data_prep.html
for more information.
utils/validate_data_dir.sh: Successfully validated data-directory tshi3/train_free
steps/make_mfcc_pitch.sh [info]: segments file exists: using that.
Succeeded creating MFCC & Pitch features for train_free
+ steps/compute_cmvn_stats.sh tshi3/train_free tshi3/train_free/make_mfcc/ tshi3/train_free/mfcc
steps/compute_cmvn_stats.sh tshi3/train_free tshi3/train_free/make_mfcc/ tshi3/train_free/mfcc
Succeeded creating CMVN stats for train_free
+ utils/data/limit_feature_dim.sh 0:39 tshi3/train_free tshi3/train_free_nopitch
utils/copy_data_dir.sh: copied data from tshi3/train_free to tshi3/train_free_nopitch
utils/validate_data_dir.sh: WARNING: you have only one speaker. This probably a bad idea.
Search for the word 'bold' in http://kaldi-asr.org/doc/data_prep.html
for more information.
utils/validate_data_dir.sh: Successfully validated data-directory tshi3/train_free_nopitch
utils/data/limit_feature_dim.sh: warning: removing tshi3/train_free_nopitch/cmvn.cp, you will have to regenerate it from the features.
utils/validate_data_dir.sh: WARNING: you have only one speaker. This probably a bad idea.
Search for the word 'bold' in http://kaldi-asr.org/doc/data_prep.html
for more information.
utils/validate_data_dir.sh: Successfully validated data-directory tshi3/train_free_nopitch
+ steps/compute_cmvn_stats.sh tshi3/train_free_nopitch tshi3/train_free/make_mfcc/ tshi3/train_free/mfcc
steps/compute_cmvn_stats.sh tshi3/train_free_nopitch tshi3/train_free/make_mfcc/ tshi3/train_free/mfcc
Succeeded creating CMVN stats for train_free_nopitch
+ steps/online/nnet2/extract_ivectors_online.sh --cmd run.pl --nj 1 tshi3/train_free_nopitch exp/nnet3/extractor exp/nnet3/ivectors_test
steps/online/nnet2/extract_ivectors_online.sh --cmd run.pl --nj 1 tshi3/train_free_nopitch exp/nnet3/extractor exp/nnet3/ivectors_test
steps/online/nnet2/extract_ivectors_online.sh: extracting iVectors
steps/online/nnet2/extract_ivectors_online.sh: combining iVectors across jobs
steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/nnet3/ivectors_test using the extractor in exp/nnet3/extractor.
+ graph_dir=exp/chain/tdnn_1a_sp/graph_test/
+ lang_dir=exp/chain/tdnn_1a_sp/lang-3grams/
+ decode_dir=exp/chain/tdnn_1a_sp/decode_tshi
+ mkdir -p exp/chain/tdnn_1a_sp/decode_tshi/scoring/
++ cat exp/nnet3/ivectors_test/ivector_period
+ ivector_period=10
+ ivector_opts='--online-ivectors=scp:exp/nnet3/ivectors_test/ivector_online.scp --online-ivector-period=10'
+ nnet3-latgen-faster --frame-subsampling-factor=3 --frames-per-chunk=51 --extra-left-context=0 --extra-right-context=0 --extra-left-context-initial=-1 --extra-right-context-final=-1 --minimize=false --max-active=7000 --min-active=200 --beam=15.0 --lattice-beam=8.0 --acoustic-scale=1.0 --allow-partial=true --word-symbol-table=exp/chain/tdnn_1a_sp/graph_test//words.txt --online-ivectors=scp:exp/nnet3/ivectors_test/ivector_online.scp --online-ivector-period=10 exp/chain/tdnn_1a_sp/graph_test//../final.mdl exp/chain/tdnn_1a_sp/graph_test//HCLG.fst 'ark,s,cs:apply-cmvn --norm-means=false --norm-vars=false --utt2spk=ark:tshi3/train_free/utt2spk scp:tshi3/train_free/cmvn.scp scp:tshi3/train_free/feats.scp ark:- |' 'ark:|lattice-scale --acoustic-scale=10.0 ark:- ark:- | gzip -c > exp/chain/tdnn_1a_sp/decode_tshi/lat1.1.gz'
+ tee exp/chain/tdnn_1a_sp/decode_tshi/a.log
nnet3-latgen-faster --frame-subsampling-factor=3 --frames-per-chunk=51 --extra-left-context=0 --extra-right-context=0 --extra-left-context-initial=-1 --extra-right-context-final=-1 --minimize=false --max-active=7000 --min-active=200 --beam=15.0 --lattice-beam=8.0 --acoustic-scale=1.0 --allow-partial=true --word-symbol-table=exp/chain/tdnn_1a_sp/graph_test//words.txt --online-ivectors=scp:exp/nnet3/ivectors_test/ivector_online.scp --online-ivector-period=10 exp/chain/tdnn_1a_sp/graph_test//../final.mdl exp/chain/tdnn_1a_sp/graph_test//HCLG.fst 'ark,s,cs:apply-cmvn --norm-means=false --norm-vars=false --utt2spk=ark:tshi3/train_free/utt2spk scp:tshi3/train_free/cmvn.scp scp:tshi3/train_free/feats.scp ark:- |' 'ark:|lattice-scale --acoustic-scale=10.0 ark:- ark:- | gzip -c > exp/chain/tdnn_1a_sp/decode_tshi/lat1.1.gz'
LOG (nnet3-latgen-faster[5.4.276~1403-87d3f]:RemoveOrphanNodes():nnet-nnet.cc:948) Removed 9 orphan nodes.
LOG (nnet3-latgen-faster[5.4.276~1403-87d3f]:RemoveOrphanComponents():nnet-nnet.cc:847) Removing 19 orphan components.
LOG (nnet3-latgen-faster[5.4.276~1403-87d3f]:Collapse():nnet-utils.cc:1336) Added 10 components, removed 19
lattice-scale --acoustic-scale=10.0 ark:- ark:-
apply-cmvn --norm-means=false --norm-vars=false --utt2spk=ark:tshi3/train_free/utt2spk scp:tshi3/train_free/cmvn.scp scp:tshi3/train_free/feats.scp ark:-
LOG (apply-cmvn[5.4.276~1403-87d3f]:main():apply-cmvn.cc:81) Copied 1 utterances.
sw02001-A_000000-005644 pe7
LOG (nnet3-latgen-faster[5.4.276~1403-87d3f]:DecodeUtteranceLatticeFaster():decoder-wrappers.cc:286) Log-like per frame for utterance sw02001-A_000000-005644 is 5.30409 over 1881 frames.
LOG (nnet3-latgen-faster[5.4.276~1403-87d3f]:main():nnet3-latgen-faster.cc:255) Time taken 7.81404s: real-time factor assuming 100 frames/sec is 0.138473
LOG (nnet3-latgen-faster[5.4.276~1403-87d3f]:main():nnet3-latgen-faster.cc:258) Done 1 utterances, failed for 0
LOG (nnet3-latgen-faster[5.4.276~1403-87d3f]:main():nnet3-latgen-faster.cc:260) Overall log-likelihood per frame is 5.30409 over 1881 frames.
LOG (nnet3-latgen-faster[5.4.276~1403-87d3f]:~CachingOptimizingCompiler():nnet-optimize.cc:698) 0.0438 seconds taken in nnet3 compilation total (breakdown: 0.0257 compilation, 0.0126 optimization, 0 shortcut expansion, 0.00291 checking, 1.41e-05 computing indexes, 0.00253 misc.) + 0 I/O.
LOG (lattice-scale[5.4.276~1403-87d3f]:main():lattice-scale.cc:107) Done 1 lattices.
+ lattice-lmrescore --lm-scale=-1.0 'ark:gunzip -c exp/chain/tdnn_1a_sp/decode_tshi/lat1.1.gz|' 'fstproject --project_output=true exp/chain/tdnn_1a_sp/lang-3grams//G.fst |' ark:-
+ lattice-lmrescore-const-arpa --lm-scale=1.0 ark:- exp/chain/tdnn_1a_sp/lang-3grams//G.carpa 'ark,t:|gzip -c> exp/chain/tdnn_1a_sp/decode_tshi/lat3.1.gz'
lattice-lmrescore --lm-scale=-1.0 'ark:gunzip -c exp/chain/tdnn_1a_sp/decode_tshi/lat1.1.gz|' 'fstproject --project_output=true exp/chain/tdnn_1a_sp/lang-3grams//G.fst |' ark:-
lattice-lmrescore-const-arpa --lm-scale=1.0 ark:- exp/chain/tdnn_1a_sp/lang-3grams//G.carpa 'ark,t:|gzip -c> exp/chain/tdnn_1a_sp/decode_tshi/lat3.1.gz'
LOG (lattice-lmrescore[5.4.276~1403-87d3f]:main():lattice-lmrescore.cc:148) Done 1 lattices, failed for 0
LOG (lattice-lmrescore-const-arpa[5.4.276~1403-87d3f]:main():lattice-lmrescore-const-arpa.cc:117) Done 1 lattices, failed for 0
+ lattice-scale --inv-acoustic-scale=13 'ark:gunzip -c exp/chain/tdnn_1a_sp/decode_tshi/lat3.1.gz|' ark:-
+ lattice-add-penalty --word-ins-penalty=0.0 ark:- ark:-
+ lattice-best-path --word-symbol-table=exp/chain/tdnn_1a_sp/graph_test//words.txt ark:- ark,t:-
+ utils/int2sym.pl -f 2- exp/chain/tdnn_1a_sp/graph_test//words.txt
+ tee exp/chain/tdnn_1a_sp/decode_tshi/scoring/7.0.0.txt
lattice-scale --inv-acoustic-scale=13 'ark:gunzip -c exp/chain/tdnn_1a_sp/decode_tshi/lat3.1.gz|' ark:-
lattice-best-path --word-symbol-table=exp/chain/tdnn_1a_sp/graph_test//words.txt ark:- ark,t:-
lattice-add-penalty --word-ins-penalty=0.0 ark:- ark:-
LOG (lattice-scale[5.4.276~1403-87d3f]:main():lattice-scale.cc:107) Done 1 lattices.
LOG (lattice-add-penalty[5.4.276~1403-87d3f]:main():lattice-add-penalty.cc:62) Done adding word insertion penalty to 1 lattices.
LOG (lattice-best-path[5.4.276~1403-87d3f]:main():lattice-best-path.cc:99) For utterance sw02001-A_000000-005644, best cost 1316.16 + -8687.54 = -7371.38 over 1881 frames.
sw02001-A_000000-005644 pe5
LOG (lattice-best-path[5.4.276~1403-87d3f]:main():lattice-best-path.cc:124) Overall cost per frame is -3.91886 = 0.699713 [graph] + -4.61857 [acoustic] over 1881 frames.
LOG (lattice-best-path[5.4.276~1403-87d3f]:main():lattice-best-path.cc:128) Done 1 lattices, failed for 0
sw02001-A_000000-005644 pe5
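The real-time factor that nnet3-latgen-faster reports in the log above can be re-derived from the other figures it prints. This is a minimal sketch, assuming the reported frame count (1881) is counted after the --frame-subsampling-factor=3 reduction. That is an inference from the numbers, not something the log states. The function name rtf is made up for illustration.

```shell
# rtf ELAPSED_SECONDS DECODED_FRAMES SUBSAMPLING_FACTOR [FRAMES_PER_SECOND]
# Real-time factor = decoding time / audio duration, where the audio
# duration is (decoded frames * subsampling factor) / frames per second.
rtf() {
  awk -v t="$1" -v n="$2" -v k="$3" -v fps="${4:-100}" \
    'BEGIN { printf "%.6f\n", t * fps / (n * k) }'
}

# Values copied from the log: 7.81404 s, 1881 frames, factor 3.
rtf 7.81404 1881 3   # prints 0.138473, matching the logged value
```

Under the same assumption, 1881 × 3 / 100 = 56.43 s of audio, which is close to the 56.44 s segment length given in the segments file, so the decoder did process the whole utterance even though only one word came out.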
I'll take a look after I've eaten.
Push the image to dockerhub.iis so that I can look at it.
Could you paste the scripts for every step you ran?
wget…
wav in the docker image:
07d794af944e5bea67511519d1da8c9b tshi3/train_free/230542167L.wav
wav from Dropbox:
60ce992f85246bf690decfd05cb0b236 230542167L.wav
07d794af944e5bea67511519d1da8c9b was recognized correctly before.
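The checksum comparison above can be wrapped in a small helper so that it is easy to repeat for other files. This is a sketch only: it assumes GNU coreutils md5sum is available, and the function name same_audio is hypothetical.

```shell
# same_audio FILE_A FILE_B
# Prints "identical" when both files have the same MD5 digest,
# otherwise "different". Runs in a subshell so its variables stay local.
same_audio() (
  a=$(md5sum < "$1" | cut -d' ' -f1)
  b=$(md5sum < "$2" | cut -d' ' -f1)
  if [ "$a" = "$b" ]; then
    echo identical
  else
    echo different
  fi
)
```

For example, `same_audio tshi3/train_free/230542167L.wav 230542167L.wav`. Differing digests only show that the two files are not byte-identical; they do not say whether the audio samples themselves differ.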
cd /usr/local/kaldi/egs/formosa/s5
mkdir tshi3/train_free exp/chain/tdnn_1a_sp/decode_tshi
cp <prepared Kaldi data> tshi3/train_free
bash -x decode_nnet3.sh exp/chain/tdnn_1a_sp/graph_test/ exp/chain/tdnn_1a_sp/lang-3grams/ tshi3/train_free exp/chain/tdnn_1a_sp/decode_tshi
The prepared Kaldi data:
# reco2file_and_channel
sw02001-A sw02001 A
# segments
sw02001-A_000000-005644 sw02001-A 0.000 56.440000
# text
sw02001-A_000000-005644
# utt2spk
sw02001-A_000000-005644 2001-A
# wav.scp
sw02001-A sox -G /usr/local/kaldi/egs/formosa/s5/tshi3/train_free/230542167L.wav -b 16 -c 1 -r 8k -t wav - |
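The files listed above refer to each other by ID: segments maps each utterance to a recording, utt2spk maps each utterance to a speaker, and wav.scp maps each recording to a command that produces audio. A simplified cross-check of those references might look like the sketch below. It is not a replacement for Kaldi's utils/validate_data_dir.sh (which the log shows passing), and the function name check_data_dir is invented for this example.

```shell
# check_data_dir DIR
# Prints one line per ID that segments refers to but that is missing
# from utt2spk or wav.scp. Prints nothing when the references line up.
check_data_dir() (
  dir=$1
  # Utterance IDs (1st field of segments) with no entry in utt2spk.
  awk 'FNR == NR { known[$1] = 1; next }
       !($1 in known) { print "utterance missing from utt2spk: " $1 }' \
      "$dir/utt2spk" "$dir/segments"
  # Recording IDs (2nd field of segments) with no entry in wav.scp.
  awk 'FNR == NR { known[$1] = 1; next }
       !($2 in known) { print "recording missing from wav.scp: " $2 }' \
      "$dir/wav.scp" "$dir/segments"
)
```

Running `check_data_dir tshi3/train_free` on the data shown here should print nothing, since sw02001-A_000000-005644 appears in utt2spk and sw02001-A appears in wav.scp. The sketch assumes none of the files is empty.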
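Try running a file from tw01test and see what the result is.
The result is normal! I'll compare the two audio files to see what is different. Thanks for pointing that out, boss.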
➜ 000x sox --i 000.wav
Input File : '000.wav'
Channels : 1
Sample Rate : 16000
Precision : 16-bit
Duration : 00:00:01.10 = 17664 samples ~ 82.8 CDDA sectors
File Size : 35.4k
Bit Rate : 256k
Sample Encoding: 16-bit Signed Integer PCM
➜ 000x sox --i 230542167L.wav
Input File : '230542167L.wav'
Channels : 1
Sample Rate : 8000
Precision : 13-bit
Duration : 00:00:56.44 = 451520 samples ~ 4233 CDDA sectors
File Size : 452k
Bit Rate : 64.0k
Sample Encoding: 8-bit A-law
The original file is 8-bit A-law!! In that case I guess wav.scp needs -b 8:
sw02001-A sox -G /usr/local/kaldi/egs/taiwanese/s5c/exp/tri4/try/230542167L.wav -b 8 -c 1 -r 8K -t wav - |
sw02001-A sox -G PATH/230542167L.wav -b 8 -c 1 -r 8K -t wav - |
That doesn't work either. Boss, please advise.
Keep the original wav.scp, with -b 16.
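The two properties that differ between the files in the sox --i output above (encoding and sample rate) can also be read straight from the WAV header, which is handy when checking many files. This is a rough sketch, assuming a canonical RIFF/WAVE layout in which the fmt chunk starts at byte 12; files with extra chunks before fmt would need a real parser. The function name wav_header is hypothetical. In the WAVE format, format tag 1 means PCM and 6 means A-law.

```shell
# wav_header FILE
# Prints "format_tag channels sample_rate" from a canonical WAV header.
wav_header() (
  # Bytes 20-27: format tag (2 bytes), channels (2), sample rate (4),
  # all little-endian. od prints them as eight unsigned decimal bytes.
  set -- $(od -An -tu1 -j20 -N8 "$1")
  tag=$(( $1 + $2 * 256 ))
  channels=$(( $3 + $4 * 256 ))
  rate=$(( $5 + $6 * 256 + $7 * 65536 + $8 * 16777216 ))
  echo "$tag $channels $rate"
)
```

For an 8 kHz mono A-law file such as the one sox describes here, this would print `6 1 8000`; for a 16 kHz mono PCM file it would print `1 1 16000`. It only reports what the header says and does not by itself explain the pe5 result.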
On Thursday, 11 October 2018 at 6:36 PM, Yang-Hsiang Chang notifications@github.com wrote:
sw02001-A sox -G PATH/230542167L.wav -b 8 -c 1 -r 8K -t wav - | That doesn't work either. Boss, please advise.
-- Sîng-hông
docker exec -it ad45474c0ec6 bash
sw02001-A sox -G /usr/local/kaldi/egs/formosa/s5/tshi3/train_free/230542167L.wav -b 16 -c 1 -r 8k -t wav - |
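The result is still pe5.
I'd appreciate your guidance, boss.
In the meantime I can start on the integration and the conversion to Han characters.
Notes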
time docker run -ti --rm dockerhub.iis.sinica.edu.tw/dnn-test:93 bash
mv tshi3/train/text tshi3/train/text.ku
tail -n 1 tshi3/train/text.ku > tshi3/train/text
utils/fix_data_dir.sh tshi3/train/
mv *nnet3.sh nnet3.sh
sed "s/nj\=[0-9]\+/nj\=1/g" -i nnet3.sh
time bash -x nnet3.sh hethong/lang tshi3/train
wget https://github.com/sih4sing5hong5/kaldi/raw/taiwanese/egs/taiwanese/s5c/%E6%9C%8D%E5%8B%99%E4%BE%86%E8%A9%A6nnet3.sh -O decode.sh
time bash -x decode.sh exp/chain/tdnn_1a_sp/graph/ hethong/lang-3grams/ tshi3/train exp/chain/tdnn_1a_sp/decode_tshi
dockerhub.iis.sinica.edu.tw/hethong:203-8k
Try it out first. For the commands, see https://github.com/twgo/pian7sik4_he7thong2/blob/master/Dockerfile#L20