Closed liumengzhu closed 8 years ago
I cannot distribute the LDC datasets. You have to obtain the datasets on your side
Thanks anyway!
I have a question: space-char is a parameter of utils/ctc_compile_dict_token.sh. What does space-char mean? Is it the " " in my text? Do I need to change " " to something else?
Yes, space is simply " " in your transcripts.
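For anyone else confused by this: a minimal sketch (not the actual Eesen script) of what character-level label preparation typically does with " " — words are split into characters and the literal space between words becomes an explicit space token. The token name "<space>" here is made up for illustration; use whatever your dict/token setup defines.

```shell
# Hypothetical illustration: split a transcript into characters and map the
# literal " " between words to a "<space>" token, the usual convention for
# character-level CTC labels.
echo "HI YOU" \
  | sed 's/ / <space> /g' \
  | sed 's/\([A-Z]\)/\1 /g' \
  | tr -s ' ' \
  | sed 's/ *$//'
# → H I <space> Y O U
```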
Thanks!
This is my tr.iter1.log. The TokenAcc is below 0, and the gradient stats are all zero with NaN skewness/kurtosis, e.g. wei_gifo_x_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan ). Is this right?
train-ctc-parallel --report-step=1000 --num-sequence=10 --frame-limit=1000000 --learn-rate=0.00004 --momentum=0.9 --verbose=1 'ark,s,cs:copy-feats scp:exp/train_char_l2_c200/train_local.scp ark:- | add-deltas ark:- ark:- |' 'ark:gunzip -c exp/train_char_l2_c200/labels.tr.gz|' exp/train_char_l2_c200/nnet/nnet.iter0 exp/train_char_l2_c200/nnet/nnet.iter1
copy-feats scp:exp/train_char_l2_c200/train_local.scp ark:-
add-deltas ark:- ark:-
LOG (train-ctc-parallel:main():train-ctc-parallel.cc:112) TRAINING STARTED
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 1010 sequences (1.99515Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -373.992%
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 2020 sequences (4.29921Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -379.689%
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 3030 sequences (6.78592Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -392.214%
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 4040 sequences (9.43905Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -395.523%
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 5050 sequences (12.2743Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -410.99%
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 6060 sequences (15.3838Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -416.06%
LOG (copy-feats:main():copy-feats.cc:100) Copied 6292 feature matrices.
LOG (train-ctc-parallel:main():train-ctc-parallel.cc:197) ### Gradient stats :
Layer 1 :
wei_gifo_x_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
wei_gifo_m_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
bias_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_i_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_f_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_o_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
wei_gifo_x_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
wei_gifo_m_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
bias_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_i_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_f_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_o_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
Layer 2 :
wei_gifo_x_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
wei_gifo_m_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
bias_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_i_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_f_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_o_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
wei_gifo_x_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
wei_gifo_m_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
bias_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_i_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_f_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
phole_o_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
Layer 3 :
LOG (train-ctc-parallel:main():train-ctc-parallel.cc:204) Done 6292 files, 0 with no targets, 0 with other errors. [TRAINING, 212.195 min, fps458.889]
LOG (train-ctc-parallel:main():train-ctc-parallel.cc:210) TOKEN_ACCURACY >> -397.032% <<
Help! How do I fix these syntax errors?

[NOTE] TOKEN_ACCURACY refers to token accuracy, i.e., (1.0 - token_error_rate).
EPOCH 1 RUNNING ... ENDS [2016-Apr-18 18:25:09]: lrate 4e-05, TRAIN ACCURACY -397.0320%, VALID ACCURACY -391.8390%
EPOCH 2 RUNNING ... ENDS [2016-Apr-18 20:48:13]: lrate 4e-05, TRAIN ACCURACY -397.0320%, VALID ACCURACY -391.8390%
(standard_in) 1: syntax error
(standard_in) 1: syntax error
steps/train_ctc_parallel.sh: line 162: [: too many arguments
(standard_in) 1: syntax error
steps/train_ctc_parallel.sh: line 174: [: 1: unary operator expected
EPOCH 3 RUNNING ...
Your training seems to be broken. There could be many reasons for this, most often mistakes in data preparation. Are you running one of the Eesen recipes, or running it on your own data?
I run it on my own data.
I guess you are using a CPU? Eesen does NOT support CPU-based training. For Eesen to work, you have to switch to a GPU.
So I should recompile Eesen with GPU support?
Yes.
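In case it helps later readers: with Eesen's Kaldi-style build, recompiling for GPU usually means pointing configure at your CUDA toolkit. The flag names below are assumed from the Kaldi-style build system and the CUDA path is a typical default — verify both with ./configure --help on your checkout.

```shell
# Sanity-check the CUDA setup first (requires NVIDIA driver and toolkit):
nvidia-smi               # should list at least one GPU
nvcc --version           # CUDA compiler must be on PATH

# Then reconfigure and rebuild; flag names assumed from the Kaldi-style
# build system -- verify with ./configure --help.
cd src
./configure --use-cuda=yes --cudatk-dir=/usr/local/cuda
make -j 4
```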
OK! Thanks, you helped me a lot!
Hello, I'm currently working through the HKUST scripts in the eesen directory and have run into problems with Chinese corpus data preparation. Having read this issue, I believe you have solved it. Could you help me? Suppose a directory contains two wav files, 1.wav and 2.wav. The transcript of 1.wav is "我为我是中国人而感到骄傲" ("I am proud to be Chinese"), and the transcript of 2.wav is "你好，我们交个朋友吧" ("Hello, let's be friends"). How should I process the files in this directory? In the data preparation stage, what formats should the audio and text files have? Many thanks @yajiemiao @liumengzhu
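Not an authoritative answer, but Eesen inherits Kaldi's data-directory convention: a wav.scp mapping utterance ids to audio paths, a text file mapping utterance ids to transcripts, and an utt2spk file mapping utterances to speakers. A minimal hand-written sketch for the two-file example above (the utterance ids, speaker ids, and paths are made up for illustration):

```shell
# Kaldi/Eesen-style data directory for two utterances.
data=data/my_train
mkdir -p $data

# wav.scp: <utterance-id> <path-to-audio>
cat > $data/wav.scp <<EOF
utt1 /path/to/corpus/1.wav
utt2 /path/to/corpus/2.wav
EOF

# text: <utterance-id> <transcript>
cat > $data/text <<EOF
utt1 我为我是中国人而感到骄傲
utt2 你好，我们交个朋友吧
EOF

# utt2spk: <utterance-id> <speaker-id>; with unknown speakers, a common
# fallback is one speaker per utterance.
cat > $data/utt2spk <<EOF
utt1 utt1
utt2 utt2
EOF
```

Files must be sorted by utterance id; Kaldi-style utilities typically validate and fix such a directory before feature extraction.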
Hi! I'm a student in China and not a member of the LDC, so I don't know the format of the text. Can you provide me with an example?