yajiemiao / eesen

The official repository of the Eesen project
Apache License 2.0

LDC #10

Closed: liumengzhu closed this issue 8 years ago

liumengzhu commented 8 years ago

Hi! I'm a student in China and I'm not a member of the LDC, so I don't know the format of the text. Can you provide me with an example?

yajiemiao commented 8 years ago

I cannot distribute the LDC datasets. You have to obtain them on your own.

liumengzhu commented 8 years ago

Thanks anyway!

liumengzhu commented 8 years ago

I have a question: space-char is a parameter of utils/ctc_compile_dict_token.sh, so what does space-char mean? Is it the " " in my text? Do I need to change " " to something else?

yajiemiao commented 8 years ago

Yes, space is simply " " in your transcripts.
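
In case it helps others with the same question, below is only an illustrative sketch of turning a word-level transcript into the space-separated character sequence that a character-based setup expects, keeping the inter-word space as its own unit. It is not an Eesen script: the <SPACE> label name, the file paths, and the reliance on GNU awk's empty-separator split are assumptions for the example, and the unit name should match whatever you pass as the space character to utils/ctc_compile_dict_token.sh.

# Hypothetical helper (not part of Eesen): split each word of a transcript
# into characters and mark inter-word spaces with a <SPACE> token.
# Requires GNU awk; split($w, chars, "") splits a field into single characters.
awk '{
  printf "%s", $1;                    # keep the utterance id
  for (w = 2; w <= NF; w++) {
    if (w > 2) printf " <SPACE>";     # one token per inter-word space
    n = split($w, chars, "");
    for (c = 1; c <= n; c++) printf " %s", chars[c];
  }
  printf "\n";
}' data/train/text > data/train/text.char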

liumengzhu commented 8 years ago

Thanks! This is my tr.iter1.log. The TokenAcc is < 0, and wei_gifo_x_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan ) is NaN. Is that right?

train-ctc-parallel --report-step=1000 --num-sequence=10 --frame-limit=1000000 --learn-rate=0.00004 --momentum=0.9 --verbose=1 'ark,s,cs:copy-feats scp:exp/train_char_l2_c200/train_local.scp ark:- | add-deltas ark:- ark:- |' 'ark:gunzip -c exp/train_char_l2_c200/labels.tr.gz|' exp/train_char_l2_c200/nnet/nnet.iter0 exp/train_char_l2_c200/nnet/nnet.iter1
copy-feats scp:exp/train_char_l2_c200/train_local.scp ark:-
add-deltas ark:- ark:-
LOG (train-ctc-parallel:main():train-ctc-parallel.cc:112) TRAINING STARTED
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 1010 sequences (1.99515Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -373.992%
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 2020 sequences (4.29921Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -379.689%
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 3030 sequences (6.78592Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -392.214%
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 4040 sequences (9.43905Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -395.523%
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 5050 sequences (12.2743Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -410.99%
VLOG[1] (train-ctc-parallel:EvalParallel():ctc-loss.cc:182) After 6060 sequences (15.3838Hr): Obj(log[Pzx]) = -1e+30 TokenAcc = -416.06%
LOG (copy-feats:main():copy-feats.cc:100) Copied 6292 feature matrices.
LOG (train-ctc-parallel:main():train-ctc-parallel.cc:197) ### Gradient stats :
Layer 1 :
  wei_gifo_x_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  wei_gifo_m_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  bias_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_i_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_f_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_o_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  wei_gifo_x_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  wei_gifo_m_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  bias_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_i_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_f_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_o_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
Layer 2 :
  wei_gifo_x_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  wei_gifo_m_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  bias_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_i_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_f_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_o_c_fwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  wei_gifo_x_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  wei_gifo_m_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  bias_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_i_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_f_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  phole_o_c_bwcorr ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
Layer 3 :
  linearity_grad ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
  bias_grad ( min 0, max 0, mean 0, variance 0, skewness -nan, kurtosis -nan )
Layer 4 :

LOG (train-ctc-parallel:main():train-ctc-parallel.cc:204) Done 6292 files, 0 with no targets, 0 with other errors. [TRAINING, 212.195 min, fps458.889]
LOG (train-ctc-parallel:main():train-ctc-parallel.cc:210) TOKEN_ACCURACY >> -397.032% <<

liumengzhu commented 8 years ago

Help! What should I do about these syntax errors?

[NOTE] TOKEN_ACCURACY refers to token accuracy, i.e., (1.0 - token_error_rate).
EPOCH 1 RUNNING ... ENDS [2016-Apr-18 18:25:09]: lrate 4e-05, TRAIN ACCURACY -397.0320%, VALID ACCURACY -391.8390%
EPOCH 2 RUNNING ... ENDS [2016-Apr-18 20:48:13]: lrate 4e-05, TRAIN ACCURACY -397.0320%, VALID ACCURACY -391.8390%
(standard_in) 1: syntax error
(standard_in) 1: syntax error
steps/train_ctc_parallel.sh: line 162: [: too many arguments
(standard_in) 1: syntax error
steps/train_ctc_parallel.sh: line 174: [: 1: unary operator expected
EPOCH 3 RUNNING ...
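
As a side note on those messages (the actual contents of lines 162 and 174 of steps/train_ctc_parallel.sh may differ; the lines below only reproduce the error messages and are not Eesen code): "(standard_in) 1: syntax error" is what GNU bc prints when it is handed an expression it cannot parse, and the [ errors come from a variable that expands to nothing or to several words inside an unquoted test.

# Illustrative reproductions only, assuming a bash-like shell and GNU bc:
echo "-1e+30 > 0.5" | bc       # (standard_in) 1: syntax error  -- bc has no exponent notation
echo "-397.0320% > 0.5" | bc   # (standard_in) 1: syntax error  -- stray % sign
val=""; [ 1 -gt $val ]         # [: 1: unary operator expected  -- empty unquoted variable
val="a b"; [ $val = x ]        # [: too many arguments          -- variable expands to two words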

yajiemiao commented 8 years ago

Your training seems to be broken. There could be many reasons for this, most often mistakes in data preparation. Are you running one of the Eesen recipes, or running it on your own data?
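
One data-preparation mistake that commonly breaks CTC training is having utterances with more label tokens than feature frames, which makes the CTC alignment impossible for those sequences. The check below is only a sketch under a few assumptions: that a Kaldi-style feat-to-len binary is available in your Eesen build, that labels.tr.gz holds one line per utterance in "utt-id lab1 lab2 ..." form, and that the experiment directory matches the one in the log above.

# Sketch: flag utterances whose label sequence is longer than their frame count.
# feat-to-len, the paths, and the labels format are assumptions; adjust to your setup.
dir=exp/train_char_l2_c200
feat-to-len scp:$dir/train_local.scp ark,t:- | sort -k1,1 > frame_len.txt
gunzip -c $dir/labels.tr.gz | awk '{print $1, NF - 1}' | sort -k1,1 > label_len.txt
join frame_len.txt label_len.txt | awk '$3 > $2 {print $1, "frames=" $2, "labels=" $3}'
# (The real CTC constraint is slightly stricter when adjacent labels repeat,
#  since a blank must separate them.)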

liumengzhu commented 8 years ago

I am running it on my own data.

yajiemiao commented 8 years ago

I guess you are using a CPU? Eesen does NOT support CPU-based training. For Eesen to work, you have to switch to a GPU.

liumengzhu commented 8 years ago

So I should recompile Eesen with GPU support?

yajiemiao commented 8 years ago

Yes.
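
For reference, a hedged sketch of the rebuild, assuming Eesen's configure script follows the usual Kaldi conventions; option names and the CUDA path can differ between versions, so check ./configure --help in your checkout.

# Illustrative rebuild with CUDA enabled (flag names and paths are assumptions).
cd eesen/src
./configure --use-cuda=yes --cudatk-dir=/usr/local/cuda
make depend
make -j 8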

liumengzhu commented 8 years ago

OK! Thanks, you helped me a lot!

Sundy1219 commented 7 years ago

Hello, I'm currently working through the HKUST recipe scripts in the eesen directory and have run into a problem preparing Chinese corpus data. Having read your questions, I believe you have already solved this; could you help me? Suppose I have a directory containing two wav files, 1.wav and 2.wav. The transcript for 1.wav is "我为我是中国人而感到骄傲" ("I am proud to be Chinese"), and the transcript for 2.wav is "你好，我们交个朋友吧" ("Hello, let's be friends"). How should I process the files in this directory? In the data preparation stage, what format should the audio files and text files be in? Many thanks @yajiemiao @liumengzhu
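
As a pointer rather than an authoritative answer: Eesen's recipes use Kaldi-style data directories, so a minimal sketch for the two files above might look like the following. The utterance/speaker ids, paths, and helper scripts follow the usual Kaldi conventions and are assumptions here; the exact requirements (sample rate, punctuation/character normalization of the Chinese text, feature extraction) depend on the recipe you adapt, e.g., the HKUST one.

# data/train/wav.scp   -- utterance id -> audio file (paths are placeholders)
utt1 /path/to/1.wav
utt2 /path/to/2.wav

# data/train/text      -- utterance id -> transcript (punctuation usually removed or normalized)
utt1 我为我是中国人而感到骄傲
utt2 你好 我们交个朋友吧

# data/train/utt2spk   -- utterance id -> speaker id (one "speaker" per utterance if unknown)
utt1 utt1
utt2 utt2

# Then, typically (these helpers come from Kaldi's utils/ and are carried over in Eesen recipes):
utils/utt2spk_to_spk2utt.pl data/train/utt2spk > data/train/spk2utt
utils/validate_data_dir.sh --no-feats data/train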