yajiemiao / pdnn

PDNN: A Python Toolkit for Deep Learning. http://www.cs.cmu.edu/~ymiao/pdnntk.html
Apache License 2.0

Multi channel CNN #9

Open jeehye opened 9 years ago

jeehye commented 9 years ago

Hello, I am trying to use the CNN part of PDNN. I have two-channel speech filter-bank features from the AMI corpus (40-dimensional filter-bank with delta and double-delta, spliced with ±5 frames), so the total dimension of my features is 2x33x40.
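The 2x33x40 figure can be checked with a little arithmetic. The sketch below assumes that the 33 comes from stacking the three streams (static, delta, double-delta) over the 11 spliced frames; the post does not spell out the ordering, so the variable names and the layout are illustrative only.

```python
# Worked arithmetic for the feature layout described in the post.
# Assumption: 33 = 3 streams (static, delta, double-delta) x 11 spliced frames.
num_channels = 2          # two microphone channels
fbank_dim = 40            # filter-bank coefficients per frame
num_streams = 3           # static + delta + double-delta
context = 5               # splicing of +-5 frames

spliced_frames = 2 * context + 1                    # 11 frames per window
rows_per_channel = num_streams * spliced_frames     # 33
values_per_example = num_channels * rows_per_channel * fbank_dim

print(f"{num_channels}x{rows_per_channel}x{fbank_dim} = {values_per_example} values")
```

Under that assumption, each row of the pfile would need to hold 2640 feature values in the same order that the CNN input spec expects.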

I converted the features to a pfile and gave it only 1 label. When making the pfile, I used the final.mdl file that was created by single-channel training. In short: add the channel 1 and channel 2 filter-bank features => create a pfile with 1 label.

--conv-nnet-spec "2x33x40:128,33x9,p1x3,f" \
--nnet-spec "1024:1024:$num_pdfs" \
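To see what this spec implies for the layer sizes, here is a minimal sketch that traces the shape through the convolutional part. It is not PDNN code: `conv_output_shapes` is a hypothetical helper, and it assumes the spec reads as "input maps x height x width : output maps, filter size, pooling size, flatten", with a valid (no padding) convolution and non-overlapping pooling that drops any remainder.

```python
def conv_output_shapes(spec):
    """Trace (maps, height, width) through a spec like '2x33x40:128,33x9,p1x3,f'.

    Hypothetical helper; the spec interpretation is an assumption, not PDNN's parser.
    """
    parts = spec.split(":")
    maps, height, width = (int(v) for v in parts[0].split("x"))
    shapes = [(maps, height, width)]
    for layer in parts[1:]:
        fields = layer.split(",")
        out_maps = int(fields[0])
        filt_h, filt_w = (int(v) for v in fields[1].split("x"))
        pool_h, pool_w = (int(v) for v in fields[2][1:].split("x"))  # strip leading 'p'
        # Assumed: valid convolution, then pooling with the remainder dropped.
        height = (height - filt_h + 1) // pool_h
        width = (width - filt_w + 1) // pool_w
        maps = out_maps
        shapes.append((maps, height, width))
    return shapes


shapes = conv_output_shapes("2x33x40:128,33x9,p1x3,f")
out_maps, out_h, out_w = shapes[-1]
flat_dim = out_maps * out_h * out_w
print(shapes, flat_dim)
```

Under these assumptions the 33x9 filter spans the full height of each 33x40 input map, so the convolution only slides along the 40-dimensional filter-bank axis, and the flattened output that feeds the 1024-unit layers has 1280 values.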

During training, the CNN error decreases:
[2015-03-27 20:50:59.615275] > ... initializing the model
[2015-03-27 20:50:59.704674] > ... getting the finetuning functions
[2015-03-27 20:51:01.789192] > ... finetunning the model
[2015-03-28 02:06:13.517982] > epoch 1, training error 85.005036 (%)
[2015-03-28 02:15:04.781514] > epoch 1, lrate 0.080000, validation error 89.706542 (%)
[2015-03-28 07:30:39.580127] > epoch 2, training error 80.255429 (%)
[2015-03-28 07:39:31.307037] > epoch 2, lrate 0.080000, validation error 89.092275 (%)
[2015-03-28 12:54:14.694241] > epoch 3, training error 79.482440 (%)
[2015-03-28 13:03:04.171323] > epoch 3, lrate 0.080000, validation error 89.489965 (%)
[2015-03-28 18:17:20.099473] > epoch 4, training error 78.750167 (%)
[2015-03-28 18:26:11.689961] > epoch 4, lrate 0.080000, validation error 89.382604 (%)
[2015-03-28 23:41:57.655036] > epoch 5, training error 78.529532 (%)
[2015-03-28 23:50:46.840260] > epoch 5, lrate 0.080000, validation error 89.295978 (%)
[2015-03-29 05:05:29.906363] > epoch 6, training error 78.304295 (%)
[2015-03-29 05:14:21.794159] > epoch 6, lrate 0.040000, validation error 89.095217 (%)
[2015-03-29 05:14:32.717682] > ... the final PDNN model parameter is exp/pdnn/pfile/nnet.param
[2015-03-29 05:14:32.725299] > ... the final PDNN model config is exp/pdnn/pfile/nnet.cfg
[2015-03-29 05:14:45.070086] > ... the final Kaldi model (only FC layers) is exp/pdnn/pfile/dnn.nnet
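For reference, the training and validation columns of this log can be pulled apart with a short script. This is only a sketch for reading the log; the regular expression is written against the lines shown here and is not part of PDNN.

```python
import re

# Epoch lines copied from the training log in this post.
LOG = """
epoch 1, training error 85.005036 (%)
epoch 1, lrate 0.080000, validation error 89.706542 (%)
epoch 2, training error 80.255429 (%)
epoch 2, lrate 0.080000, validation error 89.092275 (%)
epoch 3, training error 79.482440 (%)
epoch 3, lrate 0.080000, validation error 89.489965 (%)
epoch 4, training error 78.750167 (%)
epoch 4, lrate 0.080000, validation error 89.382604 (%)
epoch 5, training error 78.529532 (%)
epoch 5, lrate 0.080000, validation error 89.295978 (%)
epoch 6, training error 78.304295 (%)
epoch 6, lrate 0.040000, validation error 89.095217 (%)
"""

pattern = re.compile(
    r"epoch (\d+), (?:lrate [\d.]+, )?(training|validation) error ([\d.]+)"
)

errors = {"training": {}, "validation": {}}
for epoch, kind, value in pattern.findall(LOG):
    errors[kind][int(epoch)] = float(value)

for epoch in sorted(errors["training"]):
    train = errors["training"][epoch]
    valid = errors["validation"][epoch]
    print(f"epoch {epoch}: train {train:.2f}%  valid {valid:.2f}%")

# Change between the first and last epoch for each curve.
train_drop = errors["training"][1] - errors["training"][6]
valid_drop = errors["validation"][1] - errors["validation"][6]
print(f"training error fell by {train_drop:.2f} points, validation by {valid_drop:.2f}")
```

Read this way, the training error falls by several points over six epochs while the frame-level validation error stays near 89%, so the problem may already be visible before decoding.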

But when I run decoding to check the word error rate, it is 92%. What should I do to run a multi-channel CNN?