srvk / eesen

The official repository of the Eesen project
http://arxiv.org/abs/1507.08240
Apache License 2.0

KALDI_ASSERT: at train-ctc-parallel:AddMatMat:cuda-matrix.cc:570, failed: m == NumCols() #217

Open icestoneking opened 4 years ago

icestoneking commented 4 years ago

Hi all,

I have already installed Eesen and am trying to run the aishell recipe, but training fails with the following output:

train-ctc-parallel --report-step=160 --num-sequence=16 --frame-limit=1200 --learn-rate=0.003 --momentum=0.9 --verbose=1 'ark,s,cs:apply-cmvn --norm-vars=true --utt2spk=ark:data/train_nodev/utt2spk scp:data/train_nodev/cmvn.scp scp:exp/train_char_l5_c512/train.scp ark:- |' 'ark:gunzip -c exp/train_char_l5_c512/labels.tr.gz|' exp/train_char_l5_c512/nnet/nnet.iter0 exp/train_char_l5_c512/nnet/nnet.iter1
WARNING (train-ctc-parallel:SelectGpuId():cuda-device.cc:150) Suggestion: use 'nvidia-smi -c 1' to set compute exclusive mode
LOG (train-ctc-parallel:SelectGpuIdAuto():cuda-device.cc:262) Selecting from 4 GPUs
LOG (train-ctc-parallel:SelectGpuIdAuto():cuda-device.cc:277) cudaSetDevice(0): Tesla K80 free:11375M, used:64M, total:11439M, free/total:0.994373
LOG (train-ctc-parallel:SelectGpuIdAuto():cuda-device.cc:277) cudaSetDevice(1): Tesla K80 free:11375M, used:64M, total:11439M, free/total:0.994373
LOG (train-ctc-parallel:SelectGpuIdAuto():cuda-device.cc:277) cudaSetDevice(2): Tesla K80 free:11375M, used:64M, total:11439M, free/total:0.994373
LOG (train-ctc-parallel:SelectGpuIdAuto():cuda-device.cc:277) cudaSetDevice(3): Tesla K80 free:11375M, used:64M, total:11439M, free/total:0.994373
LOG (train-ctc-parallel:SelectGpuIdAuto():cuda-device.cc:310) Selected device: 0 (automatically)
LOG (train-ctc-parallel:FinalizeActiveGpu():cuda-device.cc:194) The active GPU is [0]: Tesla K80 free:11358M, used:81M, total:11439M, free/total:0.992887 version 3.7
LOG (train-ctc-parallel:PrintMemoryUsage():cuda-device.cc:334) Memory used: 0 bytes.
LOG (train-ctc-parallel:DisableCaching():cuda-device.cc:731) Disabling caching of GPU memory.
apply-cmvn --norm-vars=true --utt2spk=ark:data/train_nodev/utt2spk scp:data/train_nodev/cmvn.scp scp:exp/train_char_l5_c512/train.scp ark:-
LOG (train-ctc-parallel:main():train-ctc-parallel.cc:121) TRAINING STARTED
KALDI_ASSERT: at train-ctc-parallel:AddMatMat:cuda-matrix.cc:570, failed: m == NumCols()
Stack trace is:
eesen::KaldiGetStackTrace()
eesen::KaldiAssertFailure(char const, char const, int, char const)
eesen::CuMatrixBase::AddMatMat(float, eesen::CuMatrixBase const&, eesen::MatrixTransposeType, eesen::CuMatrixBase const&, eesen::MatrixTransposeType, float)
eesen::AffineTransform::BackpropagateFnc(eesen::CuMatrixBase const&, eesen::CuMatrixBase const&, eesen::CuMatrixBase const&, eesen::CuMatrixBase)
eesen::Layer::Backpropagate(eesen::CuMatrixBase const&, eesen::CuMatrixBase const&, eesen::CuMatrixBase const&, eesen::CuMatrix)
eesen::Net::Backpropagate(eesen::CuMatrixBase const&, eesen::CuMatrix)
train-ctc-parallel(main+0x10e7) [0x495678]
/usr/lib64/libc.so.6(__libc_start_main+0xf5) [0x7fde10ae53d5]
train-ctc-parallel() [0x4930a9]

How can I fix this? Thank you, and I look forward to your reply.
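For readers hitting the same assertion, the sketch below illustrates the kind of shape check that a GEMM-style call such as `AddMatMat` typically performs before computing `dest = op(A) * op(B)`. It assumes Eesen's `AddMatMat` follows the usual Kaldi-style convention, where `m` is the column count of `op(B)` and must equal the destination's column count; this has not been verified against Eesen's `cuda-matrix.cc`. The helper `gemm_shape_check` and all matrix sizes are hypothetical and chosen only for illustration.

```python
def gemm_shape_check(dest_shape, a_shape, b_shape, trans_a=False, trans_b=False):
    """Return the names of the shape conditions that fail for dest = op(A) * op(B).

    Shapes are (rows, cols) tuples. The condition names mirror the assertion
    text in the log above; the real checks in Eesen are assumed, not confirmed.
    """
    # Apply the optional transposes to get the effective operand shapes.
    a_rows, a_cols = (a_shape[1], a_shape[0]) if trans_a else a_shape
    b_rows, b_cols = (b_shape[1], b_shape[0]) if trans_b else b_shape

    failures = []
    if b_cols != dest_shape[1]:   # columns of op(B) vs. columns of dest
        failures.append("m == NumCols()")
    if a_rows != dest_shape[0]:   # rows of op(A) vs. rows of dest
        failures.append("n == NumRows()")
    if a_cols != b_rows:          # inner dimensions must agree
        failures.append("k == k1")
    return failures


# Hypothetical backward pass through an affine layer with a 4000 x 1024 weight
# matrix: the input gradient needs 1024 columns to match the weight matrix.
ok = gemm_shape_check(dest_shape=(100, 1024), a_shape=(100, 4000), b_shape=(4000, 1024))
bad = gemm_shape_check(dest_shape=(100, 512), a_shape=(100, 4000), b_shape=(4000, 1024))
print(ok)
print(bad)
```

Under these assumptions, a failure of `m == NumCols()` inside `AffineTransform::BackpropagateFnc` would suggest that the affine layer's input dimension in the network definition does not match the output dimension of the layer feeding it, so comparing the layer dimensions in the generated network prototype would be a reasonable first thing to check.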