Alexander-H-Liu / End-to-end-ASR-Pytorch

This is an open-source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with PyTorch, the well-known deep learning toolkit.
MIT License
1.18k stars, 317 forks

MFCC not supported? #68

Open ChangNamAn opened 3 years ago

ChangNamAn commented 3 years ago

Hi,

I'm using a custom speech dataset and a modified ASR yaml file. I changed the audio feat_type to 'mfcc' and feat_dim to '39'. After making these changes, I got the error below. Could you help me with this problem?

Traceback (most recent call last):
  File "main.py", line 79, in <module>
    solver.exec()
  File "/home/ubuntu/End-to-end-ASR-Pytorch_mfcc/bin/train_asr.py", line 106, in exec
    teacher=txt, get_dec_state=self.emb_reg)
  File "/home/ubuntu/anaconda3/envs/hear/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/End-to-end-ASR-Pytorch_mfcc/src/asr.py", line 92, in forward
    encode_feature, encode_len = self.encoder(audio_feature, feature_len)
  File "/home/ubuntu/anaconda3/envs/hear/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/ChangNam/yonseiAI/End-to-end-ASR-Pytorch_mfcc/src/asr.py", line 365, in forward
    input_x, enc_len = layer(input_x, enc_len)
  File "/home/ubuntu/anaconda3/envs/hear/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/End-to-end-ASR-Pytorch_mfcc/src/module.py", line 59, in forward
    feature, feat_len = self.view_input(feature, feat_len)
  File "/home/ubuntu/End-to-end-ASR-Pytorch_mfcc/src/module.py", line 52, in view_input
    feature = feature.view(bs, ts, self.in_channel, self.freq_dim)
RuntimeError: shape '[32, 540, 9, 13]' is invalid for input of size 673920

Thanks a lot.
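
The numbers in the RuntimeError can be checked by hand. The sketch below assumes the four entries of the requested shape are (batch, time, in_channel, freq_dim), as the `feature.view(bs, ts, self.in_channel, self.freq_dim)` call in the traceback suggests. The variable names are illustrative and not taken from the repository.

```python
# Values copied from the RuntimeError message in the traceback.
target_shape = (32, 540, 9, 13)  # assumed order: (batch, time, in_channel, freq_dim)
actual_numel = 673920            # number of elements in the tensor being reshaped

batch, time_steps, in_channel, freq_dim = target_shape

# Per-frame feature dimension that the reshape expects.
expected_feat_dim = in_channel * freq_dim

# Per-frame feature dimension actually present in the tensor.
actual_feat_dim, remainder = divmod(actual_numel, batch * time_steps)

print("expected per-frame dim:", expected_feat_dim)
print("actual per-frame dim:", actual_feat_dim, "remainder:", remainder)
```

Under that assumption, each frame holds 39 values while the reshape expects 9 x 13 = 117, exactly three times as many. One possible explanation, not verified against the repository, is that the model is sized for the configured feat_dim multiplied by additional delta features, while the extracted features contain only 39 values per frame.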