modelscope / FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
https://www.funasr.com

Expected size for first two dimensions of batch2 tensor to be: [4, 11] but got: [4, 42] #993

Status: Closed (ben-8878 closed this issue 1 month ago)

ben-8878 commented 1 year ago

Python environment:

funasr==0.7.4
modelscope==1.9.2
torch==1.13.1+cpu

The error details are as follows. It happens when multiple clients access the server simultaneously; I also tried funasr==0.7.6 and hit the same error:

res = model(audio_in=data)
  File "/usr/local/lib/python3.7/dist-packages/modelscope/pipelines/audio/asr_inference_pipeline.py", line 256, in __call__
    output = self.forward(output, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/modelscope/pipelines/audio/asr_inference_pipeline.py", line 505, in forward
    inputs['asr_result'] = self.run_inference(self.cmd, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/modelscope/pipelines/audio/asr_inference_pipeline.py", line 583, in run_inference
    cmd['param_dict'], **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/funasr/bin/asr_inference_launch.py", line 368, in _forward
    results = speech2text(**batch)
  File "/usr/local/lib/python3.7/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/funasr/bin/asr_infer.py", line 432, in __call__
    enc, enc_len = self.asr_model.encode(**batch, ind=self.decoding_ind)
  File "/usr/local/lib/python3.7/dist-packages/funasr/models/e2e_asr_paraformer.py", line 325, in encode
    encoder_out, encoder_out_lens, _ = self.encoder(feats, feats_lengths)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/funasr/models/encoder/sanm_encoder.py", line 337, in forward
    encoder_outs = self.encoders(xs_pad, masks)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/funasr/modules/repeat.py", line 32, in forward
    args = m(*args)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/funasr/models/encoder/sanm_encoder.py", line 101, in forward
    self.self_attn(x, mask, mask_shfit_chunk=mask_shfit_chunk, mask_att_chunk_encoder=mask_att_chunk_encoder)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/funasr/modules/attention.py", line 456, in forward
    att_outs = self.forward_attention(v_h, scores, mask, mask_att_chunk_encoder)
  File "/usr/local/lib/python3.7/dist-packages/funasr/modules/attention.py", line 431, in forward_attention
    x = torch.matmul(p_attn, value)  # (batch, head, time1, d_k)
RuntimeError: Expected size for first two dimensions of batch2 tensor to be: [4, 11] but got: [4, 42].
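Since the error only appears under concurrent requests, it suggests one pipeline object is being shared across threads. A minimal workaround sketch, assuming the server keeps a single modelscope pipeline and serves each client on its own thread; the model ID, lock, and `transcribe` helper are illustrative, not from the original report:

```python
import threading

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# One shared pipeline, guarded by a lock so only one request runs
# inference at a time (assumption: the pipeline is not thread-safe).
_asr_lock = threading.Lock()
_asr_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    # Example model ID; substitute the model actually deployed on the server.
    model="damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch",
)

def transcribe(data):
    # Serialize access: interleaved calls into the same pipeline appear
    # to trigger the matmul shape mismatch shown in the traceback above.
    with _asr_lock:
        return _asr_pipeline(audio_in=data)
```

Alternatively, each worker thread or process can hold its own pipeline instance so no state is shared at all.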

LauraGPT commented 12 months ago

Please raise an issue as in https://github.com/alibaba-damo-academy/FunASR/issues/1073

Simon-chai commented 1 month ago

Same scenario, same problem. I have already upgraded funasr to 1.1.6; when I have time I will try to reproduce it with a minimal code example, along the lines of the sketch below.
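A minimal concurrency repro sketch under those assumptions: several threads hammer one shared model instance, mimicking multiple clients hitting a single server worker. It uses the FunASR 1.x AutoModel interface; the model name and wav path are placeholders, not taken from the thread:

```python
import threading

from funasr import AutoModel

# Single shared model instance, deliberately called from several threads
# at once to provoke the shape-mismatch error.
model = AutoModel(model="paraformer-zh")  # placeholder model name
WAV = "test.wav"                          # placeholder audio file

def worker():
    for _ in range(10):
        # Concurrent generate() calls on the same model object are the
        # suspected trigger; with a single thread this runs cleanly.
        res = model.generate(input=WAV)
        print(res)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

If this reliably reproduces the RuntimeError, it would confirm the issue is shared-state corruption rather than anything specific to the audio data.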